Estimation of absolute or relative acoustic transfer functions (AATFs or RATFs) is important for the proper functioning of algorithms in hearing devices, e.g. in order to adapt a beamformer towards a signal source of interest. EP3413589A1 describes a "dictionary method" for estimating relative acoustic transfer functions.
In the present disclosure, an extension of the concept of EP3413589A1 is proposed, which may a) lead to better estimation performance, b) reduce computational complexity when searching the dictionary, and c) provide information about the hearing aid user's head characteristics and/or the position of the hearing aid on the ear of the user (which may be beneficial for other algorithms, e.g. microphone position equalization, etc.).
A First Hearing Aid System:
In an aspect of the present application, a first hearing aid system comprising a hearing aid configured to be worn on the head at or in an ear of a user is provided. The hearing aid may comprise a microphone system comprising a multitude M of microphones, where M is larger than or equal to two, adapted for picking up sound from the environment and providing corresponding electric input signals xm(n), m=1, . . . , M, n representing time.
The hearing aid system may comprise a processor comprising a model of the acoustic propagation channels from a target sound source to each of the M microphones, and a database Θ comprising a dictionary Δp′ of ATF-vectors, whose elements are frequency dependent acoustic transfer functions ATFm representing the propagation of sound from the target sound source to each of the M microphones, for a multitude of different directions or locations (θ) relative to the hearing aid and for a multitude of different hearing aid orientations (φ) on the head of a given natural or artificial person (p′).
The processor may be configured to determine personalized ATF-vectors ATF*θ*,φ*,p′ for said user based on said database Θ, said electric input signals xm(n), m=1, . . . , M, and said model of the acoustic propagation channels.
Thereby an improved hearing aid system may be provided.
In the term 'direction to or location of', the term 'direction' is taken to mean the set of locations having the same angle relative to the hearing aid user (e.g. relative to the direction of the nose), which may lead to roughly the same acoustic transfer functions (ATFs) at a sufficient distance from the user.
The personalized ATF-vector ATF*θ*,φ*,p′ for the given acoustic situation (defined by the given electric input signals picked up by the microphone system) may thus comprise frequency dependent (k=1, . . . , K) acoustic transfer functions for each microphone (m=1, . . . , M) from the dictionary for a given (natural or artificial) person p′ (for which the dictionary Δp′ has been created) for a specific direction/location θ* to the target sound source and for a specific hearing aid orientation φ*.
The number (J) of specific directions/locations (θj) of the sound source relative to the microphones of the hearing aid (for each of which an acoustic transfer function may be present in the dictionary Δp′ of the database Θ) may be any number larger than or equal to two, e.g. in the range between two and twenty-four, e.g. between four and sixteen.
The number (Q) of specific hearing aid orientations (φq) on the user's head (for each of which an acoustic transfer function may be present in the dictionary Δp′ of the database Θ) may be any number larger than or equal to two, e.g. in the range between two and eight, e.g. three or four.
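For illustration, a dictionary Δp′ organized along these dimensions may be stored as a single complex-valued array indexed by direction (θj), orientation (φq), microphone (m) and frequency band (k). The sketch below is a minimal layout with assumed sizes (J=8, Q=4, M=2, K=16); all names and dimensions are illustrative, not prescribed by the disclosure.

```python
import numpy as np

# Illustrative (assumed) dictionary sizes: J directions, Q orientations,
# M microphones, K frequency bands.
J, Q, M, K = 8, 4, 2, 16

# Dictionary Delta_p' for one person p': complex ATF values ATF_m(theta_j, phi_q, k).
dictionary = np.zeros((J, Q, M, K), dtype=np.complex128)

def atf_vector(dictionary, j, q):
    """Return the ATF-vector ATF_{theta_j, phi_q} as an (M, K) array:
    one frequency dependent transfer function per microphone."""
    return dictionary[j, q]

# A database Theta with P person-specific dictionaries Delta_p may then be
# stored as a single (P, J, Q, M, K) array.
P = 5
database = np.zeros((P, J, Q, M, K), dtype=np.complex128)
```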
The processor may thus be configured to determine (e.g. select) a personalized ATF-vector ATF*θ*,φ*,p′ for said user (from the dictionary Δp′ of the database Θ) for a given acoustic situation defined by the electric input signals xm, m=1, . . . , M, the vector ATF*θ*,φ*,p′ comprising elements ATF*m(θ*,φ*,p′,k), m=1, . . . , M, i.e. one for each microphone. The personalized ATF-vector ATF*θ*,φ*,p′ is determined from the dictionary Δp′ for a given natural or artificial person p′ and may thus provide information about an estimated direction/location θ* to the target sound source, and an estimate of the current orientation φ* of the hearing aid in question on the user's head. This information may be used as inputs to other parts of the hearing aid, e.g. to a signal processor for applying processing algorithms to one or more signals of the forward path, e.g. to a beamforming algorithm (see e.g. ATF*, p*, θ*, φ* as inputs to the processor (SP) in the drawings).
A direction to or a location (θ) of a target sound source relative to the hearing aid (or more specifically, to the microphone system) may e.g. be defined relative to a reference microphone (m=i) among the M microphones of the microphone system (see e.g. the drawings).
The multitude M of microphones may be arranged in a predefined geometric configuration. The term 'a predefined geometric configuration' may e.g. include that the microphones are located in a housing of a hearing aid (e.g. in a BTE part or in an ITE-part, or in a BTE-part and in an ITE-part with a relatively constant mutual arrangement provided by a connecting element (e.g. a tube or a cable adapted to the user's physiognomy) when mounted on the user).
The term 'a natural or artificial person' may be understood to cover real persons as well as artificial persons (e.g. based on physical or computer models of human head and torso), or prototypes of such persons. A 'natural person' refers to a human being, e.g. (but typically not) the user. An 'artificial person' may e.g. be a model, e.g. the Head and Torso Simulator (HATS) 4128C from Brüel & Kjær Sound & Vibration Measurement A/S, or the head and torso model KEMAR from GRAS Sound and Vibration A/S, or similar, e.g. a computer model of the acoustic propagation properties of a person. An 'artificial person' may e.g. be characterized by "average head characteristics", e.g. arising by combining several measured characteristics of real humans. Such "average head characteristics" or "prototype heads" may be useful when one tries to represent as many real heads as possible with as small a database as possible; a small database is desirable for computational complexity (search) reasons and memory complexity (storage) reasons (e.g. for use in a low-power device, such as a hearing aid). For example, one might represent all females with a single female "prototype" head, and similarly for males and children. In general, one would populate the database with several "prototype heads". Different head characteristics may e.g. include head (size, form), ear (size, form, location), nose (size, form, location), hair style, shoulders (form and size), etc.
The frequency dependent acoustic transfer functions ATFm may be absolute or relative; a relative acoustic transfer function is the acoustic transfer function from a source position to a given microphone, sampled at a particular frequency, divided by the acoustic transfer function from the source position to a particular microphone (a reference microphone) at the same frequency.
The hearing aid may comprise a BTE-part adapted for being located at or behind an ear (pinna) of the user (see e.g. the drawings).
The processor may be configured to determine the personalized ATF-vectors ATF*θ*,φ*,p′ for the user based on a) (the dictionary Δp′ of) the database Θ, b) the electric input signals xm(n), m=1, . . . , M, and c) the model of the acoustic propagation channels, with a predefined or adaptively adjusted update frequency. The personalized ATF-vectors may e.g. be determined continuously, e.g. every time frame l. The personalized ATF-vectors may e.g. be determined when triggered by a trigger indicating 'a new acoustic situation' (in dependence of a change in sound level or spectral content, e.g. above certain threshold measures). The personalized ATF-vectors may e.g. be determined in dependence of the activation of a specific hearing aid program or mode of operation (e.g. a power-up of the hearing aid system).
The frequency dependent acoustic transfer functions ATF may comprise absolute acoustic transfer functions AATF. To determine the absolute acoustic transfer functions, AATF-vectors Hθ, of the dictionary Δp, let x1(n), . . . , xM(n) denote the (preferably clean) test signals picked up by the microphones of the hearing aid (e.g. recorded in an acoustic laboratory, e.g. in an anechoic room). An (absolute) acoustic transfer function AATF (sometimes termed head related transfer function, HRTF) Hm(θ, k) from a given direction or location (θ) to the mth microphone, may be estimated using procedures known in the art. It may e.g. be beneficial that the test sound signal is a chirp signal (a tonal signal whose frequency increases with time); in this case, the AATF may be estimated using the procedure outlined in [Farina, 2000]. The hearing aid(s) may e.g. be assumed to be mounted as intended (intended orientation of the microphones) on the head of a given artificial or natural person.
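As a minimal sketch of the principle (deliberately simpler than the sweep-deconvolution procedure of [Farina, 2000]), an absolute transfer function may be estimated from a known test signal and the corresponding clean microphone recording by frequency-domain division; the function name and FFT size are assumptions.

```python
import numpy as np

def estimate_aatf(test_signal, mic_signal, n_fft=512, eps=1e-12):
    """Crude single-shot AATF estimate H_m(k) = X_m(k) / S(k) from a known
    test signal s(n) and the clean recording x_m(n) at microphone m.
    In practice, averaging over frames or the sweep method of
    [Farina, 2000] would be used for robustness."""
    S = np.fft.rfft(test_signal, n_fft)
    X = np.fft.rfft(mic_signal, n_fft)
    return X / (S + eps)  # one complex value per frequency bin k
```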
The frequency dependent acoustic transfer functions ATF may comprise relative acoustic transfer functions RATF. To determine the relative acoustic transfer functions (RATF), e.g. RATF-vectors dθ, of the dictionary Δp, from the corresponding absolute acoustic transfer functions, Hθ, the element of the RATF-vector (dθ) for the mth microphone and direction (θ) is dm(θ, k)=Hm(θ, k)/Hi(θ, k), where Hi(θ, k) is the (absolute) acoustic transfer function from the given direction or location (θ) to a reference microphone (m=i) among the M microphones of the microphone system. Such absolute and relative transfer functions (for a given artificial or natural person) can be estimated (e.g. in advance of the use of the hearing aid system) and stored in the database Θ (or in a (sub-)dictionary Δp (for a specific artificial or natural person) of the database Θ). The resulting (absolute) acoustic transfer function (AATF) vector Hθ for sound from a given direction or location (θ) is denoted as
H(θ,k)=[H1(θ,k) . . . HM(θ,k)]^T, k=1, . . . , K
and the relative acoustic transfer function (RATF) vector dθ from this direction or location is denoted as
d(θ,k)=[d1(θ,k) . . . dM(θ,k)]^T, k=1, . . . , K
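The conversion from an AATF-vector to the corresponding RATF-vector is then a per-frequency division by the reference-microphone response, e.g. as in the following sketch (assuming H is stored as an (M, K) array and microphone i is the reference):

```python
import numpy as np

def ratf_from_aatf(H, i=0):
    """Convert an AATF-vector H of shape (M, K) into the RATF-vector with
    elements d_m(theta, k) = H_m(theta, k) / H_i(theta, k).
    Row i (the reference microphone) becomes all ones."""
    return H / H[i]  # the reference row broadcasts over all microphones m
```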
Absolute and/or relative transfer functions at different frequencies for different directions/locations of the sound source, for different artificial and/or natural persons (p), for different hearing aid styles (microphone system configurations), for different hearing aid orientations, etc. may be measured in advance of use of the hearing aid in question, and stored in relevant (person-specific) dictionaries (Δp) of the database Θ.
The database Θ may comprise a number of different dictionaries Δp, e.g. corresponding to different (artificial or natural) persons (e.g. having different acoustic propagation properties, e.g. caused or influenced by different heads (or head characteristics)). The number of different dictionaries may be 1, or any number larger than 1. The relevant number of different dictionaries for a given application may be determined as a compromise between accuracy of the personalized relative transfer functions and computational complexity. To limit the computational complexity, it is of interest to keep the number of different dictionaries below a maximum value, e.g. below 100, such as below 10.
The different hearing aid-orientations (φq) on the user's head may relate to a microphone axis through at least two of the multitude of microphones, cf. e.g. the drawings.
The hearing aid may be adapted to provide that a reference microphone axis through at least two of the multitude of microphones located in a horizontal plane and pointing in a forward direction of the user may be defined, when the hearing aid is correctly (as intended) mounted at or in the user's ear. The reference direction may e.g. be defined as a direction of the user's nose when looking straight ahead, cf. e.g. LOOK-DIR, or REF-DIRL, REF-DIRR in the drawings.
The database Θ may comprise a multitude P of dictionaries Δp, p=1, . . . , P, comprising AATF-vectors Hθ,φ,p and/or RATF-vectors dθ,φ,p for a corresponding multitude of different natural or artificial persons (p). The processor may be configured to determine the personalized (frequency dependent) AATF-vectors H*θ,φ,p or RATF-vectors d*θ,φ,p for the user based on the multitude P of dictionaries Δp, p=1, . . . , P, of the database Θ, the electric input signals xm(n), m=1, . . . , M, and the model of the acoustic propagation channels. The different persons (p) may have different heads. The elements of the person specific AATF vectors Hθ,φ,p or RATF vectors dθ,φ,p are absolute or relative transfer functions Hm(θ, φ, p, k) and dm(θ, φ, p, k), respectively, representing direction- or location-dependent (θ), person dependent (p, p=1, . . . , P), hearing aid-orientation dependent (φ), and frequency dependent (k) absolute acoustic transfer functions Hm(θ, φ, p, k) from said target signal source to each of said M microphones (m=1, . . . , M) and relative acoustic transfer functions dm(θ, φ, p, k) with respect to a reference microphone (m=i) among said M microphones, respectively. The number of different persons P may e.g. be larger than 2, such as larger than 5, such as larger than 10, e.g. in the range between 2 and 100.
The multitude of dictionaries Δp, p=1, . . . , P, may be associated with a corresponding multitude of different natural or artificial persons (p) having different heads, e.g. heads of different size, or different form, or different characteristic features, e.g. hair, nose, ear size or position on the head, etc. The multitude of dictionaries Δp, p=1, . . . , P, may be associated with different artificial or natural persons for which specific AATF or RATF-vectors (Hθ,φ,p, dθ,φ,p) have been generated, e.g. measured.
The personalized AATF or RATF-vector (H*θ, d*θ) for said user may be determined using the same candidate AATF or RATF-vectors Hθ,φ, dθ,φ or Hθ,p, dθ,p or Hθ,p,φ, dθ,p,φ for some or all frequency indices (k). The processor may be configured to determine the personalized AATF or RATF-vector (H*θ or d*θ), respectively, for the user from the individual AATF or RATF-vectors Hθ,φ, dθ,φ or Hθ,p, dθ,p or Hθ,p,φ, dθ,p,φ of the database Θ jointly across some or all frequency bands k. In other words, the personalized AATF or RATF-vector (H*θ, d*θ) for the user is found by choosing the (same) candidate AATF or RATF-vector for some or all frequency bands (not using one element, e.g. Hm(θ, φ′, k′) or dm(θ, φ′, k′), for one value k′ of the frequency index k and another element, e.g. Hm(θ, φ″, k″) or dm(θ, φ″, k″), for another value k″).
The personalized AATF or RATF-vector H*θ or d*θ, respectively, for the user may be determined by a statistical method or a learning algorithm, e.g. by one of a number of different methods available in the art, such as maximum likelihood estimation (MLE), cf. e.g. EP3413589A1. Other methods may include statistical methods such as mean squared error (MSE) minimization or regression analysis (e.g. least squares (LS)), other probabilistic methods, or supervised learning (e.g. a neural network algorithm).
The personalized AATF or RATF-vector H*θ or d*θ for the user may be determined by minimizing a cost function. The personalized ATF-vector ATF* for the user may be associated with values of the specific person p=p*, the specific direction/location θj=θ* to/of the sound source, and the specific hearing aid-orientation φ* that best match the optimization criterion (e.g. a cost function). The hearing aid may e.g. be configured to log one or more of said personalized parameters (e.g. the person p*), e.g. in a learning mode of operation, cf. e.g. the drawings.
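As a simplified stand-in for such an optimization (not the specific maximum-likelihood estimator of EP3413589A1), the sketch below scores every dictionary entry against an RATF estimate derived from the current input signals, using a squared-error cost summed jointly over microphones and frequency bands, and returns the indices (p*, θ*, φ*) of the best match:

```python
import numpy as np

def select_personalized_atf(database, d_est):
    """Exhaustive dictionary search (a simplified stand-in for the ML
    estimator of EP3413589A1).
    database: (P, J, Q, M, K) complex array of candidate RATF-vectors.
    d_est:    (M, K) complex RATF estimate from the current input signals.
    Returns indices (p*, j*, q*) minimizing the squared error, summed
    jointly over all microphones m and frequency bands k."""
    cost = np.sum(np.abs(database - d_est) ** 2, axis=(3, 4))  # (P, J, Q)
    p, j, q = np.unravel_index(np.argmin(cost), cost.shape)
    return p, j, q
```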
In case the logged (possibly qualified by a signal quality parameter) personalized parameter p* is consistently equal to a specific value pu of p, the dictionary Δpu of ATF-vectors associated with that person (pu) may be used by the hearing aid instead of the proposed estimation scheme based on the dataset comprising a number of dictionaries for different persons. Such a transition may be performed in a fitting session, e.g. after a certain period of logging, e.g. a month or several months (e.g. 3 months), or after a learning mode of operation of the hearing aid system, cf. below. The hearing aid may be configured to perform the transition itself in dependence of the logged data and a transition criterion.
The hearing aid system may be configured to log the estimated personalized ATF-vectors ATF* over time and thereby build a database of personalized acoustic transfer functions for different directions/locations. The hearing aid may e.g. be configured to only log personalized ATF-vectors ATF* for which a quality measure (e.g. an estimated SNR) of the electric input signals is above a certain threshold value.
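The qualified logging may be as simple as discarding estimates obtained under poor conditions; in the sketch below, the threshold value and record layout are assumptions:

```python
atf_log = []  # records of (theta*, phi*, p*, snr_db)
SNR_THRESHOLD_DB = 5.0  # assumed quality threshold (not prescribed here)

def maybe_log(theta_star, phi_star, p_star, snr_db):
    """Log a personalized ATF estimate only if the electric input signals
    it was derived from had a sufficiently high estimated SNR."""
    if snr_db >= SNR_THRESHOLD_DB:
        atf_log.append((theta_star, phi_star, p_star, snr_db))
```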
For given values of the electric input signals, the processor may be configured to try out (e.g. evaluate) the same dictionary Δp of AATF or RATF-vectors Hθ,φ,p or dθ,φ,p for a given person (p) for all values of the frequency index k, k=1, . . . , K. For given electric input signals, the processor may be configured to try out (e.g. evaluate) each of the dictionaries Δp of AATF or RATF-vectors Hθ,φ,p, dθ,φ,p for different persons p, p=1, . . . , P, that correspond to the candidate direction to or location (θ) for the multitude of different hearing aid-orientations φq, q=1, . . . , Q, on the head of said person (p), for all values of the frequency index k, k=1, . . . , K.
The processor may be configured to select the AATF or RATF vector Hθ,φ,p, dθ,φ,p corresponding to a specific hearing aid orientation (φ) that is optimal as the personalized AATF or RATF-vector H*θ or d*θ, respectively, for said user in the given acoustic situation. 'Optimal' may be with respect to a cost function, e.g. the AATF or RATF-vector that minimizes the cost function, e.g. optimal in a maximum likelihood sense. The processor may be configured to select the AATF or RATF vector Hθ,φ,p, dθ,φ,p corresponding to a specific direction or location and a specific hearing aid orientation that is optimal as the personalized AATF or RATF-vector H*θ or d*θ, respectively, for the user. Thereby a specific direction to or location of the target sound source AND/OR a specific hearing aid orientation may be estimated.
The processor may be configured to select the AATF or RATF vector Hθ,φ,p, dθ,φ,p corresponding to a specific person (p) and a specific hearing aid orientation (φ) that is optimal as the personalized AATF or RATF-vector H*θ or d*θ, respectively, for said user in the given acoustic situation. The processor may be configured to select the AATF or RATF vector Hθ,φ,p, dθ,φ,p corresponding to a specific person (p*) and a specific hearing aid orientation (φ*) that is optimal as the personalized AATF or RATF-vector H*θ or d*θ for said user. 'Optimal' may be with respect to a cost function, e.g. the AATF or RATF-vector that minimizes the cost function, e.g. optimal in a maximum likelihood sense. Thereby a specific direction to or location of the target sound source AND/OR a specific 'person' (e.g. a specific head size and/or other body related characteristics) AND/OR a specific hearing aid orientation may be estimated.
The hearing aid may comprise the database Θ. The hearing aid may comprise memory wherein said database Θ is stored. The memory may be connected to the processor. The processor may form part of the hearing aid. The hearing aid may comprise the database as well as the processor. The hearing aid system may be constituted by the hearing aid. The hearing aid may be configured—at least in a specific mode of operation (e.g. in a learning mode of operation) or in a specific hearing aid program—to use the database to determine personalized absolute or relative acoustic transfer functions in a given acoustic situation (reflected by specific electric input signals from the M microphones of the microphone system).
The hearing aid may comprise a signal processor for processing the electric input signals and providing a processed output signal in dependence of one or more processing algorithms, e.g. a beamforming algorithm, a noise reduction algorithm, a compression algorithm, etc.
The hearing aid may comprise an output transducer connected to the signal processor for presenting the processed signal as stimuli perceivable by the user as sound from the environment (at least in a normal mode of operation).
The hearing aid may comprise a beamformer filter configured to provide a spatially filtered signal based on said electric input signals and beamformer weights, wherein the beamformer weights are determined using said personalized AATF or RATF-vector H*θ or d*θ for said user. The filter weights may be adaptively updated. The hearing aid may be configured to apply the personalized absolute or relative acoustic transfer functions in the beamformer filter to determine (e.g. adaptively update) said beamformer weights. Filter weights of an MVDR beamformer may e.g. be determined (such as adaptively determined) as discussed in EP3253075A1.
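For a given frequency band, MVDR weights may be computed from a noise covariance matrix and the personalized RATF-vector using the textbook formula w = Rv^{-1} d / (d^H Rv^{-1} d); the sketch below shows this standard form, not the specific adaptive scheme of EP3253075A1:

```python
import numpy as np

def mvdr_weights(Rv, d):
    """Textbook per-band MVDR weights w = Rv^{-1} d / (d^H Rv^{-1} d),
    with Rv an (M, M) noise covariance matrix and d the (M,) personalized
    RATF-vector for the band in question."""
    Rv_inv_d = np.linalg.solve(Rv, d)
    return Rv_inv_d / (d.conj() @ Rv_inv_d)
```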
The database Θ may be stored on a device or a system accessible to the hearing aid, e.g. via a communication link. The database Θ may be stored in a memory of an auxiliary device connectable to the hearing aid via a communication link. The processor may form part of the auxiliary device. The database Θ may be stored on a server accessible to the hearing aid, e.g. via a communication network, e.g. the Internet.
The hearing aid may be constituted by or comprise an air-conduction type hearing aid, a bone-conduction type hearing aid, a cochlear implant type hearing aid, or a combination thereof.
The hearing aid system may comprise an auxiliary device wherein the database Θ is stored. The hearing aid and the auxiliary device may comprise antenna and transceiver circuitry allowing data to be exchanged between them. The database Θ may be stored in a memory of the auxiliary device connectable to the hearing aid via a communication link. The processor may form part of the auxiliary device. The auxiliary device may comprise a remote control device or APP for controlling functionality of the hearing aid system. The auxiliary device may be constituted by or comprise a smartphone or another portable electronic device with communication (and processing) capability.
The hearing aid system may be configured to comprise a learning mode of operation of a certain (e.g. configurable) duration, wherein an optimal person (p*) selected from the database Θ is determined based on a specific, e.g. exhaustive, search involving a multitude (such as all) of the AATF or RATF-vectors Hθ,φ,p or dθ,φ,p of a multitude, such as all, of the dictionaries Δp for the persons (p=1, 2, . . . , P) represented in the database Θ (e.g. for selected or all locations (θj), orientations (φq), and values of the frequency index k, k=1, . . . , K). Selected locations or orientations or frequencies may be provided based on prior knowledge (e.g. to focus on realistic values for the current situation). When the learning mode is terminated, a normal mode of operation may be entered, wherein the most commonly determined optimal person during the learning mode is selected as the best representation (p**) of the user.
A (possibly short) learning mode of operation may be entered during or after each power-up of the hearing aid system (or on request, e.g. from a user interface). A short learning mode of operation may e.g. have a duration of the order of minutes, e.g. less than thirty, or less than ten, or less than five minutes. During the short learning mode, the hearing aid system's best representation of the current user, in terms of candidate head-and-torso characteristics, may be estimated and e.g. held fixed in a subsequent normal mode of operation of the hearing aid system.
Further, the orientation (φq) (e.g. tilt angle) of the hearing aid on the user's head may be assumed not to change fast over time. Any (large) deviation from the normal or intended orientation may be assumed to be introduced during mounting of the hearing aids in connection with power-up of the device. Hence, as indicated for the detection of the optimal person (p**), an optimal orientation (φq*) may be consecutively determined (for different electric input signals) for a given hearing aid in the learning mode (e.g. after a power-up) and kept fixed in a subsequent normal mode of operation. The optimal orientation (φq*) may e.g. be determined after the optimal person (p**) has been determined, based on a specific, e.g. exhaustive, search involving a multitude (such as all) of the AATF or RATF-vectors Hθ,φ,p** or dθ,φ,p** of the dictionary Δp** for the optimal person (p=p**) in the database Θ (e.g. for selected or all locations (θj), orientations (φq), and values of the frequency index k, k=1, . . . , K). Again, when the learning mode is terminated, a normal mode of operation may be entered, wherein the most commonly determined optimal orientation during the learning mode is selected as the best representation (φq**).
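Selecting the 'most commonly determined' person or orientation over a learning period amounts to a majority vote over the per-estimate results, e.g.:

```python
from collections import Counter

def most_common(estimates):
    """Majority vote over per-estimate results collected in the learning
    mode, e.g. p** from person estimates or phi** from orientation
    estimates."""
    return Counter(estimates).most_common(1)[0][0]

# e.g. p_star_star = most_common([2, 2, 3, 2, 2])  # -> 2
```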
In case of a binaural hearing aid system, the left and right hearing aids may be assumed to be associated with the same person (head), e.g. by exchanging information between the two hearing aids of the binaural hearing aid system. In case of a binaural hearing aid system, the left and right hearing aids may be assumed to be associated with different orientations of the hearing aid (e.g. determined independently by the left and right hearing aids of the binaural hearing aid system).
In a normal mode of operation, after a learning mode of operation, at least the optimal person (p**), in terms of head-and-torso characteristics, may have been determined, and optionally also the optimal orientation (φq**) of the hearing aid (or of the hearing aids at the left and right ears of the user). In such case, the remaining calculations needed to determine appropriate acoustic transfer functions for given electric input signals are greatly reduced (to include only the different locations or directions of arrival (θj) of the target sound). In case of a binaural hearing aid system, the left and right hearing aids may be assumed to be associated with the same location or direction of arrival (θj) of the target sound at the left and right hearing aids (or at least a known relation between the two angles of the left and right hearing aids, as defined by the specific location of the target source relative to the center of the head and the distance between the left and right hearing aids, cf. e.g. EP2928214A1, e.g. FIG. 3D).
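The known relation between the angles at the left and right ears follows from simple geometry; the sketch below (coordinate conventions and ear spacing are assumptions, cf. e.g. EP2928214A1) computes both ear azimuths from the target position relative to the head centre:

```python
import numpy as np

def ear_azimuths(theta, r, a=0.0875):
    """Azimuths of a target seen from the left and right ears, given the
    azimuth theta [rad] and distance r [m] of the target relative to the
    head centre; ears assumed at (0, +a) and (0, -a), with x pointing in
    the look direction and a roughly half the head width."""
    x, y = r * np.cos(theta), r * np.sin(theta)
    return np.arctan2(y - a, x), np.arctan2(y + a, x)
```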
When the optimal person (p**) has been determined in a learning mode of operation, it may be assumed constant (e.g. until a 'recalibration', e.g. user-initiated, is performed), and only the orientation of the hearing aid is determined in a learning mode after a power-up of the hearing aid (system).
A short learning mode of operation (e.g. of 5-10 minutes' duration) may e.g. be repeated with a certain frequency, e.g. every hour or every three hours, to check the validity of the determined (optimized) person (e.g. head) and/or orientation (e.g. tilt angle) of the hearing aid(s). The learning mode of operation may also be repeated if sensors (e.g. accelerometers on board the hearing aid) suggest that the hearing aids have been remounted.
A ‘normal mode of operation’ may in the present context be taken to mean other modes of operation than the ‘learning mode’.
It is the intention that the features of the hearing aid described below can be combined with the first hearing aid system in various embodiments.
A Second Hearing Aid System:
In a further aspect, a second hearing aid system comprising a hearing aid configured to be worn on the head at or in an ear of a user is provided. The hearing aid may comprise a microphone system comprising a multitude M of microphones, where M is larger than or equal to two, adapted for picking up sound from the environment and providing corresponding electric input signals xm(n), m=1, . . . , M, n representing time.
The hearing aid system may comprise a processor comprising a model of the acoustic propagation channels from a target sound source to each of the M microphones, and a database Θ.
The database Θ may comprise a multitude P of dictionaries Δp, p=1, . . . , P, where p is a person index, said dictionaries comprising ATF-vectors ATF for a corresponding multitude of different natural or artificial persons (p), and
the processor may be configured to, at least in a learning mode of operation, determine personalized ATF-vectors ATF* for said user based on said multitude of dictionaries Δp of said database Θ, said electric input signals xm(n), m=1, . . . , M, and said model of the acoustic propagation channels.
The multitude of dictionaries Δp may each comprise hearing aid-orientation specific ATF vectors ATFθ,φ,p for a multitude of different hearing aid-orientations φq, q=1, . . . , Q, on the head of said person (p), for said multitude of different directions or locations θj, j=1, . . . , J, each ATF vector (ATFθ,φ,p) being frequency dependent (k=1, . . . , K), e.g. as illustrated in the drawings.
It is the intention that the features of the first hearing aid system and the features of the hearing aid can be combined with the second hearing aid system.
A Hearing Aid:
The (first and second) hearing aid system may comprise or be constituted by a hearing aid.
The hearing aid may be adapted to provide a frequency dependent gain and/or a level dependent compression and/or a transposition (with or without frequency compression) of one or more frequency ranges to one or more other frequency ranges, e.g. to compensate for a hearing impairment of a user. The hearing aid may comprise a signal processor for enhancing the input signals and providing a processed output signal.
The hearing aid may comprise an output unit for providing a stimulus perceived by the user as an acoustic signal based on a processed electric signal. The output unit may comprise a number of electrodes of a cochlear implant (for a CI type hearing aid) or a vibrator of a bone conducting hearing aid. The output unit may comprise an output transducer. The output transducer may comprise a receiver (loudspeaker) for providing the stimulus as an acoustic signal to the user (e.g. in an acoustic (air conduction based) hearing aid). The output transducer may comprise a vibrator for providing the stimulus as mechanical vibration of a skull bone to the user (e.g. in a bone-attached or bone-anchored hearing aid).
The hearing aid may comprise an input unit for providing an electric input signal representing sound. The input unit may comprise an input transducer, e.g. a microphone, for converting an input sound to an electric input signal. The input unit may comprise a wireless receiver for receiving a wireless signal comprising or representing sound and for providing an electric input signal representing said sound. The wireless receiver may e.g. be configured to receive an electromagnetic signal in the radio frequency range (3 kHz to 300 GHz). The wireless receiver may e.g. be configured to receive an electromagnetic signal in a frequency range of light (e.g. infrared light 300 GHz to 430 THz, or visible light, e.g. 430 THz to 770 THz).
The hearing aid may comprise a directional microphone system adapted to spatially filter sounds from the environment, and thereby enhance a target acoustic source among a multitude of acoustic sources in the local environment of the user wearing the hearing aid. The directional system may be adapted to detect (such as adaptively detect) from which direction a particular part of the microphone signal originates. This can be achieved in various different ways as e.g. described in the prior art. In hearing aids, a microphone array beamformer is often used for spatially attenuating background noise sources. Many beamformer variants can be found in the literature. The minimum variance distortionless response (MVDR) beamformer is widely used in microphone array signal processing. Ideally, the MVDR beamformer keeps the signals from the target direction (also referred to as the look direction) unchanged, while maximally attenuating sound signals from other directions. The generalized sidelobe canceller (GSC) structure is an equivalent representation of the MVDR beamformer offering computational and numerical advantages over a direct implementation in its original form.
The hearing aid may comprise antenna and transceiver circuitry (e.g. a wireless receiver) for wirelessly receiving a direct electric input signal from another device, e.g. from an entertainment device (e.g. a TV-set), a communication device, a wireless microphone, or another hearing aid. The direct electric input signal may represent or comprise an audio signal and/or a control signal and/or an information signal. The hearing aid may comprise demodulation circuitry for demodulating the received direct electric input to provide the direct electric input signal representing an audio signal and/or a control signal e.g. for setting an operational parameter (e.g. volume) and/or a processing parameter of the hearing aid. In general, a wireless link established by antenna and transceiver circuitry of the hearing aid can be of any type. The wireless link may be established between two devices, e.g. between an entertainment device (e.g. a TV) and the hearing aid, or between two hearing aids, e.g. via a third, intermediate device (e.g. a processing device, such as a remote control device, a smartphone, etc.). The wireless link may be used under power constraints, e.g. in that the hearing aid may be constituted by or comprise a portable (typically battery driven) device. The wireless link may be a link based on near-field communication, e.g. an inductive link based on an inductive coupling between antenna coils of transmitter and receiver parts. The wireless link may be based on far-field, electromagnetic radiation. The communication via the wireless link may be arranged according to a specific modulation scheme, e.g. an analogue modulation scheme, such as FM (frequency modulation) or AM (amplitude modulation) or PM (phase modulation), or a digital modulation scheme, such as ASK (amplitude shift keying), e.g. On-Off keying, FSK (frequency shift keying), PSK (phase shift keying), e.g. MSK (minimum shift keying), or QAM (quadrature amplitude modulation), etc.
The communication between the hearing aid and the other device may be in the base band (audio frequency range, e.g. between 0 and 20 kHz). Preferably, communication between the hearing aid and the other device is based on some sort of modulation at frequencies above 100 kHz. Preferably, frequencies used to establish a communication link between the hearing aid and the other device are below 70 GHz, e.g. located in a range from 50 MHz to 70 GHz, e.g. above 300 MHz, e.g. in an ISM range above 300 MHz, e.g. in the 900 MHz range or in the 2.4 GHz range or in the 5.8 GHz range or in the 60 GHz range (ISM=Industrial, Scientific and Medical, such standardized ranges being e.g. defined by the International Telecommunication Union, ITU). The wireless link may be based on a standardized or proprietary technology. The wireless link may be based on Bluetooth technology (e.g. Bluetooth Low-Energy technology).
The hearing aid may be or form part of a portable (i.e. configured to be wearable) device, e.g. a device comprising a local energy source, e.g. a battery, e.g. a rechargeable battery. The hearing aid may e.g. be a low weight, easily wearable, device, e.g. having a total weight less than 100 g, e.g. less than 20 g.
The hearing aid may comprise a forward or signal path between an input unit (e.g. an input transducer, such as a microphone or a microphone system and/or direct electric input (e.g. a wireless receiver)) and an output unit, e.g. an output transducer. The signal processor may be located in the forward path. The signal processor may be adapted to provide a frequency dependent gain according to a user's particular needs. The hearing aid may comprise an analysis path comprising functional components for analyzing the input signal (e.g. determining a level, a modulation, a type of signal, an acoustic feedback estimate, etc.). Some or all signal processing of the analysis path and/or the signal path may be conducted in the frequency domain. Some or all signal processing of the analysis path and/or the signal path may be conducted in the time domain.
An analogue electric signal representing an acoustic signal may be converted to a digital audio signal in an analogue-to-digital (AD) conversion process, where the analogue signal is sampled with a predefined sampling frequency or rate fs, fs being e.g. in the range from 8 kHz to 48 kHz (adapted to the particular needs of the application) to provide digital samples xn (or x[n]) at discrete points in time tn (or n), each audio sample representing the value of the acoustic signal at tn by a predefined number Nb of bits, Nb being e.g. in the range from 1 to 48 bits, e.g. 24 bits. Each audio sample is hence quantized using Nb bits (resulting in 2^Nb different possible values of the audio sample). A digital sample x has a length in time of 1/fs, e.g. 50 μs, for fs=20 kHz. A number of audio samples may be arranged in a time frame. A time frame may comprise 64 or 128 audio data samples. Other frame lengths may be used depending on the practical application.
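The example values above can be verified directly; the snippet below computes the sample duration, the number of quantization levels, and the duration of a 64-sample time frame:

```python
fs = 20_000     # sampling rate [Hz]
Nb = 24         # bits per audio sample
frame_len = 64  # audio samples per time frame

sample_duration_us = 1e6 / fs              # 50.0 microseconds per sample
quantization_levels = 2 ** Nb              # 16,777,216 possible values
frame_duration_ms = 1e3 * frame_len / fs   # 3.2 ms per 64-sample frame
```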
The hearing aid may comprise an analogue-to-digital (AD) converter to digitize an analogue input (e.g. from an input transducer, such as a microphone) with a predefined sampling rate, e.g. 20 kHz. The hearing aids may comprise a digital-to-analogue (DA) converter to convert a digital signal to an analogue output signal, e.g. for being presented to a user via an output transducer.
The hearing aid, e.g. the input unit, and/or the antenna and transceiver circuitry comprise(s) a TF-conversion unit for providing a time-frequency representation of an input signal. The time-frequency representation may comprise an array or map of corresponding complex or real values of the signal in question in a particular time and frequency range. The TF conversion unit may comprise a filter bank for filtering a (time varying) input signal and providing a number of (time varying) output signals each comprising a distinct frequency range of the input signal. The TF conversion unit may comprise a Fourier transformation unit for converting a time variant input signal to a (time variant) signal in the (time-)frequency domain. The frequency range considered by the hearing aid from a minimum frequency fmin to a maximum frequency fmax may comprise a part of the typical human audible frequency range from 20 Hz to 20 kHz, e.g. a part of the range from 20 Hz to 12 kHz. Typically, a sample rate fs is larger than or equal to twice the maximum frequency fmax, fs≥2fmax. A signal of the forward and/or analysis path of the hearing aid may be split into a number NI of frequency bands (e.g. of uniform width), where NI is e.g. larger than 5, such as larger than 10, such as larger than 50, such as larger than 100, such as larger than 500, at least some of which are processed individually. The hearing aid may be adapted to process a signal of the forward and/or analysis path in a number NP of different frequency channels (NP≤NI). The frequency channels may be uniform or non-uniform in width (e.g. increasing in width with frequency), overlapping or non-overlapping.
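A time-frequency representation of this kind may, for example, be obtained with an STFT-style analysis filter bank; the frame length and hop size below are assumptions:

```python
import numpy as np

def analysis_filter_bank(x, frame_len=128, hop=64):
    """Return a complex time-frequency map X[l, k] (time frame l, band k)
    of a time-domain signal x via a Hann-windowed FFT per frame."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([np.fft.rfft(window * x[l * hop : l * hop + frame_len])
                     for l in range(n_frames)])
```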
The hearing aid may be configured to operate in different modes, e.g. a normal mode and one or more specific modes, e.g. selectable by a user, or automatically selectable. A mode of operation may be optimized to a specific acoustic situation or environment. A mode of operation may include a low-power mode, where functionality of the hearing aid is reduced (e.g. to save power), e.g. to disable wireless communication, and/or to disable specific features of the hearing aid.
The hearing aid may comprise a number of detectors configured to provide status signals relating to a current physical environment of the hearing aid (e.g. the current acoustic environment), and/or to a current state of the user wearing the hearing aid, and/or to a current state or mode of operation of the hearing aid. Alternatively or additionally, one or more detectors may form part of an external device in communication (e.g. wirelessly) with the hearing aid. An external device may e.g. comprise another hearing aid, a remote control, an audio delivery device, a telephone (e.g. a smartphone), an external sensor, etc.
One or more of the number of detectors may operate on the full band signal (time domain). One or more of the number of detectors may operate on band split signals ((time-)frequency domain), e.g. in a limited number of frequency bands.
The number of detectors may comprise a level detector for estimating a current level of a signal of the forward path. The detector may be configured to decide whether the current level of a signal of the forward path is above or below a given (L-)threshold value. The level detector may operate on the full band signal (time domain) and/or on band split signals ((time-)frequency domain).
The hearing aid may comprise a voice activity detector (VAD) for estimating whether or not (or with what probability) an input signal comprises a voice signal (at a given point in time). A voice signal may in the present context be taken to include a speech signal from a human being. It may also include other forms of utterances generated by the human speech system (e.g. singing). The voice activity detector unit may be adapted to classify a current acoustic environment of the user as a VOICE or NO-VOICE environment. This has the advantage that time segments of the electric microphone signal comprising human utterances (e.g. speech) in the user's environment can be identified, and thus separated from time segments only (or mainly) comprising other sound sources (e.g. artificially generated noise). The voice activity detector may be adapted to detect as a VOICE also the user's own voice. Alternatively, the voice activity detector may be adapted to exclude a user's own voice from the detection of a VOICE.
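A minimal energy-based stand-in for such a detector (with an assumed threshold; practical detectors use richer spectral and modulation cues) might look as follows:

```python
import numpy as np

def simple_vad(frame, threshold_db=-40.0):
    """Classify a time frame as VOICE (True) or NO-VOICE (False) from its
    energy alone; practical detectors add spectral and modulation cues."""
    level_db = 10 * np.log10(np.mean(np.asarray(frame) ** 2) + 1e-12)
    return level_db > threshold_db
```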
The hearing aid may comprise an own voice detector for estimating whether or not (or with what probability) a given input sound (e.g. a voice, e.g. speech) originates from the voice of the user of the system. A microphone system of the hearing aid may be adapted to be able to differentiate between a user's own voice and another person's voice and possibly from NON-voice sounds.
The number of detectors may comprise a movement detector, e.g. an acceleration sensor. The movement detector may be configured to detect movement of the user's facial muscles and/or bones, e.g. due to speech or chewing (e.g. jaw movement) and to provide a detector signal indicative thereof.
The hearing aid may comprise a classification unit configured to classify the current situation based on input signals from (at least some of) the detectors, and possibly other inputs as well. In the present context, 'a current situation' may be taken to be defined by one or more of a) the current physical environment of the hearing aid (e.g. the current acoustic environment), b) the current state of the user wearing the hearing aid, and c) the current state or mode of operation of the hearing aid.
The classification unit may be based on or comprise a neural network, e.g. a trained neural network.
The hearing aid may comprise an acoustic (and/or mechanical) feedback control (e.g. suppression) or echo-cancelling system.
The hearing aid may further comprise other relevant functionality for the application in question, e.g. compression, noise reduction, etc.
The hearing aid may comprise a hearing instrument, e.g. a hearing instrument adapted for being located at the ear or fully or partially in the ear canal of a user, e.g. a headset, an earphone, an ear protection device or a combination thereof. The hearing assistance system may comprise a speakerphone (comprising a number of input transducers and a number of output transducers, e.g. for use in an audio conference situation), e.g. comprising a beamformer filtering unit, e.g. providing multiple beamforming capabilities.
Use:
In an aspect, use of a hearing aid system or a hearing aid as described above, in the ‘detailed description of embodiments’ and in the claims, is moreover provided. Use may be provided in a system comprising audio distribution. Use may be provided in a system comprising one or more hearing aids (e.g. hearing instruments), headsets, ear phones, active ear protection systems, etc., e.g. in handsfree telephone systems, teleconferencing systems (e.g. including a speakerphone), public address systems, karaoke systems, classroom amplification systems, etc.
A Binaural Hearing Aid System:
In a further aspect, a binaural hearing aid system comprising first and second hearing aids as described above, in the detailed description of embodiments and in the claims is provided. The first and second hearing aids may each comprise transceiver circuitry configured to allow data to be exchanged between them, e.g. via an intermediate device (e.g. the auxiliary device) or system.
The binaural hearing aid system may comprise an auxiliary device.
The binaural hearing aid system may be adapted to establish a communication link between the first and second hearing aids and the auxiliary device to provide that information (e.g. control and status signals, possibly audio signals) can be exchanged or forwarded from one to the other.
The auxiliary device may comprise a remote control, a smartphone, or other portable or wearable electronic device, such as a smartwatch or the like.
The auxiliary device may be constituted by or comprise a remote control for controlling functionality and operation of the hearing aid(s). The function of a remote control may be implemented in a smartphone, the smartphone possibly running an APP allowing the user to control the functionality of the hearing aid system via the smartphone (the hearing aid(s) comprising an appropriate wireless interface to the smartphone, e.g. based on Bluetooth or some other standardized or proprietary scheme).
The auxiliary device may be constituted by or comprise an audio gateway device adapted for receiving a multitude of audio signals (e.g. from an entertainment device, e.g. a TV or a music player, a telephone apparatus, e.g. a mobile telephone or a computer, e.g. a PC) and adapted for selecting and/or combining an appropriate one of the received audio signals (or combination of signals) for transmission to the hearing aid.
An APP:
In a further aspect, a non-transitory application, termed an APP, is furthermore provided by the present disclosure. The APP comprises executable instructions configured to be executed on an auxiliary device to implement a user interface for a hearing aid or a hearing system described above in the 'detailed description of embodiments', and in the claims. The APP may be configured to run on a cellular phone, e.g. a smartphone, or on another portable device allowing communication with said hearing aid or said hearing system.
Embodiments of the disclosure may e.g. be useful in hearing aid applications such as beamforming, own voice estimation, own voice detection, keyword detection, etc.
A Method of Operating a Hearing Aid or a Hearing Aid System:
A method of operating a hearing aid system comprising a hearing aid configured to be worn on the head at or in an ear of a user is provided. The method comprises picking up sound from the environment with a microphone system comprising a multitude of M microphones and providing corresponding electric input signals xm(n), m=1, . . . , M, n representing time.
The method may further comprise providing a database Θ comprising a dictionary Δp′ of ATF-vectors, whose elements are frequency dependent (k=1, . . . , K) acoustic transfer functions ATFm from a target sound source to each of said M microphones, for a multitude of different directions or locations (θ) relative to the hearing aid and a multitude of different hearing aid orientations (φ), for a given natural or artificial person (p′).
The method may further comprise providing a model of the acoustic propagation channels from the target sound source to each of the M microphones.
It is intended that some or all of the structural features of the device or system described above, in the ‘detailed description of embodiments’ or in the claims may be combined with embodiments of the method, when appropriately substituted by a corresponding process and vice versa. Embodiments of the method may have the same advantages as the corresponding device or system.
The method may further comprise determining personalized ATF-vectors ATF*θ*,φ*,p′ for said user based on said database Θ, said electric input signals xm(n), m=1, . . . , M, and said model of the acoustic propagation channels.
The aspects of the disclosure may be best understood from the following detailed description taken in conjunction with the accompanying figures. The figures are schematic and simplified for clarity; they show only details needed to improve the understanding of the claims, while other details are left out. Throughout, the same reference numerals are used for identical or corresponding parts. The individual features of each aspect may each be combined with any or all features of the other aspects. These and other aspects, features and/or technical effects will be apparent from and elucidated with reference to the illustrations described hereinafter.
The personalized parameters z*(z=p, θ, φ) may e.g. be stored together with a parameter indicating a quality (e.g. a signal to noise ratio (SNR), or an estimated noise level, or a signal level, etc.) of the electric input signals that were used to determine the parameter value(s) in question. Thereby the logged personalized parameter values may be qualified.
The figures are schematic and simplified for clarity, and they just show details which are essential to the understanding of the disclosure, while other details are left out. Throughout, the same reference signs are used for identical or corresponding parts.
Further scope of applicability of the present disclosure will become apparent from the detailed description given hereinafter. However, it should be understood that the detailed description and specific examples, while indicating preferred embodiments of the disclosure, are given by way of illustration only. Other embodiments may become apparent to those skilled in the art from the following detailed description.
The detailed description set forth below in connection with the appended drawings is intended as a description of various configurations. The detailed description includes specific details for the purpose of providing a thorough understanding of various concepts. However, it will be apparent to those skilled in the art that these concepts may be practiced without these specific details. Several aspects of the apparatus and methods are described by various blocks, functional units, modules, components, circuits, steps, processes, algorithms, etc. (collectively referred to as “elements”). Depending upon particular application, design constraints or other reasons, these elements may be implemented using electronic hardware, computer program, or any combination thereof.
The electronic hardware may include micro-electronic-mechanical systems (MEMS), integrated circuits (e.g. application specific), microprocessors, microcontrollers, digital signal processors (DSPs), field programmable gate arrays (FPGAs), programmable logic devices (PLDs), gated logic, discrete hardware circuits, printed circuit boards (PCB) (e.g. flexible PCBs), and other suitable hardware configured to perform the various functionality described throughout this disclosure, e.g. sensors, e.g. for sensing and/or registering physical properties of the environment, the device, the user, etc. Computer program shall be construed broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, etc., whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise.
The present application relates to the field of hearing aids, in particular to beamforming/noise reduction.
The present disclosure relates to a hearing aid (e.g. comprising a microphone array), or a binaural hearing aid system, configured to estimate personalized absolute or relative acoustic transfer functions for a user of the hearing aid (or hearing aid system).
The present disclosure is based on the assumption that a dictionary of absolute acoustic transfer functions (AATFs) and/or relative transfer functions (RATFs), i.e., acoustic transfer functions from a target signal source to any microphone in the hearing aid system relative to a reference microphone, is available. Basically, the proposed scheme aims at finding the AATF or RATF in the dictionary which, among the dictionary entries, with highest likelihood (or according to another optimization measure) was "used" in creating the currently observed (noisy) target signal.
This dictionary element may then be used for beamforming purposes (the absolute or relative acoustic transfer function is an element of most beamformers, e.g. an MVDR beamformer).
Additionally, since each AATF or RATF dictionary element has a corresponding direction or location attached to it, an estimate of the direction of arrival (DOA) is thereby provided. Likewise, since each AATF or RATF dictionary element may have a corresponding hearing aid-orientation associated with it, an estimate of the hearing aid-orientation (or its deviation from an intended orientation) is thereby provided. Likewise, since each AATF or RATF dictionary element may have a corresponding person (or characteristics of the head) associated with it, an estimate of characteristics of the head of the user can thereby be provided.
The database Θ may then, for individual microphones of the microphone system, comprise corresponding values of the location of or direction to a sound source (e.g. indicated by a horizontal angle θ), and absolute (AATF) or relative (RATF) transfer functions at different frequencies (AATF(k,θ) or RATF(k,θ), k representing frequency) from the sound source at that location to the microphone in question. The proposed scheme may calculate likelihoods (or other, e.g. cost-function based, measures) for a sub-set of, or all, absolute or relative transfer functions of the database (and thus locations/directions) and microphones, and point to the location/direction having the largest likelihood (or the best value of another measure).
The microphone system may e.g. constitute or form part of a hearing device, e.g. a hearing aid, adapted to be located in and/or at an ear of a user. In an aspect, a hearing system comprising left and right hearing devices, each comprising a microphone system according to the present disclosure is provided. In an embodiment, the left and right hearing devices (e.g. hearing aids) are configured to be located in and/or at left and right ears, respectively, of a user.
The method chooses actual AATFs or RATFs from a dictionary of candidate AATFs or RATFs. Using a dictionary of candidate AATFs or RATFs ensures that the resulting AATF or RATF is physically plausible; it is a way of imposing the prior knowledge that the microphones of the hearing assistive device are located at a particular position on the head of the user. According to the present disclosure, the database is populated with AATFs or RATFs from several (potentially many) different heads, and/or AATFs or RATFs for hearing assistive devices in different positions on the ear of the user.
The proposed idea comprises extended dictionaries, where the extension consists of a) ATF-vectors measured for several different (natural or artificial) persons (p), e.g. with different head-and-torso characteristics, and b) ATF-vectors measured for several different orientations (φ) of the hearing aid on the head/ear of each person.
When trying out dictionary elements (in order to decide the AATFs or RATFs that are active (optimal) for the current situation), we may try out (e.g. evaluate) particular sub-sets of the extended dictionary, for example the sub-set corresponding to a single person (p) across all frequency bands, or the sub-sets corresponding to a candidate direction or location (θ) across different persons (p) and hearing aid orientations (φ).
‘Try out’ may e.g. be taken to mean ‘evaluate a likelihood’ of a given candidate among a given subset of transfer functions and to pick out the candidate fulfilling a given optimization criterion (e.g. having a maximum likelihood, ML).
This procedure has the following advantages: a) better estimation performance, b) reduced computational complexity when searching the dictionary, and c) information about the hearing aid user's head characteristics and/or the position of the hearing aid on the ear of the user (which may be beneficial for other algorithms, e.g. microphone position equalization).
It is assumed that corresponding acoustic transfer functions ATFm(θj, φq, p=p′, k) may be provided for possible further persons p′ 'between' person 1 and person P.
The processor (PRO) and the signal processor (SP) may form part of the same digital signal processor (or be independent units). The analysis filter banks (FB-A1, FB-A2), the processor (PRO), the signal processor (SP), the synthesis filter bank (FBS), and the voice activity detector (VAD) may form part of the same digital signal processor (or be independent units).
The synthesis filter bank (FBS) is configured to convert a number of frequency sub-band signals to one time-domain signal. The signal processor (SP) is configured to apply one or more processing algorithms to the electric input signals (e.g. beamforming and compressive amplification) and to provide a processed output signal (OUT) for presentation to the user via an output transducer. The output transducer (here a loudspeaker SPK) is configured to convert a signal representing sound to stimuli perceivable by the user as sound (e.g. in the form of vibrations in air, or vibrations in bone, or as electric stimuli of the cochlear nerve).
The hearing aid may comprise a transceiver allowing an exchange of data with another device, e.g. a smartphone or any other portable or stationary device or system. The database Θ may be located in the other device. Likewise, the processor PRO may be located in the other device.
The hearing aid (HD), e.g. the processor (PRO), may e.g. be configured to log the estimated personalized ATF-vectors ATF* (e.g. d*θ) over time and thereby build a database of personalized acoustic transfer functions for different directions/locations. The hearing aid, e.g. the processor (PRO), may e.g. be configured to only log personalized ATF-vectors ATF* for which a quality measure (e.g. SNR) of the electric input signals is above a certain threshold value. In case the logged (possibly qualified by a signal quality parameter) personalized parameter p* is consistently equal to a specific value pu of p, the dictionary Δpu of ATF-vectors associated with that person (pu) may be used by the hearing aid instead of the proposed estimation scheme. The hearing aid, e.g. the processor (PRO), may be configured to perform the transition itself in dependence of the logged data and a transition criterion (e.g. regarding the number of directions/locations for which personalized acoustic transfer functions are stored, and/or a minimum time over which the personalized ATF-vectors ATF* have been logged, and/or the quality of the estimated ATF-vectors).
The following procedure may be followed: For given electric input signals, for each dictionary Δp, p=1, . . . , P, of the database Θ (or a physically plausible subset thereof), find the optimal location (θj*p) for the given dictionary (corresponding to a person, p) by determining a cost function for each of the locations (θj, j=1, . . . , J) (or a subset thereof), and then finally choose the optimum location (θj*) among the P dictionaries (or a subset thereof) as the location (θj*) exhibiting the lowest cost (e.g. the maximum likelihood). Thereby an optimal person (p*) (and optionally the hearing aid orientation (φq*)) can be automatically estimated (as the person (p) (and optionally the hearing aid orientation (φq*)) associated with the dictionary Δp from which the location (θj*) having the lowest cost originates).
The above procedure may be used to determine each of the data points in the learning mode.
After the learning mode has been finalized, the person (p**) determined during the learning mode may be selected and kept fixed in a subsequent normal mode of operation.
It is intended that the structural features of the devices described above, either in the detailed description and/or in the claims, may be combined with steps of the method, when appropriately substituted by a corresponding process.
As used, the singular forms "a," "an," and "the" are intended to include the plural forms as well (i.e. to have the meaning "at least one"), unless expressly stated otherwise. It will be further understood that the terms "includes," "comprises," "including," and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will also be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element but an intervening element may also be present, unless expressly stated otherwise. Furthermore, "connected" or "coupled" as used herein may include wirelessly connected or coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items. The steps of any disclosed method are not limited to the exact order stated herein, unless expressly stated otherwise.
It should be appreciated that reference throughout this specification to “one embodiment” or “an embodiment” or “an aspect” or features included as “may” means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the disclosure. Furthermore, the particular features, structures or characteristics may be combined as suitable in one or more embodiments of the disclosure. The previous description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects.
The claims are not intended to be limited to the aspects shown herein but are to be accorded the full scope consistent with the language of the claims, wherein reference to an element in the singular is not intended to mean “one and only one” unless specifically so stated, but rather “one or more.” Unless specifically stated otherwise, the term “some” refers to one or more.
Foreign Application Priority Data:

Number | Date | Country | Kind
---|---|---|---
20210249 | Nov 2020 | EP | regional

References Cited (U.S. Patent Documents):

Number | Name | Date | Kind
---|---|---|---
10219083 | Farmani | Feb 2019 | B2
10631102 | Jensen | Apr 2020 | B2
20060075422 | Choi | Apr 2006 | A1
20150163602 | Pedersen | Jun 2015 | A1
20180262849 | Farmani | Sep 2018 | A1
20180359572 | Jensen | Dec 2018 | A1

References Cited (Foreign Patent Documents):

Number | Date | Country
---|---|---
2928214 | Oct 2015 | EP
3253075 | Dec 2017 | EP
3285501 | Feb 2018 | EP
3413589 | Dec 2018 | EP

Publication:

Number | Date | Country
---|---|---
20220174428 A1 | Jun 2022 | US