The present invention is generally directed to a speech discrimination testing system, and a method for testing the speech comprehension ability of a person using a speech discrimination testing system.
Hearing tests are commonly used to evaluate the sensitivity of a person's hearing and the ability of a person to comprehend speech. There are many types of hearing tests.
Pure Tone Audiometry is one of the more common hearing tests used, and measures the ability of a person to hear pure tone sound frequencies, typically between 250 Hz and 8,000 Hz. The test applies to both air and bone conduction audiometry, and may therefore be used to identify various causes of hearing loss.
Evaluation of candidacy for cochlear implantation typically involves consideration of both the degree of hearing threshold elevation (i.e. testing the minimum sound level of a pure tone that the subject can hear—Pure Tone Audiometry), and the accuracy with which a subject understands speech while wearing hearing aids, with the greater weight given to the latter.
The Hearing in Noise Test (‘HINT’) is another existing test, and measures a person's ability to hear speech in a noisy environment. During the HINT test, a subject is required to repeat sentences presented with no competing noise, and with competing noise directed from various locations around the subject. The test measures the signal-to-noise ratio (the level of the sentences relative to the noise) needed for the subject to correctly repeat 50% of the words in the sentences.
There are several shortcomings identified with the HINT test and similar language-based hearing tests. In order for the subject to correctly repeat sentences partially masked by noise, the subject must be capable of understanding the sentences played. This in turn requires the subject to be highly familiar with the language and phrases from which the presented sentences are derived. The test may therefore require variation to suit the native language of the subject (i.e. the test must be presented in a different language). This in turn affects the inherent difficulty of the test for different subjects, since certain languages, phrases and words may be easier to comprehend above noise than others (i.e. since the test is not exactly the same for different subjects, the difficulty of the test is subject to variation). Also, a physician may be required to carry out the test. Despite the availability of speech recognition software, the physician may still be required to assess the subject's ability to repeat sentences heard above noise. An additional shortcoming is that performance on a sentence test also depends on the subject's ability to use syntactic knowledge (i.e. grammar) and semantic knowledge (i.e. the meaning of words) to make inferences about the identity of individual words that have been masked by the background noise, and on working memory ability. These abilities vary among people and hence affect test scores, yet do not necessarily relate to the subject's physical hearing abilities.
Thus, even though it is widely recognised that speech comprehension testing has proven useful, particularly as an indicator of candidacy for cochlear implantation, speech comprehension testing methods have not been widely adopted in many countries. This has in turn affected the number of referrals for cochlear implant candidacy as now discussed.
Cochlear implants are medical devices which bypass the damaged inner ear to electrically stimulate auditory neurons for improving sound and speech perception in people with severe to profound hearing loss. Despite the substantial speech perception benefits provided by cochlear implantation, there is a significant under-referral of adult patients for cochlear implant candidacy evaluation. Previous research has suggested that less than 6% of adults who need or would benefit from cochlear implantation actually receive one, partly as a consequence of the lack of appropriate assessment tools to identify those who require referrals for candidacy evaluation.
Returning to hearing tests generally, attempts have been made to address shortcomings in existing hearing tests, as now illustrated.
U.S. Pat. No. 7,288,071 B2 (the disclosure of which is herein incorporated by reference in its entirety) discloses an automated hearing test system capable of performing several tests, including for speech discrimination. In the speech discrimination test described, the subject is audibly presented with words and required to select from words represented either graphically in writing or as pictures. For example, the subject may be audibly presented with the word ‘horse’ and then asked to select the image of a horse from a selection of other images. The disclosed test eliminates the need for a hearing professional, such as a physician, to either perform or attend the test. However, the disclosed test does not address any of the language dependent issues identified above. In particular, a different test for each language that may be spoken by subjects is still required, the relative difficulty of the test may still be affected by the choice of language, and the test may be affected by the subject's ability to use semantic knowledge.
U.S. Pat. No. 8,844,358 B2 (the disclosure of which is herein incorporated by reference in its entirety) discloses a method of performing a hearing test on-line. According to the test proposed, meaningless syllables, called ‘logatomes’, are presented to a test person in fluctuating interference noise. In particular, at least one meaningless syllable is presented to a subject and the subject is required to identify and select the meaningless syllable he or she has heard from a range of graphically presented meaningless syllables displayed on a monitor or similar device. The disclosed method presents several disadvantages. In particular, despite the presented syllable being ‘meaningless’, the subject is still required to recognise the sound and either: verbally repeat it, or identify and select its written representation from those displayed.
U.S. Pat. No. 9,148,732 B2 (the disclosure of which is herein incorporated by reference in its entirety) builds on the test disclosed in U.S. Pat. No. 8,844,358 B2. To improve the accuracy or efficiency of the provided test, voice signals for presentation are dynamically selected taking account of the previous responses of the subject. The adaptive selection of voice signals allows voice signals to be presented that are neither much too difficult nor much too easy for the subject, and avoids having to present the user with the entire inventory of available meaningless syllables.
While U.S. Pat. Nos. 8,844,358 and 9,148,732 both similarly address issues identified with language specific hearing tests, shortcomings may be identified. In particular, the subject is still required to recognise and identify a presented speech sound. If the subject is required to verbally repeat a heard speech sound, the accuracy of the test is necessarily subject to the accuracy of speech recognition software, and the ability of speech recognition software to recognise speech of different users with varying first languages and accents. If the test requires the subject to identify and select a visual written signal, the test cannot quickly and simply be applied to any language, since languages necessarily use different writing systems around the world. For example, while the English language may be commonly written using a Latin alphabet, Russian may be written in Cyrillic, and Japanese may be written in kanji and kana. Further, languages utilising a similar writing system may differ in the pronunciation of particular characters. For example, the letter J in Spanish and Serbo-Croatian may sound like the letters H and Y respectively in English. As a result, despite moving away from a test requiring recognition of language, the test proposed will still require review and modification to transfer between languages. Lastly, to identify and select different syllables, the test would require the subject to be able to read, or at least to be able to unambiguously associate sounds with symbols representing them.
U.S. Pat. No. 9,131,876 B2 (the disclosure of which is herein incorporated by reference in its entirety) describes a method for determining whether a subject can identify syllables or phonemes. According to the method described, syllable or phoneme sound signals are presented to the subject in isolation, and the system is configured to determine whether each sound signal is audible. The sound signals are presented at several of their constituent frequency bands and the hearing characteristics of the subject are then determined by measuring the audibility of sound signals at those various frequency bands. The method uses the information obtained to determine the amount of amplification needed to make each sound audible, and does not test the ability of the subject to identify sounds, nor whether they can be discriminated from other sounds. As such, the test is not, for example, suitable for use in identifying candidacy for cochlear implantation.
It would be advantageous, though not an essential object of the invention described, to provide a method or device for assessing the ability of a person to hear speech sounds which:
The above discussion of background art is included to explain the context of the present invention. It is not to be taken as an admission that the background art was known or part of the common general knowledge at the priority date of any one of the claims of the specification.
According to a first aspect of the invention, there is provided a method of testing the speech comprehension of a subject, the method comprising:
In an embodiment, the at least one transducer comprises a loudspeaker.
In an embodiment, the speech discrimination testing system further comprises a display for presenting visual images to the subject, and the method further comprises:
In an embodiment, the step of presenting visual images to the subject comprises presenting a sequence of visual images synchronised with the presentation of speech sounds such that each presented speech sound is associated with a presented visual image.
In an embodiment, the display comprises any one or more of the following: an electronic visual display, a television, a computer monitor, a touch screen, or a projector and projector screen.
In an embodiment, the display comprises a touch screen and the subject's identification of the lone speech sound is inputted by touching a visual image presented on the touch screen.
In an embodiment, the method is performed by the subject utilising the speech discrimination testing system without input or assistance from another person.
In an embodiment, speech sounds stored in the inventory of speech sounds are used in the majority of the most commonly spoken languages.
In an embodiment, speech sounds stored in the inventory of speech sounds and selected for presentation follow a vowel-consonant-vowel format, a consonant-vowel-consonant format, a consonant-vowel format, or a vowel-consonant format.
In an embodiment, vowels used in the speech sounds are selected from a group consisting of: [a], [i] and [o]; and the consonants used in the speech sounds are selected from a group consisting of: [j], [k], [l], [m], [n], [p], [t] and [s].
In an embodiment, the speech sounds presented in a sequence vary from one another by substitution of either one vowel, or one consonant.
In an embodiment, speech sounds are presented within a sequence as a consonant pair.
In an embodiment, more than one sequence of speech sounds is presented such that the subject is required to identify lone speech sounds within each presented sequence.
In an embodiment, the method further comprises emitting noise of any type, including random noise, Brownian noise, or a competing speech signal or signals, via the at least one transducer while the speech sounds are presented, to provide a signal-to-noise ratio such that the subject is required to discriminate between presented speech sounds while the noise is emitted.
In an embodiment, the signal-to-noise ratio at which speech sounds are presented against emitted noise is adjusted while speech sounds are presented to the subject.
In an embodiment, the signal-to-noise ratio is adjusted from sequence to sequence to account for inherent difficulty in discriminating speech sounds presented in each sequence.
In an embodiment, the signal-to-noise ratio is adjusted from sequence to sequence to ensure that each sequence provides substantially the same likelihood of identifying the correct lone speech sound.
In an embodiment, the signal-to-noise ratio is adjusted from sequence to sequence to ensure that the subject has a likelihood in the range of about 60% to about 80%, preferably about 70%, of identifying the correct lone speech sound for each presented sequence.
In an embodiment, the signal-to-noise ratio is adjusted based on responses received from the subject, to identify a signal-to-noise ratio at which the subject may correctly discriminate speech sounds at a pre-determined ratio of correct responses to incorrect responses.
In an embodiment, the level at which the speech sounds and, if applicable, noise are presented is adjusted to prevent background noise from affecting test performance.
In an embodiment, the method further comprises estimating whether, or the likelihood that, the subject's test results would be improved through cochlear implantation.
In an embodiment, the estimation of whether, or the likelihood that, the subject's test results would be improved through cochlear implantation is calculated in view of how long the subject has experienced hearing loss.
In an embodiment, results or analyses of the test are automatically sent to a hearing professional or the person being tested following completion of the test.
In a second aspect of the invention, there is provided a speech discrimination testing system comprising:
In an embodiment, the at least one transducer comprises a loudspeaker.
In an embodiment, the system further comprises a display for presenting visual images to the subject, and wherein the speech discrimination testing system is further configured to
In an embodiment, the speech discrimination testing system is configured to present visual images to the subject in a sequence of visual images synchronised with the presentation of speech sounds such that each presented speech sound is associated with a presented visual image.
In an embodiment, the display comprises any one or more of the following: an electronic visual display, a television, a computer monitor, a touch screen, or a projector and projector screen.
In an embodiment, the display comprises a touch screen and the speech discrimination testing system is configured to enable input of the subject's identification of the lone speech sound by touching a visual image presented on the touch screen.
In an embodiment, the speech discrimination testing system is configured to enable the subject to complete a speech comprehension test without input or assistance from another person.
In an embodiment, speech sounds stored in the inventory of speech sounds are used in the majority of the most commonly spoken languages.
In an embodiment, speech sounds stored in the inventory of speech sounds follow a vowel-consonant-vowel format, a consonant-vowel-consonant format, a consonant-vowel format, or a vowel-consonant format.
In an embodiment, vowels used in the speech sounds are selected from a group consisting of: [a], [i] and [o]; and the consonants used in the speech sounds are selected from a group consisting of: [j], [k], [l], [m], [n], [p], [t] and [s].
In an embodiment, the speech discrimination testing system is configured to present speech sounds in a sequence such that the presented speech sounds vary from one another by substitution of either one vowel, or one consonant.
In an embodiment, the speech discrimination testing system is configured to present speech sounds within a sequence as a consonant pair.
In an embodiment, the speech discrimination testing system is configured to present more than one sequence of speech sounds such that the subject is required to identify lone speech sounds within each presented sequence.
In an embodiment, the speech discrimination testing system is configured to emit noise via the at least one transducer while the speech sounds are presented, to provide a signal-to-noise ratio such that the subject is required to discriminate between presented speech sounds while the noise is emitted.
In an embodiment, the speech discrimination testing system is configured to adjust the signal-to-noise ratio at which speech sounds are presented against emitted noise while speech sounds are presented to the subject.
In an embodiment, the speech discrimination testing system is configured to adjust the signal-to-noise ratio from sequence to sequence to account for inherent difficulty in discriminating speech sounds presented in each sequence.
In an embodiment, the speech discrimination testing system is configured to adjust the signal-to-noise ratio such that all sequences presented have approximately the same likelihood of being correctly discriminated, when averaged across a representative group of subjects.
In an embodiment, the speech discrimination testing system is configured to adjust the signal-to-noise ratio based on responses received from the subject, to identify a signal-to-noise ratio at which the subject may correctly discriminate speech sounds at a pre-determined ratio of correct responses to incorrect responses.
In an embodiment, the speech discrimination testing system is configured to adjust the level at which the speech sounds and, if applicable, noise are presented to the subject to minimise the effect that background noise has on the subject's test performance.
In an embodiment, the system is configured to estimate whether, or the likelihood that, the subject's test results would be improved through cochlear implantation following testing.
In an embodiment, the system is configured to estimate whether, or the likelihood that, the subject's test results would be improved through cochlear implantation taking into account how long the subject has experienced hearing loss.
In an embodiment, the speech discrimination testing system is configured to automatically send results to a hearing professional or the person being tested following completion of the test by a subject.
Throughout this specification and the claims which follow, unless the context requires otherwise, the word “comprise” and variations thereof such as “comprises” and “comprising”, will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps.
It will be beneficial to further describe the invention with respect to the accompanying drawings, which demonstrate a preferred embodiment of method and device for testing the ability to discriminate speech sounds. Other embodiments of the invention are possible, and consequently, the particularity of the accompanying drawings is not to be understood as superseding the generality of the preceding description of the invention.
In an embodiment the inventors have provided a test based on the ability of a subject to discriminate between speech sounds, rather than on a subject's ability to identify a speech sound, thereby allowing for a test method and device that may be effectively applied across language barriers. Such a test may never require the subject to identify a speech sound (such as by spelling it out or identifying it as a written word or a word of a language), other than to discriminate which sound is different to another that has been presented.
In an embodiment the invention is based on measuring the ability of a subject to correctly discriminate the odd speech sound presented in a sequence of other speech sounds. For example, pairs of speech sounds may be presented in sequential triplets, in which a first speech sound is presented once and a second speech sound is presented twice (such that the first speech sound is the ‘odd one out’). For example, if the two speech sounds presented are [aja] and [ata], the subject may be presented with a sequential triplet of: [aja], [aja], [ata]. The subject would then be required to specify which of the three speech sounds was different to the other two. In this way, the subject is not required to identify either sound but is only required to recognise that one is different to the other. In other embodiments the speech sounds need not be presented as triplets, but rather as quadruplets, quintuplets, etc., and more than two different speech sounds may be presented.
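By way of non-limiting illustration, the construction of such a sequential triplet may be sketched in Python as follows (a minimal sketch; the function name and data representation are merely illustrative):

```python
import random

def make_triplet(sound_a, sound_b):
    """Build an odd-one-out triplet from a pair of speech sounds: one sound
    is presented twice and the other once, in a random order."""
    lone, repeated = random.sample([sound_a, sound_b], 2)  # pick the 'odd one out'
    triplet = [repeated, repeated, lone]
    random.shuffle(triplet)              # randomise the position of the lone sound
    return triplet, triplet.index(lone)  # the index is the correct response

# Example: make_triplet("aja", "ata") may yield (["aja", "ata", "aja"], 1).
```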
In embodiments of the invention non-word speech sounds may be used as stimuli for a speech discrimination test so as to minimise the undue influence of language and subject learning on the ability of a subject to characterise a sound, and otherwise maximise the language neutrality of the test. This is because, if the presented speech sounds are words of a particular language, then speakers of that language would have the advantage of being familiar with, and hence better able to discriminate, those words over other words, phrases or speech sounds that they are less familiar with.
Nonsense speech sounds of various forms may be used according to embodiments of the invention. Nonsense speech sounds may in an embodiment take a consonant-vowel-consonant [CVC] format or a vowel-consonant-vowel [VCV] format. In an embodiment vowel-consonant-vowel [VCV] sounds may be used to maximise the availability of acoustic cues for discrimination by the subject, through perception of the medial consonant.
As part of their development of the invention, the inventors conducted a literature review on phonemic inventories occurring in the most widely spoken languages, being Mandarin, English, Spanish, Arabic, Hindi-Urdu, Bengali, Russian, Portuguese, Japanese, German, Javanese, Korean, French, Turkish, Vietnamese, Telugu, Cantonese, Italian, Polish, Ukrainian, Thai, Gujarati, Malay, Malayalam, Tamil, Marathi, Burmese, Romanian, Pashto, Dutch, Finnish, Greek, Indonesian, Norwegian, Hebrew, Croatian, Danish, Hungarian, Swedish, and Serbian. This was followed by a literature review on confusion matrices in consonant identification tasks, in noise and in quiet, for adult listeners: having typical hearing, using hearing aids, and using cochlear implants. To demonstrate the process undertaken by the inventors, a phonetic inventory matrix for the above languages in respect of vowel sounds is included in
Consonants or vowels which are not common to a popular spoken language, and which thereby may not be readily distinguishable by a speaker of that language, may be avoided for presentation according to certain embodiments of the invention. For example, speakers of the Hindi language are known to commonly have difficulty discriminating between [v] and [w] sounds, since the two are allophones in that language. Thus, in certain embodiments of the invention, [v] and [w] sounds may be avoided for presentation.
Accordingly, in certain embodiments of the invention, consonants for presentation to a subject may be reduced to a shortlist which avoids consonants that may not be readily distinguishable by speakers of each of the most common languages. Such a shortlist may consist of the consonants [p] [t] [k] [m] [n] [s] [l] and [j]. In an embodiment these consonants may in turn be combined with the vowels [a], [i] and [o] to form [CVC] or [VCV] speech sounds which are believed to be generally distinguishable by speakers of the most commonly spoken languages.
Pursuant to the above, it is believed that speakers of the most commonly spoken languages should generally be able to readily discriminate between each of the following speech sounds: [apa], [ata], [aka], [ama], [ana], [asa], [ala], [aja], [ipi], [iti], [iki], [imi], [ini], [isi], [ili], [iji], [opo], [oto], [oko], [omo], [ono], [oso], [olo], and [ojo]. In an embodiment, speech sounds are presented in [VCV] format with the same vowel used twice in a given speech sound. Further, when presenting a pair of speech sounds in a sequence to a subject, a pair of speech sounds presented may share the same vowel, but differ in the selection of a consonant, so as to provide a ‘consonant pair’ of speech sounds presented to the subject. Following this embodiment, suitable speech sound pairs, or consonant pairs, would for example include: [asa] and [ala], or [imi] and [isi]. Presenting the above exemplified [VCV] speech sounds in consonant pairs allows for 84 different combinations of consonant pairs for presentation to a subject in accordance with this embodiment.
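By way of non-limiting illustration, the 84 combinations referred to above follow from the three vowels and the 28 unordered pairings of the eight shortlisted consonants, as the following Python sketch confirms:

```python
from itertools import combinations

VOWELS = ["a", "i", "o"]
CONSONANTS = ["p", "t", "k", "m", "n", "s", "l", "j"]

# A consonant pair shares its vowel but differs in the medial consonant,
# e.g. ('asa', 'ala') or ('imi', 'isi').
consonant_pairs = [
    (f"{v}{c1}{v}", f"{v}{c2}{v}")
    for v in VOWELS
    for c1, c2 in combinations(CONSONANTS, 2)
]

assert len(consonant_pairs) == 84  # 3 vowels x C(8,2) = 3 x 28 = 84 pairs
```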
In a further embodiment, speech sounds presented together in a sequence need not utilise the same vowel or consonant twice. Speech sounds such as [ato] and [ito], or [ima] and [omi] may therefore be used together, as well as speech sounds such as [tom], [tok], [pot] and [son]. In further embodiments, speech sounds need not be presented as consonant pairs. In broader embodiments any speech sounds may be used as part of the test to enable a subject to discriminate between speech sounds.
In certain embodiments enabling self-administration, a hearing discrimination test may be implemented via a system comprising: a desktop or laptop computer, a tablet computer such as an iPad, a smart phone, a home assistant device such as a ‘Google Home’, or another device or combination of devices as further exemplified below. In each of the devices exemplified above, the device comprises a data processor and a memory in data communication with the data processor. In certain embodiments, as further described below, the memory may comprise a hard drive device contained inside or outside a particular device. In other embodiments, the memory may be provided remotely such as via the ‘cloud’. Other embodiments are also described below.
Speech sounds may be presented in sequential triplets, or longer sequences, from a transducer such as a loudspeaker integral to or separate from a device such as those described above. In an embodiment, speech sounds may be presented via headphones, which may be connected via cable, Bluetooth, Wi-Fi or other avenue to another device. In another embodiment the speech sounds may be presented via a larger loudspeaker such as a desktop speaker or home stereo system. In another embodiment speech sounds may be presented to the user directly via a hearing aid. In another embodiment, speech sounds may be presented to the user via a bone conductor transducer or other suitable form of sound transducer. Any device which is capable of presenting speech sounds to the user may be employed. Upon presentation of the speech sounds, the subject is presented with options to select which speech sound within a presented sequence is different from the others presented, as further described below.
The memory unit (14) may in some embodiments also store a speech discrimination test (18). More specifically, the memory unit (14) may store a computer-readable version of the speech discrimination test (18) that can be executed by the computer (11). During execution, a portion of the speech discrimination test (18) may be temporarily loaded from, for example, the hard disk and into the main memory components of the memory unit (14). As noted elsewhere, in addition to the stand-alone arrangement described above, it is also possible to execute the speech discrimination test (18) from a network. For example, the speech discrimination test (18) may be stored on a server computer (not expressly shown) that is accessible to several client computers. This arrangement has an advantage in that updates to the speech discrimination test (18) may be quickly and easily implemented. Other environments for executing the speech discrimination test (18) may also be used without departing from the scope of the invention.
The source code for the speech discrimination test (18) may be written in any suitable programming language (e.g.: C, C++, BASIC, Java). In addition, the speech discrimination test (18) can be implemented using a number of different programming methodologies (e.g., top-down, object oriented).
In one embodiment, the methodology of the speech discrimination test (18) involves a plurality of individual modules or object class modules with subroutines, properties and functions that can be called to perform specific tasks. The modules or subroutines can be called from a main routine, which would control the general sequence and flow of the speech discrimination test (18) and from within other modules or subroutines, which would control specific functions or tasks in either an isolated or cooperative manner. This is further exemplified below in respect of
Depending on particular embodiments, other components (16) of the system (10) that may be present include a keyboard, mouse or touchpad, microphone, printer and the like.
Visual images may be presented to the subject via any suitable device or mechanism. In an embodiment, as described above, visual images may be presented via a touch screen. This further enables the subject to select the ‘odd’ speech sound through interaction with the same touch screen. In other embodiments, visual images may be presented by, for example, a monitor, television, projector, or by another suitable mechanism such as a mechanical display (e.g. pop-up buttons). In other embodiments a screen might not be provided but, for example, a set of lights is provided which light up so that a given light, or colour of light, is associated with a given speech sound. In other embodiments a display may not be provided and the subject may simply be required to select speech sounds based on order, i.e. the subject may select the ‘first’ speech sound as the lone speech sound. Using this approach a blind subject may still be able to complete the speech discrimination test: through speech recognition software as further described below; by pressing an associated ‘first sound’, ‘second sound’ or ‘third sound’ button (based on the number of sounds presented) on a device; or by pressing a single button once, twice or three times as appropriate (again based on the number of sounds presented). A touch pad may for example be used as an alternative to a button.
In embodiments not involving a touch screen, a mechanism, component or device allowing the subject to select the ‘odd’ speech sound may be provided, such as a computer mouse, touchpad, keyboard, button arrangement, remote control, motion detector (for example recognising a number of hand waves or particular facial movements) or sound detector (for example, speech recognition software may allow the subject to simply state ‘first sound’, ‘second sound’ or similar, which would be recognised by the software). Using a voice recognition approach, and without requiring a display to provide visual representations, other devices such as a home assistant (e.g. a ‘Google Home’) or similar could be used to implement the test.
Where visual images are provided, rather than having a visual image appear as the speech sound is presented, in other embodiments the visual image may appear just before or just after the speech sound is presented, or the visual image may be highlighted in some way (such as by the visual image becoming larger, brighter, shimmering, moving, or changing colour), just before, during, or just after the speech sound is presented. Any mechanism which ensures that the subject can associate a presented visual image with a presented speech sound may be used according to embodiments of the invention.
A game-based method and system was developed to evaluate the suitability of embodiments of the invention for testing hearing and speech comprehension. In this game-based embodiment, speech sounds were selected to be presented as triplets of different [VCV] consonant pairs, and the method and device were designed to be self-administrable by adults.
Recordings of each speech sound were produced by a native Australian English female talker in an anechoic chamber. The recordings were reviewed by a panel, and two of the six recordings for each speech sound were selected for use in the test. All recordings were high-pass filtered at 250 Hz to minimise the impact of variations in frequency response at low frequencies for transducer outputs of a range of tablet computers and smartphones. The recordings were equalised in overall RMS level.
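By way of non-limiting illustration, the filtering and equalisation steps described above may be sketched as follows (Python with NumPy/SciPy; the fourth-order filter and the target RMS value are illustrative assumptions, as only the 250 Hz cutoff and RMS equalisation are specified above):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def prepare_recording(samples, fs, cutoff_hz=250.0, target_rms=0.05):
    """High-pass filter a recording at 250 Hz, then scale it to a common RMS level."""
    sos = butter(4, cutoff_hz, btype="highpass", fs=fs, output="sos")  # assumed order
    filtered = sosfiltfilt(sos, samples)   # zero-phase high-pass filtering
    rms = np.sqrt(np.mean(filtered ** 2))
    return filtered * (target_rms / rms)   # equalise overall RMS level
```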
Preliminary testing was conducted with 50 subjects having hearing impairment. Criteria for subject selection involved hearing impaired adults: with at least one year of experience wearing cochlear implants, with no other co-morbidities affecting their communication ability, and who speak English as a first language. The subjects selected were users of either a single cochlear implant, a cochlear implant combined with a hearing aid, or bilateral cochlear implants. All subjects were tested when using a cochlear implant in one ear alone.
[VCV] speech sounds were presented during testing as consonant pairs (i.e. pairs sharing the same vowel but different medial consonants, e.g. [aka] and [ata]) at 65 dB SPL in speech-shaped noise. Signal-to-noise ratios were set for different presentation conditions: primarily Quiet, 7 dB signal-to-noise ratio (‘SNR’), and 0 dB SNR, but at several other SNRs for some of the subjects.
The mean scores per consonant pair (expressed as proportion correct as a function of signal-to-noise ratio) are shown in
The reduced set of consonant pairs, each adjusted with their own signal-to-noise ratio, then became the set of stimuli to be used in the next stage of developing a speech comprehension testing method and device according to embodiments of the invention.
A testing method and system implemented by the inventors on a laptop computer was evaluated to examine whether it would be suitable for objectively testing speech comprehension.
81 adult subjects participated in Stage II of the project. These included 41 adults wearing hearing aids with hearing loss of four-frequency average hearing level (4FAHL) >40 dB HL, and 40 adults wearing ‘Nucleus’ cochlear implants with at least one year of cochlear implant experience. Table 1 gives the characteristics of the participating subjects.
For subjects using hearing aids alone, the distribution of hearing level, expressed as four-frequency average hearing loss (‘4FAHL’) across 0.5, 1, 2 and 4 kHz in the better ear is shown in
Prior to assessments, the hearing devices of the participants were checked. For users of hearing aids, otoscopy and tympanometry were performed to exclude cases of cerumen build up and/or middle ear dysfunction. Behavioural pure tone thresholds were measured in both ears using standard pure tone audiometry if an audiogram within 12 months of the test date was not available. Participants provided demographic information by completing a written questionnaire.
While subjects were wearing their devices (hearing aids or cochlear implants) at their personal settings, they were tested using a speech discrimination test (described as ‘LIT’ in several of the Figures) according to an embodiment of the invention, using a laptop computer. The test was self-administered, with minimal input from an assessor. After written instructions were presented on screen, the subject was directed to adjust the overall loudness of presentation by moving a slider to set a comfortable loudness level. This was followed by a practice run after which the subject completed a test run. After every 20 trials of speech sound triplets, the subject could either take a brief break or press a button to continue testing until all 81 triplet sequences (as described above) were completed. The SNR for each triplet was adjusted in accordance with the findings for Stage I described above. Each subject completed two runs of 81 triplet sequences.
While not incorporated as part of the test discussed above, the background noise of the surrounding environment (e.g. household) may be monitored and factored into the volume at which the test is presented to the subject. For example, if the subject is taking the test at home, and there are high levels of background noise in the vicinity, a device such as a computer may monitor the noise levels and automatically adjust the level at which the test is presented, as opposed to having the subject manually adjust loudness levels for comfort.
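A minimal sketch of such automatic level selection follows (Python; the 20 dB margin above background noise and the microphone calibration offset are assumed values, while the 65 dB SPL floor simply mirrors the presentation level used in the testing described above):

```python
import numpy as np

def auto_presentation_level(mic_samples, margin_db=20.0, floor_db_spl=65.0,
                            mic_offset_db=94.0):
    """Estimate the ambient noise level from a short microphone capture and
    return a presentation level a fixed margin above it, never below a floor."""
    rms = np.sqrt(np.mean(np.square(mic_samples))) + 1e-12  # avoid log of zero
    ambient_db_spl = 20.0 * np.log10(rms) + mic_offset_db   # assumed mic calibration
    return max(floor_db_spl, ambient_db_spl + margin_db)
```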
The below Table 2, and
Similarly, the below Table 3, and
As demonstrated by Table 2 and Table 3, cochlear implant users generally obtained better test results during speech discrimination testing, noting that the mean proportion of correct responses for hearing aid users was 72.4%, while cochlear implant users obtained a mean proportion of 80.1%.
While according to the embodiment analysed during Stage II testing, the metric extracted from subjects relates to the percentage of correct responses, it is nevertheless possible to utilise other metrics according to other embodiments of the invention. For example, by providing an adaptive signal-to-noise ratio during testing based on responses from the subject, it is possible to extract a signal-to-noise ratio at which a subject provides the correct response during the test at a pre-determined rate, such as for example 60%, 70%, or 80% of the time. In this way, a high signal to noise ratio would be indicative of poor speech discrimination abilities, since a high signal to noise ratio would be required to elicit a high proportion of correct responses. Alternatively, the signal-to-noise ratio need not be adjusted at all from sequence to sequence and in other embodiments it may be unnecessary to emit any noise at all during testing.
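By way of non-limiting illustration, one conventional realisation of such an adaptive procedure is a one-up, two-down staircase, which converges near 70.7% correct. The rule and the 2 dB step in the Python sketch below are assumptions; the embodiment requires only that the SNR adapt toward a pre-determined rate of correct responses:

```python
class SnrStaircase:
    """One-up, two-down adaptive track: the SNR falls after two consecutive
    correct responses and rises after any error."""

    def __init__(self, start_snr_db=7.0, step_db=2.0):
        self.snr_db = start_snr_db
        self.step_db = step_db
        self._streak = 0                    # consecutive correct responses

    def update(self, correct):
        if correct:
            self._streak += 1
            if self._streak == 2:           # two in a row: make the task harder
                self.snr_db -= self.step_db
                self._streak = 0
        else:                               # an error: make the task easier
            self.snr_db += self.step_db
            self._streak = 0
        return self.snr_db
```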
Pearson product-moment correlation analyses revealed large and highly significant correlations between the first and second runs of the speech discrimination test for users of cochlear implants (r=0.907, p<0.001) and for users of hearing aids (r=0.876, p<0.001). This indicates that the test reliably differentiates between subjects with different degrees of ability to discriminate between speech sounds, and so can reliably identify those having limited ability to discriminate speech sounds.
The inventors performed analyses to determine whether a method and system for testing speech discrimination according to embodiments of the invention may be used to estimate whether the user of a hearing aid may benefit from cochlear implantation.
The results were used to adjust the test scores to remove the effect of duration of hearing loss. Then, the probability of a person using hearing aids to score higher with cochlear implants during testing was estimated as the proportion of cochlear implant users whose adjusted score was higher than the non-implanted person's adjusted score.
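A minimal sketch of this estimation follows (Python; the ordinary-least-squares adjustment corresponds to the least squares method mentioned below, though the precise adjustment model is not reproduced here):

```python
import numpy as np

def adjust_for_duration(scores, durations):
    """Remove the linear effect of duration of hearing loss from test scores
    (ordinary least squares; a lasso fit could be substituted)."""
    scores = np.asarray(scores, dtype=float)
    durations = np.asarray(durations, dtype=float)
    slope, intercept = np.polyfit(durations, scores, 1)
    return scores - (slope * durations + intercept) + scores.mean()

def probability_of_higher_score(candidate_adjusted, ci_adjusted_scores):
    """Proportion of cochlear implant users whose adjusted score exceeds the
    non-implanted candidate's adjusted score."""
    return float(np.mean(np.asarray(ci_adjusted_scores) > candidate_adjusted))
```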
Further investigations of the impact of item number on estimated probabilities were carried out using simulated data from the estimated distributions of performance with cochlear implants.
The four rows depict different values for duration of hearing loss. The left panels depict estimates calculated using the least squares method, and the right panels depict estimates calculated using the lasso estimation method.
While development of certain embodiments of the software has been undertaken with regard to determining whether a subject would obtain better test results upon receiving cochlear implantation, it is to be understood that applications of the invention are not so limited. More simply, the test may be used to test the speech discrimination or speech comprehension abilities of a subject. Other applications include the ongoing monitoring of the speech discrimination or speech comprehension skills of a subject through repeated testing to, for example, identify instances of hearing degradation, or improvement in speech discrimination skills following training in the use of a device. In another application, the test may be used by those who do not currently use a hearing aid, cochlear implant or similar so as to determine whether a person has poor speech comprehension or discrimination abilities and may benefit from use of a hearing device such as a hearing aid, or otherwise may benefit from visiting a hearing professional for further analysis.
Having described the development of certain embodiments of the invention as set out above, now described with reference to
After initial power up, the main program module performs an equipment check at step B1 to ensure all components (for example, transducers and screens) of the system are functioning properly. Such a check may involve, for example, comparing the initial calibration data of the equipment with current measurements. In some embodiments, the various components of the equipment may be pre-calibrated together as a unit during manufacture or assembly. The calibration data may then be stored in a storage medium that is connected or attached to or sent together with the equipment. A determination is made at step B3 as to whether the equipment check passed, that is, whether the equipment is within a predetermined percentage of the initial calibration data. If the equipment check fails, then the main program module issues an equipment failure warning at step B4 and returns to the first step B1 to re-check the equipment.
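By way of non-limiting illustration, the check at steps B1 and B3 may be sketched as follows (Python; representing calibration data as named values and using a 10% tolerance are illustrative assumptions):

```python
def equipment_check(current, calibration, tolerance_pct=10.0):
    """Pass only if every measured value is within a predetermined percentage
    of its stored calibration value (steps B1 and B3)."""
    for name, cal_value in calibration.items():
        deviation = abs(current[name] - cal_value) / abs(cal_value) * 100.0
        if deviation > tolerance_pct:
            return False     # step B4: issue an equipment failure warning
    return True              # proceed to step B5 (obtain subject information)
```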
If the equipment check passes, then the main program module proceeds to obtain the subject's information at step B5. This can be done, for example, by prompting the patient to manually enter his or her information (for example: name, address, date of birth, years of hearing difficulties, years of using a hearing aid, etc.), or by loading the information from a previously stored patient file. Here, as throughout the description, manual prompting may be done visually by displaying the instructions as text on a screen, or by audio instructions via a transducer, or by a combination of both. At step B6, the main program module allows the subject to select which test is to be performed, for example a speech discrimination test in which: no noise is emitted during testing (‘Test A’); noise levels during testing are modified from sequence to sequence to ensure that presented consonant pairs are equally difficult to discriminate (‘Test B’); or noise levels are modified during testing based on previous responses so as to identify an SNR at which the subject correctly identifies ‘lone’ speech sounds at a predetermined ratio (‘Test C’). In the flow chart shown it is possible to select multiple tests to be performed one after the other. For simplicity, the flow chart exemplifies a main program module in which only options Test A and Test B are presented to the subject.
After the above selection, the main program module makes a determination as to whether Test A was selected at step B7. If Test A was selected, then at step B8, the main program module presents the subject with Test A according, in certain embodiments, to another module or sub-routine. If Test A was not selected, or otherwise upon completion of Test A, the main program module moves to step B9.
At step B9, the main program module makes a determination as to whether Test B was selected. If Test B was selected, then at step B10, the main program module presents the subject with Test B according, in certain embodiments, to another module or sub-routine. If Test B was not selected, or otherwise upon completion of Test B, the main program module moves to step B11.
At step B11, the main program module alerts the subject that he or she has completed the selected hearing tests. At step B12 the data processor may analyse data obtained during testing; however, this may alternatively be performed as part of the module or sub-routine of a completed test. At step B13 results and relevant analyses arising from the selected test(s) are presented to the subject, and the subject is presented with a list of options at step B14 which includes: forwarding the results to a nearby hearing professional at step B15, printing the results at step B16, or undertaking a new test session, in which case the main program module returns the subject to step B2.
Now described with reference to
The test commences at step C1 in which the subject has selected Test B from the main program module. At step C2 suitable volume levels for performing the test are determined. This may be manually chosen by the subject upon being presented with sound and a choice to raise or lower the presented sound so as to provide comfortable levels. Otherwise, it may be automatically selected by taking account of background noise levels as detected by a microphone where the test is being performed (so as to, for example, present the test at a predetermined level above background noise).
The test then proceeds to step C3, in which a selection is made of a consonant pair, and a noise/SNR level to present to the subject. This information may be stored in a memory of the system. The selection may be made randomly from stored consonant pairs, or in a set sequence. To avoid familiarity with the test, in an embodiment the selection is made randomly. At C4 a first speech sound is presented to the subject associated with a first visual representation. This step is repeated for a second speech sound and associated visual representation at C5, and for a third speech sound and associated visual representation at C6. At C7 the subject is presented with three visual representations and required to select which of the visual representations is associated with a ‘lone’ speech sound.
At C8 a time limit may be set on the amount of time provided to the subject to select a visual representation such that, if the subject does not respond in a given time, the subject is returned to step C4 so as to rehear the already-presented consonant pair triplet. In certain embodiments this step is removed since it provides the subject an opportunity to rehear a triplet that presented difficulties, which may not be desirable in certain embodiments. Otherwise, once the subject selects a visual representation the module is taken to step C9, where a determination is made as to whether the subject has correctly identified the visual representation associated with the lone speech sound. If the lone speech sound is correctly identified, it is added to the correct number count at C10. If the lone speech sound is not correctly identified, it is added to the incorrect number count at C11. In an embodiment, data pertaining to the consonant pair that obtained an incorrect response is stored to enable future improvement to the test (such as adjusting SNR levels for a given consonant pair if continued testing demonstrates that the paired speech sounds are more difficult to discriminate than previously understood).
Either way, following a correct response at C10 or an incorrect response at C11, the module is led to C12 in which a determination is made as to whether all consonant pair triplets have been presented to the subject. If not all consonant pair triplets have been presented to the subject, the module returns to C3 where a determination is made as to which consonant pair triplet to present to the subject next (while, in an embodiment, ensuring that consonant pair triplets are not repeatedly presented). If all consonant pair triplets have been presented then the test is stopped at C15 and the number of correct responses is recorded. In certain embodiments the recorded results may be retrieved later, which may be particularly useful where the test is applied to record ongoing speech discrimination skills of a subject through repeated testing over time.
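By way of non-limiting illustration, the Test B loop of steps C3 to C15 may be sketched as follows (Python; `present_with_noise` and `get_selection` are hypothetical audio-output and response-input helpers, `make_triplet` is the sketch given earlier, and the time-limit branch of step C8 is omitted for brevity):

```python
import random

def run_test_b(stimuli, present_with_noise, get_selection):
    """Run one pass of Test B. 'stimuli' is a list pairing each consonant
    pair with its per-pair SNR in dB."""
    correct = incorrect = 0
    for pair, snr_db in random.sample(stimuli, len(stimuli)):  # C3: random order, no repeats
        triplet, lone_index = make_triplet(*pair)  # odd-one-out triplet (sketched earlier)
        for sound in triplet:                      # C4-C6: present the three speech sounds
            present_with_noise(sound, snr_db)      # hypothetical audio helper
        if get_selection() == lone_index:          # C7/C9: compare response to lone sound
            correct += 1                           # C10: correct count
        else:
            incorrect += 1                         # C11: incorrect count
    return correct, incorrect                      # C15: record the counts
```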
Summary
Based on the above, the inventors have developed a speech discrimination test which may, in an embodiment, be used to determine if a hearing-impaired subject is likely to obtain better test results upon cochlear implantation, as well as for other applications. In an embodiment, since the test may not require the subject to recognise speech sounds as words, a speech discrimination test may be language-neutral, as well as neutral regarding the syntactic and semantic knowledge of the subject. In an embodiment, speech sounds utilised in a speech discrimination test may be specifically selected to enable speakers of most languages to undergo the test.
In an embodiment, the test may be subject-implemented without necessitating the involvement of a hearing professional, such as a physician. Rather, the results of testing, in an embodiment including the likelihood that cochlear implantation would improve any further test results, may be automatically sent to a hearing professional and/or to the person taking the test, such as via email, for further consideration.
Modifications and variations as would be deemed obvious to the person skilled in the art are included within the ambit of the present invention as defined in the claims.
Number | Date | Country | Kind |
---|---|---|---|
2018904679 | Dec 2018 | AU | national |
2019901071 | Mar 2019 | AU | national |
2019901407 | Apr 2019 | AU | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/IB2019/001317 | 12/6/2019 | WO | 00 |