Claims
- 1. A method for utilizing a speech utterance to verify an identity of a person, comprising the steps of:
- collecting a representation of an identity asserted by the person;
- collecting an uttered phrase from the person which has a predetermined sequence and converting the phrase into data characteristic of the person's voice;
- comparing the characteristic data with other stored data characteristic of the phrase as spoken by the person to produce a match or a mismatch;
- confirming the identity of the person on a match of said comparison;
- denying the identity of the person on a mismatch of said comparison;
- prompting said speaker a second time in response to a mismatch of said comparison to repeat said phrase;
- comparing characteristic data of the repeated phrase with said other data to again determine a match or mismatch;
- confirming or denying the identity of the person based upon said second comparison; and
- updating said other stored data on determining a match of said comparison by averaging said characteristic data with said other stored data and storing the result thereof for use in subsequent comparisons.
- 2. The method of claim 1 further including inputting said identity asserted by the person as non-spoken identity information.
- 3. The method of claim 1 further including inputting said phrase including information containing said non-spoken identity information.
- 4. The method of claim 3 further including inputting said phrase including information containing a fixed text statement.
- 5. The method of claim 4 further including inputting said fixed text statement including information containing a pair of two-syllable words.
- 6. The method of claim 5 further including inputting said fixed text statement including information containing a pair of words selected for constancy over a geographical area.
- 7. The method of claim 6 further including inputting information containing the first word of said pair of words which includes a place name, and said second word comprises a geographical feature.
- 8. The method of claim 1 further including prompting the person with a phrase having a number of randomly arranged digits.
- 9. The method of claim 8 further including rearranging said words on each attempt by the person to gain access.
- 10. The method of claim 1 wherein said collecting step comprising collecting an identity spoken by the person.
- 11. The method of claim 10 further including processing said spoken identity to determine the words characteristic of the identity.
- 12. The method of claim 11 further including processing said speaker identity a second time to determine the claimed identity of the person.
- 13. The method of claim 1 wherein said comparison is carried out by forming a numerical result thereof and determining whether said numerical result is greater or less than a threshold value to thereby produce said match or mismatch.
- 14. The method of claim 13 further including performing plural comparisons on repeated attempts of speaker verification, and forming a different threshold value for use in each said comparison.
- 15. The method of claim 14 further including utilizing a different threshold in connection with each said comparison, and determining whether said numerical result is greater than or less than said different threshold value.
- 16. The method of claim 13 further including setting said threshold by analyzing a histogram indicating the result of plural true speakers attempting said access and plural imposter speakers attempting said access.
- 17. The method of claim 13 further including maintaining a cumulative average of a number of said numerical results as computed over multiple attempts of the person to gain said access, and maintaining a global average of numerical results of plural different persons attempting to gain said access, and changing the magnitude of a current numerical result in a direction to cause a match when said cumulative average is greater than said global average.
- 18. A method of verifying the identity of individuals using voice as an identifying characteristic, comprising the steps of:
- sampling speech signals and converting said signals into digital data;
- converting said digital data by a linear predictive coding (LPC) technique to define an LPC vector;
- converting said LPC vector into simulated amplitude vectors;
- converting the amplitude vectors into respective uncorrelated features defining principal spectral components (PSC);
- arranging the principal spectral components into a file having a number of frames;
- time warping said file to match a stored reference file of principal spectral component data previously developed on enrollment of said individual;
- processing the time warped principal spectral component data with the stored reference principal spectral component data to produce a numerical result;
- comparing said numerical result with a threshold value; and
- confirming or denying the identity of the person based upon the result of said comparison.
- 19. The method of claim 18 further including repeating the method an additional time on a second attempt if access is denied to the individual on a first attempt.
- 20. The method of claim 18 further including changing said threshold value on said second attempt.
- 21. The method of claim 18 further including prompting the individual with a fixed text statement which is repeated by the individual to form the speech signals.
- 22. The method of claim 21 further including prompting the individual with an identity code and with said fixed text statement.
- 23. The method of claim 22 further including prompting the individual with a fixed text statement comprising a pair of two-syllable words to enhance speaker consistency.
- 24. The method of claim 18 further including prompting said individual with a secondary phrase comprising a randomized sequence of words.
- 25. The method of claim 18 further including modifying said numerical result on an individual basis if a cumulative average numerical result of the individual is greater than a global average of numerical results of plural different individuals.
- 26. The method of claim 25 further including modifying said numerical result in a direction so as to grant access to the individual.
- 27. A method of providing access control utilizing speech utterances in a number of attempts defining a session, to verify an identity of a person, comprising the steps of:
- forming a reference template of parameters characteristic of the speaker's speech;
- converting signals of the speaker into similar parameters when said speaker seeks access;
- processing said reference template with said similar parameters to derive a result $d_1$ indicative of the similarity between said parameters;
- comparing said result $d_1$ with a threshold value $\theta_1$ in a first attempt;
- granting or denying access to the person based upon the result of said first attempt comparison; and
- comparing a result $d_2$ with a threshold value $\theta_2$ on a second attempt if said first attempt results in a denial of said access.
- 28. The method of claim 27 wherein $\theta_1$ is less than $\theta_2$.
- 29. The method of claim 27 further including comparing a result $d'_2$ with a threshold value $\theta_{2x}$ on said second attempt, where $d'_2$ is an average of $d_1$ and $d_2$.
- 30. The method of claim 29 further including comparing $d'_2$ with a threshold $\theta_{2x}$ which is different than $\theta_1$ or $\theta_2$.
- 31. The method of claim 30 wherein said threshold value $\theta_{2x}$ is given by $\theta_1 < \theta_{2x} < \theta_2$.
- 32. The method of claim 27 further including comparing a numerical result $d_3$ with a threshold value $\theta_{1x}$ on a third attempt, where $\theta_{1x}$ is less than $\theta_1$ or $\theta_2$.
- 33. The method of claim 32 further including comparing a result $d'_3$ defined by an average of $d_3$ and $d_2$ with a threshold $\theta_{2x}$.
- 34. The method of claim 33 further including defining $\theta_{2x}$ as being greater than $\theta_1$.
- 35. The method of claim 34 and further including defining $\theta_{2x}$ as being greater than $\theta_{1x}$.
- 36. The method of claim 32 further including comparing a result $d''_3$ defined by an average of $d_1$ and $d_3$, with a threshold $\theta_{2x}$.
- 37. The method of claim 32 further including comparing a result $d'''_3$ defined by an average of $d_1$, $d_2$ and $d_3$, with a threshold $\theta_3$.
- 38. The method of claim 37, further including defining $\theta_3$ as being greater than $\theta_1$ and $\theta_2$.
- 39. A method of providing access control utilizing a speech utterance to verify an identity of a person, comprising the steps of:
- forming a reference template of parameters characteristic of the speaker's speech;
- prompting the speaker to enter a fixed text statement;
- prompting the person to enter a randomized text statement comprising a number of words which are randomly arranged, and which arrangement is generally unknown to the speaker;
- converting the speech signals of said fixed text statement and the speech signals of said randomized text statement voiced by the person into test parameters;
- processing said test parameters with said reference template parameters to derive a result indicative of a match or mismatch therebetween; and
- allowing access to the person if a match is found and denying access to the person if a mismatch is found.
- 40. The method of claim 39 wherein said randomized text statement is derived by selecting a digit sequence having minimal coarticulation effects between the words thereof.
- 41. The method of claim 40 further including forming a reference template associated with said randomized text statement using a first sequence of digits comprising "1,0,3,5,8" and a second sequence of digits comprising "9,7,2,4,6".
- 42. The method of claim 39 further including forming a randomized text reference template using a specific sequence of words, and randomly arranging said words for prompting the person, and comparing parameters of the randomly arranged words spoken by the person with said randomized text reference template.
- 43. The method of claim 42 further including forming a randomized text reference template for each word of the sequence.
- 44. The method of claim 43 further including forming said randomized text reference template using said specific sequence of digits voiced by a population of persons, and deriving speaker-independent digit templates for each said word.
- 45. The method of claim 44 further including forming said randomized text reference template by capturing an utterance of the specified words by a person to be enrolled, transforming the utterance into digital signal values and processing said digital signal values with the randomized text reference template to find a minimum Euclidean distance therebetween, and forming a new template for each word of the randomized text phrase using said minimum distance.
- 46. The method of claim 45 further including forming said new template by scanning each digit field of the person to be enrolled by the speaker-independent template to form a speaker dependent template.
- 47. Apparatus for utilizing a speech utterance to verify an identity of a person, comprising:
- a prompter responsive to an identity asserted by the person for prompting the person to utter a phrase having a predetermined sequence of words;
- a converter for collecting a representation of an uttered phrase of a predetermined sequence of words, and converting the representation into data characteristic of the person's voice;
- a comparator for comparing the characteristic data with other stored data characteristic of the phrase as spoken by the person to produce a match or a mismatch;
- a decision circuit for confirming the identity of the person on a match of said comparison and denying the identity of the person on a mismatch of said comparison;
- a comparator for comparing characteristic data of a repeated phrase on a second attempt by the person with said other data to again determine a match or mismatch;
- a decision circuit for confirming or denying the identity of the person based upon said second comparison; and
- means for updating said other stored data on determining said acceptance by averaging said characteristic data with said other stored data and means for storing the result thereof for use in subsequent comparisons.
- 48. The access control apparatus of claim 47 further including means for inputting said phrase including information containing said non-spoken identity information.
- 49. The access control apparatus of claim 48 further including means for inputting said phrase including information containing a fixed text statement.
- 50. The access control apparatus of claim 49 further including means for inputting said fixed text statement including information containing a pair of two-syllable words.
- 51. The access control apparatus of claim 50 further including means for inputting said fixed text statement including information containing a pair of words selected for constancy over a geographical area.
- 52. The access control apparatus of claim 51 further including means for inputting information containing the first word of said pair of words which includes a place name, and said second word which comprises a geographical feature.
- 53. The access control apparatus of claim 47 further including means for prompting the person with a phrase having a number of randomly arranged words.
- 54. The access control apparatus of claim 53 further including means for rearranging said words on each attempt by the person to gain access.
- 55. The access control apparatus of claim 47 wherein said collecting step comprises means for collecting an identity spoken by the person.
- 56. The access control apparatus of claim 55 further including means for processing said spoken identity to determine the words characteristic of the identity.
- 57. The access control apparatus of claim 56 further including means for processing said speaker identity a second time to determine the claimed identity of the person.
- 58. The access control apparatus of claim 47 wherein said comparison is carried out utilizing means for forming a numerical result thereof and means for determining whether said numerical result is greater or less than a threshold value to thereby produce said match or mismatch.
- 59. The access control apparatus of claim 58 further including means for performing plural comparisons on repeated attempts of speaker verification, and means for forming a different threshold value for use in each said comparison.
- 60. The access control apparatus of claim 59 further including means for utilizing a different threshold in connection with each said comparison, and means for determining whether said numerical result is greater than or less than said different threshold value.
- 61. The access control apparatus of claim 59 further including means for setting said threshold by analyzing a histogram indicating the result of plural true speakers attempting said access and plural imposter speakers attempting said access.
- 62. The access control apparatus of claim 58 further including means for maintaining a cumulative average of a number of said numerical results as computed over multiple attempts of the person to gain said access, and means for maintaining a global average of numerical results of plural different persons attempting to gain said access, and means for changing the magnitude of a current numerical result in a direction to cause a match when said cumulative average is greater than said global average.
- 63. The access control apparatus of claim 62 further including means for modifying said numerical result only when said current numerical result is less than said global average.
- 64. Apparatus for providing personnel access control utilizing speech utterances in a number of attempts defining a session, to verify an identity of a person, comprising:
- a processor for forming a reference template of parameters characteristic of the speaker's speech;
- a converter for converting signals of the speaker into similar parameters when said speaker seeks access;
- a processor for processing said reference template with said similar parameters to derive a result $d_1$ indicative of the similarity between said parameters;
- a comparator for comparing said result $d_1$ with a threshold value $\theta_1$ in a first attempt for granting or denying access to the person based upon the result of said first attempt comparison; and
- a comparator for comparing a result $d_2$ with a threshold value $\theta_2$ on a second attempt if said first attempt results in a denial of said access.
- 65. The apparatus of claim 64 wherein $\theta_1$ is less than $\theta_2$.
- 66. The apparatus of claim 64 further including means for comparing a result $d'_2$ with a threshold value $\theta_{2x}$ on said second attempt, where $d'_2$ is an average of $d_1$ and $d_2$.
- 67. The apparatus of claim 66 further including means for comparing $d'_2$ with a threshold $\theta_{2x}$ which is different than $\theta_1$ or $\theta_2$.
- 68. The apparatus of claim 67 wherein said threshold value $\theta_{2x}$ is given by $\theta_1 < \theta_{2x} < \theta_2$.
- 69. The apparatus of claim 64 further including means for comparing a numerical result $d_3$ with a threshold value $\theta_{1x}$ on a third attempt, where $\theta_{1x}$ is less than $\theta_1$ or $\theta_2$.
- 70. The apparatus of claim 69 further including means for comparing a result $d'_3$ defined by an average of $d_3$ and $d_2$ with a threshold $\theta_{2x}$.
- 71. The apparatus of claim 69 further including means for comparing a result $d''_3$ defined by an average of $d_1$ and $d_3$, with a threshold $\theta_{2x}$.
- 72. The apparatus of claim 69 further including means for comparing a result $d'''_3$ defined by an average of $d_1$, $d_2$ and $d_3$, with a threshold $\theta_3$.
- 73. The apparatus of claim 72 further including means for defining $\theta_{2x}$ as being greater than $\theta_1$.
- 74. The apparatus of claim 73 and further including means for defining $\theta_{2x}$ as being greater than $\theta_{1x}$.
- 75. The apparatus of claim 72 further including means for defining $\theta_3$ as being greater than $\theta_1$ and $\theta_2$.
- 76. Apparatus for providing personnel access control utilizing a speech utterance to verify an identity of a person, comprising:
- a processor for forming a reference template of parameters characteristic of the speaker's speech;
- a prompter for prompting the speaker to enter a fixed text statement;
- a prompter for prompting the person to enter a variable text statement comprising a number of words which are randomly arranged, and which arrangement is generally unknown to the speaker;
- a converter for converting the speech signals of said fixed text statement and the speech signals of said variable text statement voiced by the person into test parameters; and
- a processor for processing said test parameters with said reference template parameters to derive a result indicative of a match or mismatch therebetween and for allowing access to the person if a match is found and for denying access to the person if a mismatch is found.
- 77. The apparatus of claim 76 wherein said variable text statement is derived by a selector means for selecting a digit sequence having minimal coarticulation effects between the words thereof.
- 78. The apparatus of claim 77 further including means for forming a reference template associated with said variable text statement using a first sequence of digits comprising "1,0,3,5,8" and a second sequence of digits comprising "9,7,2,4,6".
- 79. The apparatus of claim 76 further including means for forming a variable text reference template using a specific sequence of words, and means for randomly arranging said words for prompting the person, and means for comparing parameters of the randomly arranged words spoken by the person with said variable text reference template.
- 80. The apparatus of claim 79 further including means for forming a variable text reference template for each word of the sequence.
- 81. The apparatus of claim 80 further including means for forming said variable text reference template using said specific sequence of digits voiced by a population of persons, and means for deriving speaker-independent digit templates for each said word.
- 82. The apparatus of claim 81 further including means for forming said variable text reference template by capturing an utterance of the specified words by a person to be enrolled, means for transforming the utterance into a digital field, and means for processing said digital field with the variable text reference template to find a minimum Euclidean distance therebetween, and means for forming a new template for each word of the variable text phrase using said minimum distance.
- 83. The apparatus of claim 82 further including means for forming said new template by scanning each digit field of the person to be enrolled by the speaker-independent template to form a speaker dependent template.
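Illustrative note (not part of the claims): the two-attempt threshold comparison recited in claims 27 to 29 can be sketched informally in Python. The sketch rests on assumptions the claims do not state: that each result is a distance (a smaller value means a closer match), that access is granted when a result falls below its threshold, and that the second attempt accepts on either the raw second result or the average of both results. The function name `verify_session` and the numeric values in the usage note are hypothetical.

```python
from typing import Sequence


def verify_session(results: Sequence[float],
                   theta_1: float,
                   theta_2: float,
                   theta_2x: float) -> bool:
    """Return True if access is granted within at most two attempts.

    Assumption (not stated in the claims): each result is a distance,
    so a smaller value is a better match and acceptance means
    result < threshold.
    """
    if not results:
        return False

    d1 = results[0]
    # First attempt: compare the first result with the first threshold.
    if d1 < theta_1:
        return True

    if len(results) < 2:
        return False

    d2 = results[1]
    # Second attempt, reached only after a first-attempt denial:
    # compare the second result with the second threshold.
    if d2 < theta_2:
        return True

    # Assumed combination rule: also accept if the average of both
    # attempts falls below the intermediate threshold theta_2x.
    d2_avg = (d1 + d2) / 2.0
    return d2_avg < theta_2x
```

For example, with hypothetical thresholds 1.0, 2.0 and 1.8, `verify_session([1.5, 1.9], 1.0, 2.0, 1.8)` fails the first comparison and succeeds on the second, while `verify_session([3.0, 3.0], 1.0, 2.0, 1.8)` is denied on both.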
Government Interests
This invention was made with Government support under Contract No. F30602-84-C-0030 awarded by the Department of the Air Force. The Government has certain rights in this invention.