Recognition confidence measuring by lexical distance between candidates

Description

BRIEF DESCRIPTION OF THE DRAWINGS

These and/or other aspects, features, and advantages of the present invention will become apparent and more readily appreciated from the following detailed description, taken in conjunction with the accompanying drawings of which:

FIG. 1 illustrates an example of extracting a candidate in a confidence measurement system according to the conventional art;

FIG. 2 is a configuration diagram illustrating a recognition confidence measurement system according to an exemplary embodiment of the present invention;

FIG. 3 is a flowchart illustrating a method of detecting a feature vector from an input speech signal by a phoneme string extraction unit according to an exemplary embodiment of the present invention;

FIGS. 4 and 5 are flowcharts illustrating an example of estimating a phoneme confusion matrix according to an exemplary embodiment of the present invention;

FIG. 6 is a flowchart illustrating a recognition confidence measurement method according to another exemplary embodiment of the present invention; and

FIG. 7 is a schematic diagram illustrating an example of a recognition confidence measurement method according to still another exemplary embodiment of the present invention.

Claims

1. A recognition confidence measurement method comprising: extracting a phoneme string from a feature vector of an input speech signal; extracting candidates by matching the extracted phoneme string and phoneme strings of vocabularies registered in a predetermined dictionary; estimating a lexical distance between the extracted candidates; and determining whether the input speech signal is in-vocabulary, based on the lexical distance.
2. The method of claim 1, wherein the extracting of the phoneme string extracts an optimum phoneme string according to each language by using a Hidden Markov Model (HMM) and a predetermined phoneme grammar for each language.
3. The method of claim 1, wherein the extracting of the candidates comprises: calculating a similarity between the extracted phoneme string and the phoneme strings of the vocabularies; and extracting the candidates based on the calculated similarity.
4. The method of claim 3, wherein the calculating of the similarity comprises estimating a phoneme confusion matrix.
5. The method of claim 4, wherein estimating the phoneme confusion matrix comprises: allocating an initial value to a distance of phoneme-by-phoneme; and performing a phoneme recognition using a training database.
6. The method of claim 5, wherein the performing of the phoneme recognition comprises: performing a dynamic matching with respect to a result of the phoneme recognition and a phoneme string corresponding to vocabularies of the training database; estimating an optimum matching pair by back tracking; and estimating a number of matchings of the phoneme-by-phoneme and updating the distance.
7. The method of claim 4, wherein estimating the phoneme confusion matrix comprises: estimating a continuous HMM or a semi-continuous HMM for each phoneme by using a training database; and estimating a distance of phoneme-by-phoneme.
8. The method of claim 7, wherein the estimating of the distance comprises: utilizing a Bhattacharyya distance, in the case of the continuous HMM; and estimating an amount of information loss, in the case of the semi-continuous HMM.
9. The method of claim 1, wherein the estimating of the lexical distance comprises: selecting a pair of candidates from the extracted candidates; performing a dynamic matching of the selected pair of candidates; calculating a score for the pair of candidates; and estimating the lexical distance using the calculated score.
10. The method of claim 9, wherein the calculating of the score calculates the score using a predetermined phoneme confusion matrix.
11. The method of claim 9, wherein the determining whether the input speech signal is in-vocabulary comprises: determining the input speech signal as in-vocabulary, when the calculated score satisfies a set numerical value; and determining the input speech signal as out-of-vocabulary, when the calculated score does not satisfy the set numerical value.
12. The method of claim 9, wherein the determining whether the input speech signal is in-vocabulary comprises: utilizing a predetermined weight for the calculated score to correct the calculated score.
13. The method of claim 11, wherein the determining the input speech signal as out-of-vocabulary comprises: performing a rejection due to a recognition error with respect to the input speech signal.
14. A computer readable storage medium storing a program for implementing a recognition confidence measurement method comprising: extracting a phoneme string from a feature vector of an input speech signal; extracting candidates by matching the extracted phoneme string and phoneme strings of vocabularies registered in a predetermined dictionary; estimating a lexical distance between the extracted candidates; and determining whether the input speech signal is in-vocabulary, based on the lexical distance.
15. A recognition confidence measurement system comprising: a phoneme string extraction unit extracting a phoneme string from a feature vector of an input speech signal; a candidate extraction unit extracting candidates by matching the extracted phoneme string and phoneme strings of vocabularies registered in a predetermined dictionary; a distance estimation unit estimating a lexical distance between the extracted candidates; and a registration determination unit determining whether the input speech signal is in-vocabulary, based on the lexical distance.
16. The system of claim 15, wherein the phoneme string extraction unit extracts an optimum phoneme string according to each language by using a Hidden Markov Model (HMM) and a predetermined phoneme grammar for each language.
17. The system of claim 15, wherein the candidate extraction unit calculates a similarity between the extracted phoneme string and the phoneme strings of the vocabularies, and extracts the candidates based on the calculated similarity.
18. The system of claim 15, wherein the distance estimation unit performs a dynamic matching of a pair of candidates selected from the extracted candidates, calculates a score for the pair of candidates, and estimates the lexical distance using the calculated score.
19. The system of claim 18, wherein the registration determination unit determines the input speech signal as in-vocabulary, when the calculated score satisfies a set numerical value, and determines the input speech signal as out-of-vocabulary, when the calculated score does not satisfy the set numerical value.
20. The system of claim 18, wherein the registration determination unit utilizes a predetermined weight for the calculated score to correct the calculated score.
21. The system of claim 18, wherein the registration determination unit performs a rejection due to a recognition error with respect to the input speech signal.
22. A recognition confidence measurement method comprising: extracting candidates by matching a phoneme string of a speech signal and phoneme strings of vocabularies registered in a predetermined dictionary; estimating a lexical distance between the extracted candidates; and determining whether the speech signal is in-vocabulary, based on the lexical distance.
23. A medium comprising computer readable instructions implementing the method of claim 22.
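
The steps recited in claims 9 through 12 (selecting a pair of candidates, dynamic matching, scoring with a phoneme confusion matrix, and comparing a weighted score against a set value) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names, the unit insertion/deletion cost, the length normalization, the averaging over candidate pairs, and the convention that a small distance "satisfies" the set value are all hypothetical choices that the claims do not specify.

```python
from itertools import combinations


def make_sub_cost(confusion):
    """Build a substitution-cost function from a phoneme confusion table.

    `confusion` maps (phoneme, phoneme) pairs to a distance. Pairs missing
    from the table default to 1.0 (an assumption for this sketch).
    """
    def sub_cost(p, q):
        if p == q:
            return 0.0
        return confusion.get((p, q), confusion.get((q, p), 1.0))
    return sub_cost


def dynamic_match(a, b, sub_cost, indel=1.0):
    """Dynamic matching of two phoneme sequences (weighted edit distance)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = i * indel
    for j in range(1, cols):
        d[0][j] = j * indel
    for i in range(1, rows):
        for j in range(1, cols):
            d[i][j] = min(
                d[i - 1][j] + indel,                             # deletion
                d[i][j - 1] + indel,                             # insertion
                d[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),  # match/substitution
            )
    return d[-1][-1]


def lexical_distance(candidates, sub_cost):
    """Mean length-normalized matching score over all candidate pairs."""
    pairs = list(combinations(candidates, 2))
    if not pairs:
        return 0.0
    total = sum(
        dynamic_match(a, b, sub_cost) / max(len(a), len(b), 1)
        for a, b in pairs
    )
    return total / len(pairs)


def is_in_vocabulary(candidates, sub_cost, threshold=0.5, weight=1.0):
    """Assumed decision rule: accept when the weighted distance is small."""
    return weight * lexical_distance(candidates, sub_cost) <= threshold


# Hypothetical confusion table: /b/ and /p/ are easily confused.
sub_cost = make_sub_cost({("b", "p"): 0.2})
close = [["b", "a", "t"], ["p", "a", "t"]]
far = [["b", "a", "t"], ["s", "i", "n", "g"]]
print(is_in_vocabulary(close, sub_cost), is_in_vocabulary(far, sub_cost))
```

In this sketch the `weight` parameter stands in for the predetermined weight of claim 12, and `threshold` for the set numerical value of claim 11. Both would need to be tuned on real data, and the direction of the comparison should be checked against the detailed description.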

Priority Claims (1)

Number	Date	Country	Kind
10-2006-0012528	Feb 2006	KR	national
