Voice recognition method by analyzing syllables

Information

  • Patent Grant
  • 5129000
  • Patent Number
    5,129,000
  • Date Filed
    Wednesday, December 12, 1990
  • Date Issued
    Tuesday, July 7, 1992
Abstract
A voice recognition system is disclosed which has incorporated therein information on the phonological effects on syllables. The system receives a voice signal and tentatively identifies syllable arrays and provides a collection of data on syllables in the arrays. The data are used to generate hypothetical syllable arrays from the tentatively identified arrays. The hypothetical arrays are evaluated via arithmetic operations, taking into consideration the effects of context, the speaker's habit and dialect, thereby determining a reliable representation of the input voice signal.
Description
Claims
  • 1. A voice recognition method by analyzing syllables, said method comprising the steps of
  • comparing an input voice signal with standard syllable patterns to thereby extract candidate syllables sequentially from said input voice signal, each of said candidate syllables having a reliability score associated therewith,
  • generating hypothetical syllable arrays according to a predefined conversion rule from standard syllable arrays output from memory, each of said hypothetical syllable arrays being generated by modifying one of said standard syllable arrays and assigning a penalty value indicative of allowability of modification in said hypothetical syllable array, and
  • comparing said candidate syllables with said hypothetical syllable arrays and thereby selecting one of said hypothetical syllable arrays by computing evaluation scores based on some of said penalty values,
  • wherein said hypothetical syllable arrays are generated from corresponding ones of said standard syllable arrays by incorporating examples of commonly committed variations of pronunciation of said standard syllable arrays and said penalty values are determined according to allowability of each of said commonly committed variations of pronunciation.
  • 2. The method of claim 1 wherein said hypothetical syllable arrays are generated from corresponding ones of said standard syllable arrays by incorporating both examples of common syllables which sound alike and are therefore often confused and examples of commonly committed variations of pronunciation of said standard syllable arrays and said penalty values are determined according to severity of confusion and allowability of variation of pronunciation.
  • 3. A voice recognition method by analyzing syllables, said method comprising the steps of
  • comparing an input voice signal with standard syllable patterns to thereby extract candidate syllables sequentially from said input voice signal, each of said candidate syllables having a reliability score associated therewith,
  • generating hypothetical syllable arrays according to a predefined conversion rule from standard syllable arrays output from memory, each of said hypothetical syllable arrays being generated by modifying one of said standard syllable arrays and assigned a penalty value indicative of allowability of modification in said hypothetical syllable array, and
  • comparing said candidate syllables with said hypothetical syllable arrays and thereby selecting one of said hypothetical syllable arrays by computing evaluation scores based on some of said penalty values,
  • wherein said hypothetical syllable arrays are generated from corresponding ones of said standard syllable arrays by incorporating examples of common syllables which sound alike and are therefore often confused and said penalty values are determined according to severity of confusion.
  • 4. The method of claim 3 wherein said hypothetical syllable arrays are generated from corresponding ones of said standard syllable arrays by incorporating both examples of common syllables which sound alike and are therefore often confused and examples of commonly committed variations of pronunciation of said standard syllable arrays and said penalty values are determined according to severity of confusion and allowability of variation of pronunciation.
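The generating step recited in claims 1 through 4 can be illustrated with a minimal sketch. It is not the patented implementation: the names `Rule`, `Hypothesis`, `generate_hypotheses` and `EXAMPLE_RULES`, the romanized syllables, and the penalty magnitudes are all hypothetical. The sketch further assumes that a conversion rule rewrites one contiguous run of syllables, that each rule is applied at most once per hypothesis, and that the unmodified standard array is kept with a penalty of zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical conversion rule: rewrite `source` as `target` at a cost."""
    source: tuple   # run of syllables in the standard array
    target: tuple   # run of syllables that replaces it
    penalty: float  # larger value = less allowable modification (assumed scale)
    kind: str       # "confusion" (sound-alike) or "variation" (pronunciation)


@dataclass(frozen=True)
class Hypothesis:
    syllables: tuple
    penalty: float


def generate_hypotheses(standard, rules):
    """Return the standard array plus every single-rule modification of it."""
    standard = tuple(standard)
    best = {standard: 0.0}  # assumption: the unmodified array costs nothing
    for rule in rules:
        n = len(rule.source)
        for i in range(len(standard) - n + 1):
            if standard[i:i + n] == rule.source:
                modified = standard[:i] + rule.target + standard[i + n:]
                # If two rules yield the same array, keep the smaller penalty.
                if modified not in best or rule.penalty < best[modified]:
                    best[modified] = rule.penalty
    return [Hypothesis(s, p) for s, p in best.items()]


# Illustrative rules only; the values are not taken from the patent.
EXAMPLE_RULES = [
    Rule(("te", "i", "ru"), ("te", "ru"), 1.0, "variation"),  # contraction
    Rule(("be",), ("me",), 3.0, "confusion"),                 # sound-alike
]

if __name__ == "__main__":
    for h in generate_hypotheses(("ta", "be", "te", "i", "ru"), EXAMPLE_RULES):
        print(h.syllables, h.penalty)
```

Under these assumptions the standard array yields three hypotheses: itself, a contracted form with the variation penalty, and a sound-alike form with the confusion penalty. Claims 2 and 4 correspond to supplying rules of both kinds at once.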
Priority Claims (3)
Number Date Country Kind
61-78818 Apr 1986 JPX
61-78819 Apr 1986 JPX
61-78820 Apr 1986 JPX
BACKGROUND OF THE INVENTION

This is a continuation of application Ser. No. 07/568,547 filed Aug. 16, 1990, to be abandoned, which is a continuation of application Ser. No. 07/391,685 filed Aug. 9, 1989, now abandoned, which is a continuation of application Ser. No. 07/034,070 filed Apr. 2, 1987, now abandoned.

This invention relates to a voice recognition system for use, for example, with a voice word processor which can recognize input voice signals and display their contents on a cathode ray tube or the like.

Voice signals corresponding to continuously delivered speech are difficult to analyze because vowels and consonants rarely appear in standard forms. Some groups of letters are often contracted or even omitted entirely, depending on the context, the speaker's mannerisms, dialect, etc. In addition, there are many words derived from a foreign language for which there is as yet no commonly accepted pronunciation. Accordingly, in order to correctly identify a word, a phrase or a sentence from an input voice signal, as many phonological variations from standard forms as possible should be taken into consideration and be incorporated into a voice recognition system. With a conventional voice recognition system, however, words, phrases and sentences are often represented in phonologically fixed forms. Thus, correct interpretations cannot be expected unless syllable arrays are accurately pronounced.

It is therefore an object of the present invention to eliminate the aforementioned disadvantage of prior art voice recognition systems by providing a new system into which knowledge of phonological effects on syllables is incorporated so that it can be actively utilized for voice recognition.

The system of the present invention tentatively identifies syllable arrays from a received voice signal and provides a collection of data on syllables in the arrays. This collection of data is used to generate hypothetical syllable arrays from the tentatively identified arrays, and certain arithmetic operations are performed according to a predefined rule to evaluate these hypothetical syllable arrays. In generating such hypothetical syllable arrays, effects of the context, the speaker's habit and/or dialect are taken into consideration.
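The text above does not specify the arithmetic operations used for evaluation, so the following is only one plausible sketch. It assumes that the input has already been divided into segments, that each segment carries candidate syllables with reliability scores, that a hypothesis is comparable only when its length equals the number of segments, and that the evaluation score is the sum of matched reliabilities minus the hypothesis penalty. The names `evaluate`, `select_best`, `CANDIDATES` and `HYPOTHESES` and all numeric values are hypothetical.

```python
def evaluate(candidates, syllables, penalty, miss_score=0.0):
    """Score one hypothetical syllable array against segment-wise candidates.

    candidates: list of dicts, one per segment, mapping syllable -> reliability.
    Returns None when the hypothesis length does not match the segment count
    (a simplifying assumption; no re-alignment is attempted).
    """
    if len(candidates) != len(syllables):
        return None
    matched = sum(seg.get(syl, miss_score)
                  for seg, syl in zip(candidates, syllables))
    return matched - penalty


def select_best(candidates, hypotheses):
    """Return (score, syllables) of the best hypothesis, or None if none fit."""
    scored = []
    for syllables, penalty in hypotheses:
        score = evaluate(candidates, syllables, penalty)
        if score is not None:
            scored.append((score, syllables))
    return max(scored, key=lambda item: item[0]) if scored else None


# Illustrative data only: four segments with assumed reliability scores.
CANDIDATES = [
    {"ta": 0.9, "ka": 0.4},
    {"be": 0.6, "me": 0.7},
    {"te": 0.8},
    {"ru": 0.9},
]

# (syllable array, penalty) pairs; penalties are on an assumed scale.
HYPOTHESES = [
    (("ta", "be", "te", "i", "ru"), 0.0),  # standard form, five syllables
    (("ta", "be", "te", "ru"), 0.1),       # contracted pronunciation
    (("ta", "me", "te", "ru"), 0.4),       # contraction plus sound-alike
]

if __name__ == "__main__":
    print(select_best(CANDIDATES, HYPOTHESES))
```

With these numbers the contracted form is selected with a score of about 3.1. The sound-alike form matches a more reliable candidate in the second segment, but its larger penalty outweighs that gain, which is the trade-off the penalty values are meant to express.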

US Referenced Citations (8)
Number Name Date Kind
4060694 Suzuki et al. Nov 1977
4100370 Suzuki et al. Jul 1978
4581756 Togawa et al. Apr 1986
4590605 Hataoka et al. May 1986
4625287 Matsuura et al. Nov 1986
4665548 Kahn May 1987
4723290 Watanabe et al. Feb 1988
4783802 Takebayashi et al. Nov 1988
Foreign Referenced Citations (1)
Number Date Country
585995 Apr 1983 JPX
Non-Patent Literature Citations (1)
Entry
Nakatsu et al., "An Acoustic Processor in a Conversational Speech Recognition System", Rev. Elect. Comm. Labs., vol. 26, No. 11-12, pp. 1486-1504 Nov. 1978.
Continuations (3)
Number Date Country
Parent 568547 Aug 1990
Parent 391685 Aug 1989
Parent 34070 Apr 1987