Voice recognition method by analyzing syllables

Information

Patent Grant
5129000

Patent Number
5,129,000
Date Filed
Wednesday, December 12, 1990
Date Issued
Tuesday, July 7, 1992

Inventors
- Atsuo Tanaka
Original Assignees
- Sharp Corporation
Examiners
- Kemeny; Emanuel S.
- Doerrler; Michelle
Agents
- Morrison & Foerster

CPC
- G10L15/187 - Phonemic context
US Classifications
- 381 - Electrical audio signal processing systems and devices
Field of Search
- US
- 381/41-46
- 364/513.5
International Classifications
- G10L 9/06

Abstract

A voice recognition system is disclosed which has incorporated therein information on the phonological effects on syllables. The system receives a voice signal and tentatively identifies syllable arrays and provides a collection of data on syllables in the arrays. The data are used to generate hypothetical syllable arrays from the tentatively identified arrays. The hypothetical arrays are evaluated via arithmetic operations, taking into consideration the effects of context, the speaker's habit and dialect, thereby determining a reliable representation of the input voice signal.

Description

Claims

1. A voice recognition method by analyzing syllables, said method comprising the steps of
comparing an input voice signal with standard syllable patterns to thereby extract candidate syllables sequentially from said input voice signal, each of said candidate syllables having a reliability score associated therewith,
generating hypothetical syllable arrays according to a predefined conversion rule from standard syllable arrays output from memory, each of said hypothetical syllable arrays being generated by modifying one of said standard syllable arrays and assigning a penalty value indicative of allowability of modification in said hypothetical syllable array, and
comparing said candidate syllables with said hypothetical syllable arrays and thereby selecting one of said hypothetical syllable arrays by computing evaluation scores based on some of said penalty values,
wherein said hypothetical syllable arrays are generated from corresponding ones of said standard syllable arrays by incorporating examples of commonly committed variations of pronunciation of said standard syllable arrays and said penalty values are determined according to allowability of each of said commonly committed variations of pronunciation.
2. The method of claim 1 wherein said hypothetical syllable arrays are generated from corresponding ones of said standard syllable arrays by incorporating both examples of common syllables which sound alike and are therefore often confused and examples of commonly committed variations of pronunciation of said standard syllable arrays and said penalty values are determined according to severity of confusion and allowability of variation of pronunciation.
3. A voice recognition method by analyzing syllables, said method comprising the steps of
comparing an input voice signal with standard syllable patterns to thereby extract candidate syllables sequentially from said input voice signal, each of said candidate syllables having a reliability score associated therewith,
generating hypothetical syllable arrays according to a predefined conversion rule from standard syllable arrays output from memory, each of said hypothetical syllable arrays being generated by modifying one of said standard syllable arrays and assigned a penalty value indicative of allowability of modification in said hypothetical syllable array, and
comparing said candidate syllables with said hypothetical syllable arrays and thereby selecting one of said hypothetical syllable arrays by computing evaluation scores based on some of said penalty values,
wherein said hypothetical syllable arrays are generated from corresponding ones of said standard syllable arrays by incorporating examples of common syllables which sound alike and are therefore often confused and said penalty values are determined according to severity of confusion.
4. The method of claim 3 wherein said hypothetical syllable arrays are generated from corresponding ones of said standard syllable arrays by incorporating both examples of common syllables which sound alike and are therefore often confused and examples of commonly committed variations of pronunciation of said standard syllable arrays and said penalty values are determined according to severity of confusion and allowability of variation of pronunciation.
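The generating step recited in the claims can be illustrated with a minimal Python sketch. It is not taken from the patent: the `Rule` and `Hypothesis` types, the romanized syllables, the two example rules and their penalty values are all hypothetical, and only one rule is applied per hypothesis to keep the sketch short. Each rule rewrites part of a standard syllable array and carries a penalty; a lower penalty stands for a more allowable modification.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Rule:
    """Hypothetical conversion rule: replace `source` syllables with `target`."""
    source: Tuple[str, ...]
    target: Tuple[str, ...]
    penalty: float  # assumed scale: lower means the change is more allowable


@dataclass(frozen=True)
class Hypothesis:
    syllables: Tuple[str, ...]
    penalty: float


def generate_hypotheses(standard: Sequence[str], rules: Sequence[Rule]) -> List[Hypothesis]:
    """Return the unmodified array plus every single-rule modification of it."""
    base = tuple(standard)
    hypotheses = [Hypothesis(base, 0.0)]  # the standard form carries no penalty
    for rule in rules:
        n = len(rule.source)
        # Try the rule at every position where its source syllables occur.
        for i in range(len(base) - n + 1):
            if base[i:i + n] == rule.source:
                modified = base[:i] + rule.target + base[i + n:]
                hypotheses.append(Hypothesis(modified, rule.penalty))
    return hypotheses


# Illustrative rules only (assumptions, not from the patent text).
RULES = [
    Rule(("i", "i"), ("i",), 0.5),  # pronunciation variation: a contracted long vowel
    Rule(("ku",), ("gu",), 1.0),    # similar-sounding syllables that may be confused
]

hypotheses = generate_hypotheses(["ta", "i", "i", "ku"], RULES)
for h in hypotheses:
    print("-".join(h.syllables), h.penalty)
```

Keeping the penalty on the rule rather than on the word means the same confusion or contraction can be reused for every standard array stored in memory.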

Priority Claims (3)

Number	Date	Country
61-78818	Apr 1986	JPX
61-78819	Apr 1986	JPX
61-78820	Apr 1986	JPX

BACKGROUND OF THE INVENTION

This is a continuation of application Ser. No. 07/568,547 filed Aug. 16, 1990, to be abandoned, which is a continuation of application Ser. No. 07/391,685 filed Aug. 9, 1989, now abandoned, which is a continuation of application Ser. No. 07/034,070 filed Apr. 2, 1987, now abandoned.
This invention relates to a voice recognition system for use, for example, with a voice word processor which can recognize input voice signals and display their contents on a cathode ray tube or the like.
Voice signals corresponding to continuously delivered speech are difficult to analyze because vowels and consonants rarely appear in standard forms. Some groups of letters are often contracted or even omitted entirely, depending on the context, the speaker's mannerisms, dialect, etc. In addition, there are many words derived from a foreign language for which there is as yet no commonly accepted pronunciation. Accordingly, in order to correctly identify a word, a phrase or a sentence from an input voice signal, as many phonological variations from standard forms as possible should be taken into consideration and incorporated into a voice recognition system. With a conventional voice recognition system, however, words, phrases and sentences are often represented in phonologically fixed forms. Thus, correct interpretations cannot be expected unless syllable arrays are accurately pronounced.
It is therefore an object of the present invention to eliminate the aforementioned disadvantage of prior art voice recognition systems by providing a new system into which knowledge of phonological effects on syllables is incorporated so that it can be actively utilized for voice recognition.
The system of the present invention tentatively identifies syllable arrays from a received voice signal and provides a collection of data on syllables in the arrays. This collection of data is used to generate hypothetical syllable arrays from the tentatively identified arrays, and certain arithmetic operations are performed according to a predefined rule to evaluate these hypothetical syllable arrays. In generating such hypothetical syllable arrays, effects of the context, the speaker's habit and/or dialect are taken into consideration.
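The text does not spell out the arithmetic used to evaluate the hypothetical syllable arrays, so the following Python sketch shows only one plausible scheme under stated assumptions: each input segment offers candidate syllables with reliability scores, a hypothesis scores the sum of the reliabilities of its syllables minus its penalty, and hypotheses whose length differs from the number of segments are skipped. The function names, the data layout and all numbers are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Per input segment: candidate syllable -> reliability score (assumed layout).
Candidates = List[Dict[str, float]]
# A hypothetical syllable array together with its penalty value.
Hypothesis = Tuple[Tuple[str, ...], float]


def evaluate(hypothesis: Hypothesis, candidates: Candidates) -> Optional[float]:
    """Score one hypothesis, or return None if it cannot be aligned."""
    syllables, penalty = hypothesis
    if len(syllables) != len(candidates):
        return None  # simplification: no insertion/deletion alignment here
    # A syllable missing from a segment's candidate list contributes 0.
    reliability = sum(seg.get(s, 0.0) for s, seg in zip(syllables, candidates))
    return reliability - penalty


def select_best(hypotheses: List[Hypothesis], candidates: Candidates) -> Optional[Hypothesis]:
    """Return the alignable hypothesis with the highest evaluation score."""
    scored = []
    for h in hypotheses:
        score = evaluate(h, candidates)
        if score is not None:
            scored.append((score, h))
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]


# Illustrative data only: three segments were detected in the input.
candidates = [
    {"ta": 0.9, "ka": 0.4},
    {"i": 0.8, "hi": 0.3},
    {"ku": 0.7, "gu": 0.5},
]
hypotheses = [
    (("ta", "i", "i", "ku"), 0.0),  # standard form, four syllables
    (("ta", "i", "ku"), 0.5),       # contracted variant
    (("ta", "i", "gu"), 1.5),       # contracted and confused variant
]
best = select_best(hypotheses, candidates)
print(best)
```

In this example the standard four-syllable form cannot be matched against three segments, so the contracted variant wins despite its penalty, which is the kind of outcome the background describes for speech that departs from the standard form.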

US Referenced Citations (8)

Number	Name	Date
4060694	Suzuki et al.	Nov 1977
4100370	Suzuki et al.	Jul 1978
4581756	Togawa et al.	Apr 1986
4590605	Hataoka et al.	May 1986
4625287	Matsuura et al.	Nov 1986
4665548	Kahn	May 1987
4723290	Watanabe et al.	Feb 1988
4783802	Takebayashi et al.	Nov 1988

Foreign Referenced Citations (1)

Number	Date	Country
585995	Apr 1983	JPX

Non-Patent Literature Citations (1)

Entry
Nakatsu et al., "An Acoustic Processor in a Conversational Speech Recognition System", Rev. Elect. Comm. Labs., vol. 26, No. 11-12, pp. 1486-1504, Nov. 1978.

Continuations (3)

	Number	Date
Parent	568547	Aug 1990
Parent	391685	Aug 1989
Parent	34070	Apr 1987
