This application claims the priority benefit of Taiwan application serial no. 94102596, filed on Jan. 28, 2005. All disclosure of the Taiwan application is incorporated herein by reference.
1. Field of Invention
The present invention relates to a method and apparatus for constructing new Chinese words by voice input. More particularly, the present invention relates to a method and apparatus for adding new words, by speaker-independent voice input, to a speaker-independent Chinese speech recognition system.
2. Description of Related Art
Speech recognition is an active area of research and commercial development. In speech recognition, feature parameters are extracted from the voice input and then compared with patterns in a database. The patterns with the highest likelihood are determined and output. However, speech recognition systems often need new words to be added to their vocabulary. There are two kinds of systems for adding new words in Mandarin speech recognition: keyboard-stroke-based systems and training-based systems.
Although there are existing ways of adding new words, there is still no speaker-independent system that adds new words purely by voice input. Keystrokes or collections of voice features are still needed.
A method and apparatus are provided for constructing new Chinese words by voice input and adding them to a speech recognition system, for example, a speaker-independent Chinese speech recognition system, in order to update its vocabulary database. A user-friendly interface is provided for adding new Chinese words.
In one embodiment of the invention, a method and apparatus for constructing new Chinese words by voice input are provided. A Chinese word consists of several Chinese characters/syllables. Voice signals indicating the Chinese characters/syllables are input sequentially, and feature parameters are derived from the voice signals. The feature parameters are compared with a description constraint unit to determine the corresponding characters or syllables. The characters or syllables, once confirmed by the user, are stored in a storage unit. After all characters/syllables have been input and confirmed by the user, they are combined into a new word.
In addition, the interface provided by the invention is user-friendly and speaker-independent.
It is to be understood that both the foregoing general description and the following detailed description are exemplary, and are intended to provide further explanation of the invention as claimed.
The accompanying drawings are included to provide a further understanding of the invention, and are incorporated in and constitute a part of this specification. The drawings illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.
Reference will now be made in detail to the present preferred embodiments of the invention, examples of which are illustrated in the accompanying drawings.
The voice input unit 300, for example a microphone, receives voice signals from a user and converts them into digital signals. The feature extractor 302 extracts feature parameters (or feature vectors) from the digital voice signals and outputs the feature parameters to the speech recognition module 304. The description constraint unit 306 includes acoustic models, lexical models, and language models. The speech recognition module 304 compares the feature parameters with the description constraint unit 306 and outputs the possible result(s) to the character/syllable confirmation unit 308.
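The comparison step can be pictured with a deliberately simplified sketch. It is not the recognition method of the invention: it swaps in a nearest-template ranking by Euclidean distance in place of the acoustic, lexical, and language models of the description constraint unit 306. The function name `rank_candidates`, the template dictionary, and the two-dimensional feature vectors are all hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Sequence


def rank_candidates(
    features: Sequence[float],
    templates: Dict[str, Sequence[float]],
    top_n: int = 3,
) -> List[str]:
    """Return the labels of the top_n stored patterns closest to `features`.

    Assumption: each candidate is represented by a single fixed-length
    template vector. A real recognizer would score against trained models.
    """
    scored = [(math.dist(features, vec), label) for label, vec in templates.items()]
    scored.sort()  # smallest distance first
    return [label for _, label in scored[:top_n]]


# Hypothetical stored patterns, keyed by syllable and tone.
TEMPLATES = {
    "tai2": [1.0, 0.0],
    "tai4": [0.9, 0.2],
    "wan1": [-1.0, 0.5],
}

possible_results = rank_candidates([1.0, 0.05], TEMPLATES, top_n=2)
```

The ranked list plays the role of the possible result(s) handed to the character/syllable confirmation unit 308.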
The character/syllable confirmation unit 308 displays the possible result(s) to the user, and the user then decides whether the desired result is among them. If so, the desired result is stored in the partial storage unit 310. After the character(s) of a new Chinese word have been confirmed and stored in the partial storage unit 310, the character/syllable confirmation unit 308 instructs the combination unit 312 to combine the character(s) into a new Chinese word.
If the user rejects the outputs of the character/syllable confirmation unit 308, the user may speak another description of the character/syllable into the voice input unit 300 for speech recognition and character/syllable combination. Alternatively, if the user decides to give up constructing the new Chinese word, the partial storage unit 310 is reset.
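The confirm, reject, and reset behavior described above can be sketched as a small state holder. This is only an illustrative sketch: the class `WordBuilder` and its method names are hypothetical, the user's decision is modeled as an index into the candidate list (or `None` for a rejection), and the candidate characters in the usage lines are example data rather than content from this specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class WordBuilder:
    """Hypothetical stand-in for units 308, 310, and 312."""

    # Plays the role of the partial storage unit 310.
    partial_storage: List[str] = field(default_factory=list)

    def confirm(self, candidates: Sequence[str], choice: Optional[int]) -> bool:
        """Store the candidate the user picked.

        Returns False when the user rejects every candidate, in which case
        the caller should ask for another spoken description.
        """
        if choice is None or not 0 <= choice < len(candidates):
            return False
        self.partial_storage.append(candidates[choice])
        return True

    def give_up(self) -> None:
        """Reset the partial storage when the user abandons the new word."""
        self.partial_storage.clear()

    def combine(self) -> str:
        """Join the confirmed characters/syllables into one new word."""
        return "".join(self.partial_storage)


builder = WordBuilder()
builder.confirm(["台", "抬"], 0)     # user accepts the first candidate
builder.confirm(["灣", "彎"], None)  # user rejects both and will re-describe
builder.confirm(["灣", "彎"], 0)     # second attempt is accepted
new_word = builder.combine()
```

Keeping the confirmed items in a separate list until the whole word is complete mirrors the role of the partial storage unit 310: abandoning the word only requires clearing that list.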
In step 400, the user describes the character/syllable, for example, by speaking a well-known phrase or word, by speaking the Zhuyin spelling, or by speaking the Pinyin spelling (for example, t-a-i-2).
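As an illustration of the Pinyin form of description, the following sketch turns a letter-by-letter spelling such as t-a-i-2 into a syllable and a tone number. It assumes that the recognizer has already transcribed the spoken letters and the tone digit into a hyphen-separated string, and that tones are numbered 1 to 5 (with 5 as the neutral tone); the function name `parse_spelled_pinyin` is hypothetical.

```python
from typing import Tuple

VALID_TONES = {"1", "2", "3", "4", "5"}  # assumption: 5 denotes the neutral tone


def parse_spelled_pinyin(spelling: str) -> Tuple[str, int]:
    """Split a spelling like 't-a-i-2' into ('tai', 2).

    Raises ValueError if the input is not single letters followed by a tone digit.
    """
    parts = spelling.strip().lower().split("-")
    if len(parts) < 2:
        raise ValueError(f"expected letters and a tone digit: {spelling!r}")
    *letters, tone = parts
    if tone not in VALID_TONES:
        raise ValueError(f"invalid tone: {tone!r}")
    if not all(len(ch) == 1 and ch.isalpha() for ch in letters):
        raise ValueError(f"invalid letters in: {spelling!r}")
    return "".join(letters), int(tone)


syllable, tone = parse_spelled_pinyin("t-a-i-2")  # ('tai', 2)
```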
It will be apparent to those skilled in the art that various modifications and variations can be made to the structure of the present invention without departing from the scope or spirit of the invention. In view of the foregoing descriptions, it is intended that the present invention covers modifications and variations of this invention if they fall within the scope of the following claims and their equivalents.
Number | Date | Country | Kind |
---|---|---|---|
94102596 | Jan 2005 | TW | national |