1. Field of the Invention
The present invention relates to inputting Chinese text data into a computer.
The Unicode Standard Version 3.0, The Unicode Consortium, Addison-Wesley 1991, Reading, Mass., USA.
Inputting Chinese text data into a computer is a long-standing and technically challenging problem, as evidenced by the thousands of related items that the Google search engine returns for the search key “Chinese input method”. Many commercial Chinese front-end products, such as TwinBridge and UnionWay, include various input methods to satisfy users' needs.
As in written English and many other western languages, a written Chinese paragraph consists of a string of sentences separated by punctuation symbols, and each Chinese sentence is a string of Chinese words. However, unlike in English, where each word is a string of characters from a small alphabet of only 26 letters, each Chinese word is a graphical pattern, and tens of thousands of different patterns are in use.
To input Chinese text into a computer, an encoding scheme is normally required. The scheme can be a hard-coded one, like the 4-digit telegram codes. Other schemes use the stroke structure of the Chinese words, their pronunciation, or a mixture of the two. One may consider the coding symbols of a word in an encoding scheme as an attribute or a signature of the word. To input a sentence, a user specifies the attributes of the words of the sentence. Internally, the computer searches for the best match. If there are multiple choices, they are presented to the user for selection.
Most early Chinese input methods are word based, in the sense that a user types in the encodings of the words one by one and generates the words one by one. Many recent methods use phrase or context information to improve accuracy and to speed up the production of the right words of a sentence.
In recent years two technologies have matured and provide avenues other than a keyboard for inputting Chinese text data. One is handwriting recognition technology; the other is speech recognition technology. These approaches are basically still a matching process, with the attributes of the words extracted by the computer from the traces of the writing data or the speech sampling data.
Different input methods have technical advantages and disadvantages that depend on the technologies they employ. For example, the input speed of a stroke-structure-based encoding scheme may be very fast once a user becomes proficient in its use. The initial learning curve is normally very steep, however, because the user needs to learn a non-trivial new skill. An input method using a keyboard is fast and highly accurate because a keyboard is designed to be operated by the fingers of both hands. People ordinarily find input methods based on handwriting recognition or speech recognition natural to use because they learned to speak and write in their early years. An input method based on handwriting recognition is natural to use, but it is less accurate than using a keyboard. Writing with a pen is also inherently slow, because each Chinese word requires many strokes.
Speech recognition is a very promising technology for Chinese text input. Speaking is very natural. When people talk in a distinct voice and at a moderate speed, a moderate recognition rate can be achieved. In an application domain where a large number of phrases are in use, however, the accuracy rate usually drops sharply. Furthermore, speech recognition technology is very sensitive to the working environment. It is intrusive to others when used in a shared office, and in noisy surroundings the accuracy rate also drops sharply.
A soft keyboard (or virtual keyboard) is yet another device used to input text data. The idea is to draw a keyboard on the screen so that a user can use the mouse or a pen to activate all events to simulate the real keyboard operations. The advantages of using a soft keyboard are the following. 1. A mouse or a pen is easy to move and click, quiet to operate, and it occupies only one hand. 2. The soft keyboard provides a visual user feedback which makes the operations highly accurate. 3. The soft keyboard can be used without a real keyboard and is suitable to a PDA or a tablet PC working environment.
There are drawbacks in using a soft keyboard to implement an input method, however. The difficulty arises because a typical implementation of a soft keyboard copies onto the screen the exact layout of a real keyboard, which was designed to allow a user to type without looking, using the sense of the relative positions of the fingers. On a soft keyboard, this sense does not apply. A user needs to visually search for the key every time he types, which slows down soft typing. For this reason, a soft keyboard is ordinarily considered a supplementary tool and is used only casually.
In recent years, soft keyboards with non-conventional layout designs have been proposed for entering text into hand-held devices such as a PDA. To avoid confusion, we will refer to a soft keyboard with a non-conventional layout design as a soft keypad or keypad. In general, an input method using a keypad requires the following four capabilities to make it suitable for use on a hand-held device:
Although most of the proposed keypad input methods emphasize their usability on hand-held devices, they are also usable on any computer with a pointing device and a larger screen. A text input method for a PDA can be immediately applied on a tablet computer, where a pen is used as the pointing device. Even on a PC or a workstation, a keypad input method will be competitive for text input if the input process can be made efficient and easy to use. Furthermore, because graphical user interface design is highly flexible, screen keypads are an ideal glue for integrating various technologies to provide text input service.
To design a keypad for a specific language, one needs to address the particular difficulties that the language poses and take advantage of the special properties that the language possesses. The objective of this invention is to provide a method to perform Chinese text input by specifying word phonetic information. In the following we first describe the chief constituents of this invention and the major issues it deals with. The details of how the problems are solved are described in later sections.
Methods and systems consistent with the present invention, as embodied and broadly described herein, provide a Zhu-Yin (BoPoMoFo) Keypad on the screen to allow a user to enter key codes. The key codes are used by the method or the system to find Chinese words or phrases, and the results are then presented to the user for further examination and selection.
Methods and systems consistent with the present invention, as embodied and broadly described herein, provide a Pin-Yin Keypad on the screen to allow a user to enter key codes. The key codes are used by the method or the system to find Chinese words or phrases, and the results are then presented to the user for further examination and selection.
Methods and systems consistent with the present invention, as embodied and broadly described herein, provide a cascade Multi-Window to present candidate words and phrases to allow a user to browse on the windows and select desired words or phrases.
Methods and systems consistent with the present invention, as embodied and broadly described herein, provide Two-Level Refining Control Windows to allow a user to browse on the control windows to enter phonetic symbol strings by efficient mouse operations.
Methods and systems consistent with the present invention, as embodied and broadly described herein, use a frequency-based scheme to choose the phrases presented to a user for selection. Phrases are classified into four classes: the most frequently used, very frequently used, commonly used, and rarely used. The phrases in each class are presented to the user in different stages to speed up the text input process.
Methods and systems consistent with the present invention, as embodied and broadly described herein, implement a two-phase input procedure to allow a user to enter Chinese text data. The first phase is the key-in phase. The second phase is the refining phase. Both phases use a one-way scanning process on the Sentence Editing Buffer.
Methods and systems consistent with the present invention, as embodied and broadly described herein, implement an architecture where the selected phrases flow from the cascade Multi-Window to the sentence buffer and then to a Text Accumulation Window. Control valves from the cascade Multi-Window, Sentence Editing Buffer, and the Text Accumulation Window to an application program allow the selection of a data flow path from the system to an application.
This summary and the following description of the invention should not restrict the scope of the claimed invention. Both provide examples and explanations to enable others to practice the invention. The accompanying drawings, which form part of the description of the invention, show several embodiments of the invention, and together with the description, explain the principles of the invention.
In the Figures:
The following notations and conventions will be used in the descriptions of this invention.
For example, “” is pronounced as “”, and “” is pronounced as “4”. The pronunciation of “” is represented as “4”.
The Zhu-Yin phonetic symbols will be classified into a consonant set (C-set), a transition vowel set (H-set), and a vowel set (V-set). The following are the lists of the C, H, and V sets.
Tonal symbols {0 1 2 3 4} will be referred to as the T-set. The T-set can also be represented as { }.
A standard Chinese word pronunciation can be represented by a string of four symbols, one taken from each of the C, H, V, and T sets. For example, “” is pronounced as “”. In some cases one or two phonetic symbol components may be missing. For example, “” is pronounced as “4” with the vowel component missing.
A blank symbol has been added to each of the C, H, and V sets to specify missing components.
There are about 1400 valid word pronunciations for Mandarin Chinese. Their Zhu-Yin representations can be grouped in the order of C, H, V, and T sequence and organized as a lexical structure tree, as shown in
For example, “” is pronounced as “ZHUAN1”, and “” is pronounced as “LI4”. The pronunciation of “” is represented as “ZHUAN1_LI4”.
There is a 1-1 mapping between the set of valid Zhu-Yin phonetic symbol strings and the set of valid Pin-Yin phonetic symbol strings. For example, “” is mapped to “ZHUAN” and “” is mapped to “LI”. The Pin-Yin representations can also be organized as a lexical structure tree, as shown in
The following description of embodiments of this invention refers to the accompanying drawings. Where appropriate, the same reference numbers in different drawings refer to the same or similar elements.
Methods and systems consistent with the present invention, as embodied and broadly described herein, provide a platform and a method to allow a user to input Chinese text data. The platform and the method can be used either in text generation, or in text editing, or in specifying queries in text retrieval or other application programs.
The Soft Keypad is the place where a user keys in phonetic symbol strings of the words of Chinese sentences. A user may also use it to control the selection of phrases. The cascade Multi-Window displays candidate words or phrases on buttons for a user to browse and click to select. The Sentence Editing Buffer displays the sentence that is currently being composed. The Attribute Viewing Window displays the phonetic string of a selected word in the Sentence Editing Buffer. The Text Accumulation Window is the pool to collect sentences generated by the user. The Two-Level Refining Control Window provides the mechanism to allow a user to enter key information through efficient mouse operations.
In actual use, the Two-Level Refining Control Window may show on top of the Keypad to save space, as shown in
The user selects words or phrases from the cascade browsing Multi-Window 745. The selected phrase will flow to the Sentence Editing Buffer 755, which is a window with a fixed number of keys to display and manipulate the sentence that is currently being composed. Phonetic symbol strings also flow into the Sentence Editing Buffer 755. When the Sentence Editing Buffer is full, or a punctuation symbol has been entered, the word strings in the buffer will flow to the Text Accumulation Window 725.
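By way of illustration only, the flow from the Sentence Editing Buffer to the Text Accumulation Window described above may be sketched as follows. The class name `SentencePipeline`, the capacity of 8 keys, and the punctuation set are assumptions introduced for this sketch and are not taken from the described system.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed punctuation set; the actual set used by the system is not specified.
PUNCTUATION = {"，", "。", "？", "！"}


@dataclass
class SentencePipeline:
    """Sentence Editing Buffer that drains into a Text Accumulation Window."""
    capacity: int = 8                                  # assumed number of buffer keys
    buffer: List[str] = field(default_factory=list)    # sentence being composed
    accumulated: List[str] = field(default_factory=list)

    def push(self, item: str) -> None:
        """Add one word or punctuation symbol, flushing when the rule fires."""
        self.buffer.append(item)
        if item in PUNCTUATION or len(self.buffer) >= self.capacity:
            self.flush()

    def flush(self) -> None:
        """Move the buffered word string to the accumulation window."""
        if self.buffer:
            self.accumulated.append("".join(self.buffer))
            self.buffer.clear()
```

In this sketch a flush is triggered by either of the two conditions named in the text: a full buffer or the entry of a punctuation symbol.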
There are data flow control valves 715, 720, and 750 that control the data flow from the Cascade Browsing windows, Sentence Editing Buffer, and the Text Accumulation Window to an application program 710. The control valves can be opened and closed by control buttons.
The R-1 Refining Control window panel 731 is controlled by the Soft Keypad 740 in the sense that its content is determined by the phonetic symbol strings entered from the Soft Keypad. Similarly, the R-2 Refining Control window panel 732 is controlled by the R-1 Refining Control window panel 731. The content of the Cascade Multi-Window 745 is changed dynamically according to information keyed in by the mouse operations on the Soft Keypad 740, the R-1 Refining window panel 731, and the R-2 Refining window panel 732.
Zhu-Yin Keypad
” to “
”, five tonal keys, three blank keys
for the C, H, and V sets, and several function keys.
The keypad design is based on the following considerations:
The partition of the keys into sections and groups, the order of groups within each section, and the order of keys within each group provide a user with a simple sense of the locations of the keys. This sense, together with the much reduced number of keys, enables a user to find a key on the keypad at a glance.
Pin-Yin Keypad
The keypad design is based on the following considerations:
The grouping of keys, the order of groups on the Keypad, and the order of keys within each group provide a user with a simple sense of the locations of the keys. This sense, together with the much reduced number of keys, enables a user to find a key on the keypad at a glance.
Function Keys
The function keys used in this system are described below. Details of some of their functions will be further explained in later sections.
When a Chinese input method is used to generate text data, a set of phrases often needs to be presented to the user for selection. A traditional design is to present the phrase candidates in a small one-dimensional window. Assuming that a one-dimensional window can fit 10 candidate phrases, the system will first present 10 candidate phrases. If the user cannot find the desired phrase among the first 10, he needs to use a control button to get the next 10 for examination, and so on. This process becomes very tedious when there are a large number of candidates to examine.
A two-dimensional window is sometimes used for phrase presentation in existing systems. In this design, phrases are fitted row by row into a rectangular window. If the user cannot find the desired phrase in the window, he can use a control button to load the next group of candidates into the window for examination. Ordinarily the width of a two-dimensional window is much smaller than that of a one-dimensional window to avoid blocking the application program. Therefore, extra mouse clicking operations are still required to find a phrase.
The phrase searching task can be made much more efficient and easier by the cascade windows 1010 shown in
This design of the Multi-Window has the following advantages.
1. The Multi-Window extends the traditional one-dimensional or two-dimensional windows for phrase presentation to three dimensions, in the sense that it contains two-dimensional windows and also has a third dimension, the depth. This makes it capable of presenting a great number of candidates. Assuming that each window page of the Multi-Window is of size 10 (words) by 10 (words), and that phrases are fitted into the page from left to right, and then from top to bottom without wrapping around in the phrases,
2. At any time the window page at the front is surrounded by the L-shaped and inverted L-shaped exposed portions of its neighboring pages. When the mouse cursor moves across these L-shaped or inverted L-shaped regions, the windows underneath are brought to the front. This provides an easy way to browse the window pages sequentially in both the ascending and descending directions of the cascade, using only mouse move operations.
3. Every window page that is not at the front has an L-shaped or inverted L-shaped exposed portion visible to the user. This gives the user a sense of what is contained in a covered page.
When the words within a sentence are selected and determined, they are shown on the tops of the keys of the Sentence Editing Buffer. In case the phonetic symbol string is not enough to make a good guess, the first phonetic symbol of each word will be shown on top of the keys.
For example, the phonetic symbol strings of “” is “SHEN1_QING3_ZHUAN1_LI4”. If “
” is entered as a phrase selected from the Multi-Window, and “
” is keyed in as a series of phonetic symbol strings “ZH LI”, the character string “
” will be shown in the Sentence Editing Buffer, where “Z” and “L” are temporarily used to represent the two words “
” and “
”.
The input method of this invention distinguishes two key-in modes: Manual-Control Key-in mode and Automatic-Firing Key-in mode. These key-in modes are selected by the function button.
Manual-Control Key-In Mode
1. Zhu-Yin System:
In the Manual-Control key-in mode, the user manually selects the current focus word by clicking the mouse on the keys in the Sentence Editing Buffer. He has full control over selecting the C, H, V, and T components of the focus word. He can select and de-select the phonetic symbols by clicking the keys in the C, H, V, and T sections.
2. Pin-Yin System:
In the Manual-Control key-in mode, the user continuously keys in the phonetic symbol string of a word by clicking the phonetic keys and tonal keys in the keypad. The system automatically advances to the next word only at the end of a phonetic symbol string that is not the leading string of any other valid phonetic symbol string. For example, after the user has entered “BIN”, the system does not advance to the next word because “BIN” is the leading string of another valid phonetic symbol string, “BING”. After the user has keyed in “BING”, however, the system advances to the next word because “BING” is not the leading string of any other valid phonetic symbol string. The user can use the function button to interrupt the keying of the current string and advance to the next word.
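The advance rule can be illustrated with a minimal sketch. The function name `should_auto_advance` and the small set `VALID_STRINGS` are hypothetical; a real system would use the full table of valid Pin-Yin strings.

```python
from typing import Set

# Hypothetical subset of valid Pin-Yin strings, for illustration only.
VALID_STRINGS: Set[str] = {"BI", "BIN", "BING", "BA", "BAN", "BANG"}


def should_auto_advance(entered: str, valid: Set[str]) -> bool:
    """Return True when the focus may move to the next word.

    The entered string must itself be valid and must not be the leading
    string (proper prefix) of any other valid string.
    """
    if entered not in valid:
        return False
    return not any(s != entered and s.startswith(entered) for s in valid)
```

With this subset, “BIN” does not trigger an advance because “BING” extends it, whereas “BING” does.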
Automatic-Firing Key-In Mode
In the Automatic-Firing key-in mode, the user uses a specially designed sequence of mouse operations, called the PTR operation, to enter the phonetic symbol string of a word. PTR represents the sequence of three mouse operations: (1) press a first key, (2) touch a second key, and (3) release on a third key. At the end of a PTR operation, the system automatically advances to the next word in the Sentence Editing Buffer and waits for the user's next PTR mouse operation. In all the discussions that follow, the system is assumed to be set in Automatic-Firing Key-in mode.
A special case of the Automatic-Firing mode is called the Rapid-Firing mode, in which the user enters only the first phonetic symbol of each word of a sentence during the key-in phase of the Two-Phase Sentence Generation Procedure.
The input method of this invention allows a user to key-in either the complete phonetic symbol string representation of a word, or a partial heading string of the complete string. By entering a phonetic symbol string of a word into the system, a user actually specifies the phonetic components of the pronunciation of that word, which in turn constrain the set of valid candidate words to choose. For example, as shown in 1” has been entered, the set of possible Chinese words are {
,
, . . . ,
}. However, if only “
”has been specified, the set of possible Chinese words will be those words that have its C, H, and V components matched with the symbols “
”, “
”, and “
” respectively. That set is all the Chinese words contained in the sub-tree of the branch of “
”, shown in
Press-Touch-Release (PTR) Mouse Operation
A Press-Touch-Release (PTR) sequence of mouse operations has been designed to allow a user to efficiently select the C, H, and V phonetic components of a Chinese word. A standard PTR sequence consists of the following mouse operations. 1. Press a consonant key to select the consonant symbol. 2. Move the cursor to touch a transition vowel key to select a transition vowel symbol. 3. Release the mouse on a vowel key to select a vowel symbol.
For example, to select phonetic symbols “”, “
”, and “
” (
” in the keypad. 2. Moves mouse cursor to touch the key “
” to select. 3. Moves the cursor to the key “
” and releases the mouse button for its selection. Conceptually, a PTR operation traverses on level-1 to level-3 on a lexical structure tree (
A user can specify only the leading portion of a phonetic string by releasing the mouse button on keys in the C section or the H section. For example, if the user presses the key “”, then touches “
”, and then releases the mouse button, he has effectively entered the string “
”, which is the leading string of “
”.
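As a hedged illustration of how a leading string constrains the candidate set to a sub-tree of the lexical structure tree, the following sketch builds a small tree in C, H, V, T order and collects every word beneath a partial path. The four dictionary entries and the function names are assumptions chosen for the example; an empty string stands for a blank component.

```python
from typing import Dict, List, Sequence, Tuple, Union

BLANK = ""  # stands for a missing component

# Hypothetical mini-dictionary: (C, H, V, T) path -> words with that pronunciation.
ENTRIES: Dict[Tuple[str, str, str, int], List[str]] = {
    ("ㄓ", "ㄨ", "ㄢ", 1): ["專", "磚"],
    ("ㄓ", "ㄨ", "ㄥ", 1): ["中", "鐘"],
    ("ㄓ", "ㄨ", BLANK, 3): ["主"],
    ("ㄌ", "ㄧ", BLANK, 4): ["利", "力"],
}

Node = Union[dict, list]


def build_tree(entries: Dict[Tuple[str, str, str, int], List[str]]) -> dict:
    """Nest the entries level by level: C, then H, then V, then T."""
    root: dict = {}
    for path, words in entries.items():
        node = root
        for symbol in path[:-1]:
            node = node.setdefault(symbol, {})
        node.setdefault(path[-1], []).extend(words)
    return root


def collect(node: Node) -> List[str]:
    """Gather all words stored in the sub-tree rooted at node."""
    if isinstance(node, list):
        return list(node)
    words: List[str] = []
    for child in node.values():
        words.extend(collect(child))
    return words


def candidates(tree: dict, prefix: Sequence) -> List[str]:
    """Return the words in the sub-tree reached by a (possibly partial) path."""
    node: Node = tree
    for symbol in prefix:
        if not isinstance(node, dict) or symbol not in node:
            return []
        node = node[symbol]
    return collect(node)


TREE = build_tree(ENTRIES)
```

In this sketch, `candidates(TREE, ("ㄓ", "ㄨ"))` returns every word under that branch, while a full four-symbol path narrows the result to the words sharing exactly that pronunciation.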
Some pronunciations may have the C or H components missing. In those cases, an implication rule is useful to fill in blanks. For example, if the user at the beginning of a PTR operation presses the key “”, which is a symbol in the H-set, the C component must be a blank. Similarly, if the user presses the key “
” at the beginning of a PTR operation, which is a symbol in the V-set, both the C and H components must be blanks.
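The implication rule can be sketched as a slot-filling step: each key is assigned to its own section, and any earlier section that was skipped is filled with a blank. The sets below are partial and are assumptions made for this example, as are the name `fill_blanks` and the use of “□” as the blank marker.

```python
from typing import List, Tuple

BLANK = "□"

# Partial symbol sets, assumed for illustration only.
C_SET = {"ㄓ", "ㄌ", "ㄕ", "ㄑ"}
H_SET = {"ㄧ", "ㄨ", "ㄩ"}
V_SET = {"ㄚ", "ㄢ", "ㄣ", "ㄥ"}

SLOT_ORDER = ["C", "H", "V"]


def fill_blanks(keys: List[str]) -> Tuple[str, str, str]:
    """Map the keys of one PTR operation to (C, H, V), blanking skipped slots."""
    slots = {"C": BLANK, "H": BLANK, "V": BLANK}
    last_index = -1
    for key in keys:
        if key in C_SET:
            slot = "C"
        elif key in H_SET:
            slot = "H"
        elif key in V_SET:
            slot = "V"
        else:
            raise ValueError(f"unknown key: {key!r}")
        index = SLOT_ORDER.index(slot)
        if index <= last_index:
            # Keys must be entered in C, H, V order, one per section at most.
            raise ValueError(f"key {key!r} is out of C-H-V order")
        slots[slot] = key
        last_index = index
    return slots["C"], slots["H"], slots["V"]
```

For instance, `fill_blanks(["ㄨ", "ㄢ"])` yields a blank C component, and `fill_blanks(["ㄢ"])` yields blanks for both earlier components. A slot left blank after the last key is simply unspecified, which corresponds to entering only a leading string.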
The grouping and the placement of the phonetic keys on the keypad (
The system provides the following two further aids to make the PTR operation even simpler.
The PTR operation can be made completely retractable at selecting the C, H, and V components of the pronunciation of a word, as shown in the flow chart of
When the mouse button is released on a valid C or H or V key, one PTR operation is finished. At that instant, the R-1 and R-2 will disappear and the Keypad will be reset to its original state.
Entering a Tonal Symbol
After a PTR operation, if the user desires, he can select a tonal symbol by touching the cursor with the tonal key (
Entering a Chinese Word
The cascade Multi-Window will also show matched Chinese word candidates for the current focus word in the Sentence Editing Buffer if the C, H, and V components entered for the focus word form a valid phonetic symbol string. The user can move the mouse cursor to the Multi-Window and press on one of the candidate words, which effectively selects that Chinese word. The Multi-Window will then recalculate to show only phrases that also match the selected Chinese word. With the mouse button still down, the user can browse the new set of phrase candidates and release the button on a desired phrase to select it.
Mouse operations on the Pin-Yin Keypad can be designed similar to that for Zhu-Yin system although the lexical structures of the two systems are somewhat different (
String Partition
There is a 1-1 correspondence between the set of phonetic symbol strings in the Pin-Yin system and that of the Zhu-Yin system. To implement the PTR operation for the Pin-Yin system, we partition every Pin-Yin phonetic symbol string into three segments: the first string (ζ), the second string (σ), and the third (tail) string (τ). They are described below.
For example, the set of descendants of the first string “B” is {A, E, I, O, U}, indicating that one of the symbols in {A, E, I, O, U} may follow the string “B” in a phonetic symbol string in the Pin-Yin system. Therefore, {A, E, I, O, U} is the set of the second strings of “B”.
An exception to the above rule is that if symbol “a” is a first string, and string “ab” is also a first string, then symbol “b” is excluded from the set of second strings of string “a”. For example, since both “Z” and “ZH” are first strings, “H” is excluded from the set of the second strings of “Z”. Therefore, the set of the second strings of “Z” is {A, E, I, O, U} (
With a proper choice of the first strings in statement 1, most first strings have {A, E, I, O, U} as their set of second strings. In case the first string itself is already a valid phonetic string, a blank symbol is added to the set of second strings. For example, “JU” can be followed by one of the symbols in {A, E, N}. Since “JU” itself is also a valid phonetic string, the set of the second strings of “JU” is {A, E, N, □}.
The maximum size of the set of the second strings of a first string in the Pin-Yin system is 6.
For example, the three strings that can follow the first string “JI” and the second string “A” are “NG”, “AN”, and “AO”. “JIA” itself is also a valid string. Therefore, the set of the third strings for “JI” and “A” is {NG, AN, AO, □}, where □ represents a blank third string.
The maximum size of the set of the third strings is 9 in the Pin-Yin system.
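The derivation of the set of second strings, including the exclusion rule and the added blank, can be sketched as below. The sketch covers second strings only. The lists `FIRST_STRINGS` and `VALID_STRINGS` are small hypothetical samples, and `second_strings` is a name introduced here for illustration.

```python
from typing import Set

BLANK = "□"

# Hypothetical samples; the full tables of the system are not reproduced here.
FIRST_STRINGS: Set[str] = {"B", "Z", "ZH", "JU"}
VALID_STRINGS: Set[str] = {
    "BA", "BAN", "BEI", "BI", "BO", "BU",
    "ZA", "ZE", "ZI", "ZONG", "ZU",
    "ZHA", "ZHUAN",
    "JU", "JUAN", "JUE", "JUN",
}


def second_strings(first: str, first_strings: Set[str], valid: Set[str]) -> Set[str]:
    """Collect the symbols that may directly follow a first string."""
    result: Set[str] = set()
    for s in valid:
        if s.startswith(first) and len(s) > len(first):
            nxt = s[len(first)]
            # Exclusion rule: skip a symbol that merely extends this first
            # string into another first string (e.g. "H" after "Z").
            if first + nxt in first_strings:
                continue
            result.add(nxt)
    # A first string that is itself valid gets a blank second string.
    if first in valid:
        result.add(BLANK)
    return result
```

With these samples, “Z” yields {A, E, I, O, U} because “H” is excluded by the presence of the first string “ZH”, and “JU” yields {A, E, N, □}.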
Press-Touch-Release (PTR) Mouse Operation
A user can also apply the Press-Touch-Release (PTR) sequence of mouse operations on the Pin-Yin keypad. A standard PTR sequence consists of the following mouse operations.
Conceptually, a PTR operation also traverses on the lexical structure tree of the Pin-Yin system (
As with the Zhu-Yin case, the PTR operation can also be made completely retractable at selecting the heading symbol of ζ, and selecting σ, and τ strings, as shown by the flow chart in
A Frequency-Based Phrase Classification Strategy
Phrases that match the phonetic information keyed in by a user are collected from the system phrase tables. They are presented in the Multi-Window for the user to browse and select. This is done in both the text key-in phase and the editing phase of the Two-Phase Sentence Generation Procedure.
A frequency-based classification strategy is utilized in the design of the phrase selection.
The most frequently used phrase set is the default phrase set to be displayed in the Multi-Window. The very frequently used phrases are classified according to the first symbol of the phonetic symbol string of the first word of each phrase. When the user moves the mouse cursor over a key on the Keypad, the subset of the very frequently used phrases associated with that key is shown in the Multi-Window. The user can move the cursor on the Keypad to preview the very frequently used phrases associated with each key. When he sees the desired phrase set, he presses that key to hold the current phrase set in the Multi-Window. He can then move the cursor to the Multi-Window for browsing and selection.
For example, assume that the system is now in the Key-In mode. When the user moves the mouse cursor over the key “”, all the very frequently used phrases that are associated with “
” will be shown in the Multi-Window. When the user moves the mouse cursor off key “
”, he will see the default most frequently used phrases in the Multi-Window again.
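The hover-to-preview and press-to-hold behavior may be sketched as follows. The class `PhrasePreview`, its method names, and the sample phrases are assumptions made for this example only.

```python
from typing import Dict, List, Optional


class PhrasePreview:
    """Decides which phrase set the Multi-Window shows."""

    def __init__(self, most_frequent: List[str],
                 very_frequent: Dict[str, List[str]]) -> None:
        self.most_frequent = most_frequent    # default phrase set
        self.very_frequent = very_frequent    # keyed by first phonetic symbol
        self.held_key: Optional[str] = None   # key pressed to hold a set

    def press(self, key: str) -> None:
        """Hold the phrase set of the pressed key in the Multi-Window."""
        self.held_key = key

    def clear_hold(self) -> None:
        """Drop the held set, returning to hover-driven display."""
        self.held_key = None

    def shown(self, hovered_key: Optional[str]) -> List[str]:
        """Return the phrases displayed for the current cursor position."""
        key = self.held_key if self.held_key is not None else hovered_key
        if key is None:
            return self.most_frequent
        return self.very_frequent.get(key, [])


# Hypothetical sample data.
PREVIEW = PhrasePreview(
    most_frequent=["我們", "可以"],
    very_frequent={"ㄓ": ["專利", "中文"], "ㄕ": ["申請"]},
)
```

Calling `PREVIEW.shown("ㄓ")` gives the set associated with that key, and `PREVIEW.shown(None)` falls back to the default set, mirroring the cursor moving off the key.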
When the phonetic information of more than one word has been entered, normally a few pages of the Multi-Window will be sufficient to display all the matched commonly used phrases, even in the natural Chinese text writing application domain. The more information entered, the smaller the matched phrase set will be. The Multi-Window is designed to display longer phrases first, followed by shorter ones, because a longer matched phrase has a better chance of being the one that the user desires. ”, “
”, and “
” of the words of a phrase. For example, the Zhu-Yin phonetic symbol strings of the phrase “
” 2710 is “
”. It matches with the three C component symbols “
”. On the other hand, the Zhu-Yin phonetic symbol strings of the phrase “
” 2720 is “
”. It also matches with the first two symbols “
”. Both of these two phrases are shown in the cascade Multi-Window of
The design philosophy of “Phrases that are more frequently used should require less effort to find” has been applied here. The most frequently used phrases are the default phrase set, so a user can go directly to the Multi-Window to find them without any mouse operations on the Keypad. For a very frequently used phrase, he needs to browse and press a key on the Keypad and then go to the Multi-Window. To enter a phrase beyond the most frequently used and very frequently used sets, the user needs to key in the phonetic symbol strings of more than one word.
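One possible reading of the matching of phrases against several entered leading strings, with longer phrases listed first, is sketched below using Pin-Yin strings. The phrase table, the function name `match_phrases`, and the assumption that matching starts at the first entered word are all introduced for illustration.

```python
from typing import List, Tuple

# Hypothetical phrase table: (phrase, Pin-Yin string of each word).
PHRASE_TABLE: List[Tuple[str, List[str]]] = [
    ("申請", ["SHEN", "QING"]),
    ("專利", ["ZHUAN", "LI"]),
    ("申請專利", ["SHEN", "QING", "ZHUAN", "LI"]),
    ("生氣", ["SHENG", "QI"]),
]


def match_phrases(entered: List[str],
                  table: List[Tuple[str, List[str]]]) -> List[str]:
    """Return phrases whose words begin with the entered leading strings.

    A phrase of n words is compared with the first n entered strings, so a
    shorter phrase can match the beginning of a longer entry.
    """
    hits = [
        phrase
        for phrase, syllables in table
        if len(syllables) <= len(entered)
        and all(s.startswith(e) for s, e in zip(syllables, entered))
    ]
    # Longer phrases first; the stable sort keeps table order among equals.
    hits.sort(key=len, reverse=True)
    return hits
```

Entering the leading strings `["SH", "Q", "ZH", "L"]` in this sketch lists the four-word phrase ahead of the two-word phrases that match only the first two strings.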
This invention provides a flexible text key-in procedure to allow a user to key in Chinese text data by words, by phrases, or by sentences. This text key-in process will be described by referring to a diagram (” has been shown in
” and “
” are two phrases provided in the system phrase table but not “
”.
Key in a Word by Phonetic Information
Here we show how to key in a Chinese word by specifying its phonetic symbol string.
The system will always have a focus word in the Sentence Editing Buffer in the text generating process. During the key-in phase, the focus word is the last word of the current sentence being composed in the Sentence Editing Buffer. During the editing phase, the focus word is determined by the user where he intends to resume the phonetic information entering task.
The system will show Chinese word candidates at the focusing point only when the phonetic symbol string entered at the focus word location represents a valid phonetic symbol string. For example, if “” is the string entered, the Multi-Window will not show any words corresponding to this string because “” is not a valid phonetic symbol string. On the other hand, if “” is the string that has been entered, the Multi-Window will show the set of all words that are pronounced as “” with any tone, but not any word with an additional vowel, such as “” of “”.
“” can be keyed in word-by-word in the steps shown in
For example, the user uses the following steps in ”.
” and “
” to compose the sentence “
” by specifying its phonetic symbol strings. Again a PTR operation is used to specify a leading phonetic symbol string for each word in the sentence.
For example, the user uses the steps including the following in ”
Comparing
Key in Sentence-by-Sentence
Since every Chinese sentence is a string of words and phrases, a person who is proficient in either the Zhu-Yin or the Pin-Yin phonetic system should be able to key in words and phrases to compose sentences by specifying their phonetic information. In the process, he may still encounter the following two problems:
For example, assume that the system phrase table contains all three phrases “”, “”, and “”, and that the user wants to key in a sentence containing the word string “”. If the user starts looking for the phrase “” after he has keyed in the phonetic information for the two words “” and “”, he will miss the opportunity to obtain the combined phrase “”, which requires less information to key in.
The Two-Phase Sentence Generation Procedure is designed in this invention to deal with the above two problems. A user iteratively goes through a Key-in phase and an Editing phase to generate sentences. In the Key-in phase, the user sequentially keys in Chinese words and phrases to compose the sentence. The words and phrases may be selected from the most frequently used and very frequently used sets. Leading strings of valid phonetic symbol strings can be entered as placeholders for words that have not yet been determined. In the Editing phase, PTR operations can be resumed on words whose Chinese character information has not yet been designated.
”. In the example the following assumptions are made for the purpose of showing various input situations.
Rapid-Firing key-in strategy will be used in the example, i.e., the user will click to key in the first phonetic symbol of each word of the sentence in the key-in phase.
Explanation of Each Step in
Steps 1 to 9 apply the Rapid-Firing key-in strategy and specify the first phonetic symbol of each word in the sentence. They repeatedly go through the loop of 3130 and 3120 in
Here it is assumed that “” has been shown on keys #1 to #4 in the Sentence Editing Buffer and that it is the unique longest phrase that matches the sequence of heading strings “”. This action tells the system that the words in entries #1 to #4 are already the desired ones, and the PTR operation can resume at word 5.
In the following we summarize the properties of the Two-Phase Sentence Generation Procedure.
1. The keypads and key panels are designed based on the lexical structure of the symbol strings of the Zhu-Yin and Pin-Yin phonetic systems. The design allows easy key locating and efficient mouse operation for entering phonetic information.
2. A specially designed window called the Multi-Window, with multiple window pages, is used to present candidate words or phrases. The multiple pages can present a great many words and phrases. The special layout design of the pages and the functionality implemented allow a user to browse the pages sequentially in both the ascending and descending directions without mouse clicking operations.
3. A five-step refinement scheme is designed to allow a user to adaptively refine his specification of a word by phonetic symbol string (
4. The Two-Phase Sentence Generation Procedure is frequency-based. System provided phrases are classified into most frequently used, very frequently used, commonly used, and rarely used classes. The design philosophy of “Phrases that are more frequently used should require less effort to find,” has been applied.
5. A user iteratively goes through a Key-in phase and an Editing phase to generate sentences. In the Key-in phase, he may key in words and phrases to compose a sentence. He may also key in phonetic symbol strings in the key-in phase and wait until the editing phase to further reduce the size of candidate words and phrases to perform the selection.
6. Both the Key-in phase and the Editing phase use an easy-to-follow one-way scanning process on the Sentence Editing Buffer.
7. Dividing the input process into two phases relieves a user from the burden of segmenting a sentence into component words and phrases. It has also created a way to harvest system supplied longer generalized phrases.
While what are presently considered to be preferred embodiments and methods of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes and modifications may be made, and equivalents may be substituted for elements thereof, without departing from the true scope of the invention.
Although Chinese language and the Zhu-Yin and the Pin-Yin phonetic systems have been used in the discussions, it will be understood by those skilled in the art that the techniques of the present invention can also be applied to languages other than Chinese language and phonetic systems other than the Zhu-Yin and the Pin-Yin systems. It is intended that this invention not be limited to Chinese language and the Zhu-Yin and the Pin-Yin systems.
In addition, many modifications may be made to adapt a particular element, technique or implementation to the teachings of the present invention without departing from the central scope of the invention. Therefore, it is intended that this invention not be limited to the particular embodiments and methods disclosed herein, but that the invention include all embodiments falling within the scope of the appended claims.