The present invention is directed to the entry of composite characters. In particular, the present invention facilitates the entry of words or characters into communications or computing devices by combining manual user input and speech recognition to narrowly tailor lists of candidate words or characters.
Mobile communication and computing devices that are capable of performing a wide variety of functions are now available. Increasingly, such functions require or can benefit from the entry of text. For example, text messaging services used in connection with cellular telephones are now in widespread use. As a further example, portable devices are increasingly used in connection with email applications. However, the space available on portable devices for keyboards is extremely limited. Therefore, the entry of text into such devices can be difficult. In addition, the symbols used by certain languages can be difficult to input, even in connection with larger desktop communication or computing devices.
In order to facilitate the entry of words or characters, particularly using the limited keypad of a portable telephone or other device, autocompletion features are available. Such features can display a list of candidate words or characters to the user in response to receiving an initial set of inputs from a user. These inputs may include specification of the first few letters of a word, or the first few strokes of a character, such as a Chinese character. However, because the resulting list can be extremely long, it can be difficult for a user to quickly locate the desired word or character.
In order to address the problem of having a long list of autocomplete candidates, systems are available that provide a list in which the candidate words or characters are ranked according to their frequency of use. Ranking the candidates according to their frequency of use can reduce the need for the user to scroll through the entire list of candidates. However, it can be difficult to order a list of candidate words or characters in a sensible fashion. In addition, where the user is seeking an unusual word or character, little or no time savings may be realized.
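The frequency-ranking approach described above can be illustrated with a minimal sketch. The word list, the frequency counts and the function name `ranked_candidates` are assumptions invented for illustration; they are not taken from any particular prior art system.

```python
from typing import Dict, List

# Hypothetical frequency table; the counts are invented for illustration.
WORD_FREQUENCY: Dict[str, int] = {
    "the": 5000,
    "there": 1200,
    "then": 900,
    "theory": 150,
    "thesaurus": 3,
}


def ranked_candidates(prefix: str, frequencies: Dict[str, int]) -> List[str]:
    """Return the words that start with `prefix`, most frequently used first."""
    matches = [word for word in frequencies if word.startswith(prefix)]
    # Sort by descending frequency; ties are broken alphabetically.
    return sorted(matches, key=lambda word: (-frequencies[word], word))


candidates = ranked_candidates("the", WORD_FREQUENCY)
print(candidates)
```

In this sketch an unusual word such as "thesaurus" falls to the end of the list, which is the limitation noted above: a user seeking it must still scroll past every more common candidate.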
As an alternative to requiring manual input from a user, voice or speech recognition systems are available for entering text or triggering commands. However, the accuracy of such systems often leaves much to be desired, even after user training and calibration. Furthermore, a full-featured voice recognition system often requires processing and memory resources that are not typically found on mobile communication or computing devices, such as cellular telephones. As a result, speech recognition functions available in connection with mobile devices are often rudimentary, and usually geared towards recognizing a narrow subset of the spoken words in a language. Furthermore, speech recognition on mobile devices is often limited to triggering menu commands, such as accessing an address book and dialing a selected number.
The present invention is directed to solving these and other problems and disadvantages of the prior art. In accordance with embodiments of the present invention, speech recognition is used to filter or narrow a list of candidate composite characters, such as words (for example in connection with English language text) or characters (for example in connection with Chinese text). In particular, following a user's manual input of a letter, stroke or word shape of the word or character being entered, the user may speak that character. Speech recognition software then attempts to eliminate words or characters from the candidate list that sound different from the spoken word or character. Accordingly, even a relatively rudimentary speech recognition application can be effective in at least eliminating some words or characters from the candidate list. Furthermore, by first providing a letter, stroke or other component of a word or character through a selection or input of that component, the range of available or candidate words or characters is more narrowly defined, which can reduce the accuracy required of the speech recognition application in order to further narrow that range (i.e., narrow the candidate list) or positively identify the word or character that the user seeks to enter.
In accordance with embodiments of the present invention, a word or character may be included in a list of words or characters (collectively referred to herein as “characters”) available for selection by a user in response to user input indicating that a particular component of a word or character, such as a letter (for example in the case of an English word) or a stroke or word shape (for example in the case of a Chinese character), is included in the desired character. In addition, the list of characters can be narrowed in response to speech input from the user. In particular, in response to the receipt of speech input from the user that can be used to identify characters in the candidate list that are associated (or not) with the received speech, the content of the candidate list is altered. Accordingly, entry of characters is facilitated by providing a shorter list of candidate words or characters, or by the identification of an exact character, through the combined use of a component of the desired character input by a user, and speech recognition that receives as input the user's pronunciation of the desired character.
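The combined use of a manually entered component and speech input may be sketched as two successive filters. This is a minimal sketch under stated assumptions: the `Candidate` type, the `TABLE` contents and the simplified phonetic keys are hypothetical, a leading-letter test stands in for "includes the component", and an exact comparison of phonetic keys stands in for an actual speech recognition application.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    text: str           # the word or character as displayed
    pronunciation: str  # simplified phonetic key (hypothetical notation)


# Hypothetical table of candidate words.
TABLE: List[Candidate] = [
    Candidate("cat", "kat"),
    Candidate("car", "kar"),
    Candidate("card", "kard"),
    Candidate("dog", "dog"),
]


def filter_by_component(table: List[Candidate], component: str) -> List[Candidate]:
    """Keep candidates that begin with the manually entered component."""
    return [c for c in table if c.text.startswith(component)]


def filter_by_speech(candidates: List[Candidate], heard: str) -> List[Candidate]:
    """Keep candidates whose pronunciation matches what was heard."""
    return [c for c in candidates if c.pronunciation == heard]


after_component = filter_by_component(TABLE, "ca")
after_speech = filter_by_speech(after_component, "kar")
print([c.text for c in after_component], [c.text for c in after_speech])
```

Here the manual input reduces four entries to three, and the speech input reduces those three to the single desired word.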
With reference now to
A communication or computing device 100 may additionally include memory 108 for use in connection with the execution of programming by the processor 104 and for the temporary or long term storage of data or program instructions. The memory 108 may comprise solid state memory, such as DRAM and SDRAM, that is resident, removable or remote in nature. Where the processor 104 comprises a controller, the memory 108 may be integral to the processor 104.
In addition, the communication or computing device 100 may include one or more user inputs 112 and one or more user outputs 116. Examples of user inputs 112 include keyboards, keypads, touch screen inputs, and microphones. Examples of user outputs 116 include speakers, display screens (including touch screen displays) and indicator lights. Furthermore, it can be appreciated by one of skill in the art that the user input 112 may be combined or operated in conjunction with a user output 116. An example of such an integrated user input 112 and user output 116 is a touch screen display that can both present visual information to a user and receive input selections from a user.
A communication or computing device 100 may also include data storage 120 for the storage of application programming and/or data. In addition, operating system software 124 may be stored in the data storage 120. The data storage 120 may comprise, for example, a magnetic storage device, a solid state storage device, an optical storage device, a logic circuit, or any combination of such devices. It should further be appreciated that the programs and data that may be maintained in the data storage 120 can comprise software, firmware or hardware logic, depending on the particular implementation of the data storage 120.
Examples of applications that may be stored in the data storage 120 include the speech recognition application 128 and word or character selection application 132. In addition, the data storage 120 may contain a table or database of candidate words or characters 134. As described herein, a speech recognition application 128, character selection application 132 and/or table of candidate words or characters 134 may be integrated with one another, and/or operate in cooperation with one another. The data storage 120 may also contain application programming and data used in connection with the performance of other functions of the communication or computing device 100. For example, in connection with a communication or computing device 100 such as a cellular telephone, the data storage may include communication application software. As another example, a communication or computing device 100 such as a personal digital assistant (PDA) or a general purpose computer may include a word processing application and data storage 120. Furthermore, according to embodiments of the present invention, a speech recognition application 128 and/or character selection application 132 may operate in cooperation with communication application software, word processing software or other applications that can receive words or characters entered or selected by a user as input.
A communication or computing device 100 may also include one or more communication network interfaces 136. Examples of communication network interfaces include cellular telephony transceivers, a network interface card, a modem, a wireline telephony port, a serial or parallel data port, or other wireline or wireless communication network interface.
With reference now to
When in a text entry or selection mode, a user can, in accordance with embodiments of the present invention, cause a partial or complete list containing one or more words or characters to be displayed in the display screen 216, in response to input comprising specified letters, strokes or word shapes entered by the user through the keypad 204. As can be appreciated by one of skill in the art, each key included in the keypad may be associated with a number of letters or character shapes, as well as with other symbols. For instance, the keypad 204 in the example of
The list of candidate characters created as a result of the selection of letters or word shapes is displayed, at least in part, by the visual display 216. If the list is long enough that it cannot all be conveniently presented in the display 216, the cursor button 208 or some other input 112 may be used to scroll through the complete list. The cursor button 208 or other input 112 may also be used in connection with the selection of a desired character, for example by highlighting the desired character in a displayed list using the cursor button 208 or other input 112, and then selecting that character by, for example, pressing the enter button 212. In addition, as described herein, the list of candidate characters can be narrowed based on speech provided by the user to the device 100 through the microphone 214 that is then processed by the device 100, for example, through the speech recognition application 128. Furthermore, the speech recognition application 128 functions in cooperation with the character selection application 132 such that the speech recognition application 128 tries to identify characters included in a list generated by the character selection application 132 in response to manual or other user input specifying a component of the desired character, rather than trying to identify all words that may be included in the speech recognition application 128 vocabulary.
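The benefit of restricting recognition to the candidate list, rather than to the full vocabulary, may be illustrated as follows. This is a sketch only: string similarity from Python's `difflib.SequenceMatcher` is used as a stand-in for acoustic matching, and the vocabulary, the phonetic keys and the function name `best_match` are assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict

# Hypothetical full vocabulary: word -> simplified phonetic key.
FULL_VOCABULARY: Dict[str, str] = {
    "cat": "kat",
    "car": "kar",
    "cart": "kart",
    "bar": "bar",
    "tar": "tar",
}


def best_match(heard: str, pronunciations: Dict[str, str]) -> str:
    """Return the word whose phonetic key is most similar to `heard`."""
    def score(word: str) -> float:
        return SequenceMatcher(None, heard, pronunciations[word]).ratio()
    return max(pronunciations, key=score)


# Candidate list produced by manual entry of the letter "c".
candidate_list = {w: p for w, p in FULL_VOCABULARY.items() if w.startswith("c")}

# Suppose a rudimentary recognizer mishears "car" as "tar".
unconstrained = best_match("tar", FULL_VOCABULARY)
constrained = best_match("tar", candidate_list)
print(unconstrained, constrained)
```

Against the full vocabulary the misheard input selects the wrong word, whereas against the candidate list the closest remaining entry is the intended one. This illustrates why a manually bounded list can reduce the accuracy required of the recognizer.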
With reference now to
The user may then choose to narrow the candidate list by providing speech input. Accordingly, a determination may then be made as to whether speech input from the user is received and recognized as representing or being associated with a pronunciation of a candidate character (step 320). In particular, speech received, for example through a microphone 214, is analyzed by the speech recognition application 128 to determine whether a match with a candidate character can be made. If a match can be made, a revised list of candidate characters is created (step 324). As can be appreciated by one of skill in the art, even a rudimentary speech recognition application 128 may be capable of positively identifying a single character from the list, particularly when the list has been bounded through the receipt of one or more components that are included in the character that the user wishes to enter. As can also be appreciated by one of skill in the art, a speech recognition application 128 may be able to reduce the size of a list of candidate characters, even if a particular character cannot be identified from that list. For example, where the speech recognition application 128 is able to associate speech input by the user with a subset of the list of candidate characters, the revised list may comprise that subset of characters. Accordingly, a speech recognition application 128 may serve to eliminate from a list of candidates those words or characters that have a spoken sound that is different from the spoken sound of the desired word or character. As a result, the number of candidates that a user must (at least at this point) search in order to find a desired word or character is reduced. At least a portion of the revised list is then displayed to the user (step 328). Should the revised list contain too many candidates to be displayed simultaneously by a user output 116, such as a liquid crystal display 216, the user may again scroll through that list.
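The determination at step 320 and the creation of a revised list at step 324 may be sketched as follows. The sketch assumes, purely for illustration, that a rudimentary recognizer reports a set of plausible pronunciations (`heard_options`) rather than a single result; the function name `revise_candidates`, the word list and the phonetic keys are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical candidate list: word -> simplified phonetic key.
CANDIDATES: Dict[str, str] = {
    "ship": "ship",
    "sheep": "sheep",
    "shop": "shop",
    "shape": "sheip",
}


def revise_candidates(
    candidates: Dict[str, str], heard_options: Set[str]
) -> Tuple[bool, List[str]]:
    """Return (matched, revised list).

    Candidates that sound different from every plausible pronunciation
    are eliminated. If nothing matches, the list is left unchanged.
    """
    subset = [w for w, p in candidates.items() if p in heard_options]
    if subset:                       # step 320: a match can be made
        return True, subset          # step 324: revised list
    return False, list(candidates)   # no match: original list is kept


# The recognizer cannot tell "ship" from "sheep", but rules out the rest.
matched, revised = revise_candidates(CANDIDATES, {"ship", "sheep"})
# Speech that matches no candidate leaves the list as it was.
unmatched, unchanged = revise_candidates(CANDIDATES, {"dog"})
print(matched, revised, unmatched, unchanged)
```

Even though the desired word is not positively identified in the first call, the list that the user must search is halved.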
At step 332, a determination may again be made as to whether the user has selected one of the candidate characters. This determination may be made either after it is determined that the user has not provided speech in order to narrow the list of candidate characters, or after a revised list of candidate characters has been created and displayed (steps 324 and 328). If the user has selected a listed character, the process ends. The user may then exit the text mode or begin the process of selecting a next character.
If the user has not yet selected a listed character, the process may return to step 304, at which point the user may enter an additional component, such as an additional letter, stroke or word shape. The list of characters that may then be created at step 308 comprises a revised list of characters to reflect the additional component that has now been specified by the user. For instance, where a user has specified two letters or word shapes, those letters or word shapes may be required in each of the candidate characters. The resulting list may then be displayed, at least in part (step 312). After displaying the revised list to the user at step 312, the user may make another attempt at providing speech input in order to further reduce the number of candidate characters in the list (step 320). Alternatively, if a selection of a listed character is not made by the user at step 332, the user may decide not to provide additional input in the form of an additional component of the desired composite character at step 304 and may instead proceed to step 320, to make another attempt at narrowing the list of candidates by providing speech input. If additional speech input is provided, that input may be used to create a revised list of candidate characters (step 324) and that revised list can be displayed, at least in part, to the user (step 328). Accordingly, it can be appreciated that multiple iterations of specifying components of a word or character and/or providing speech to identify a desired word or character, or to at least reduce the size of the list of candidates, can be performed.
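The iterative process described above may be sketched as a loop over user events. This is one possible reading rather than a definitive implementation: the event tuples, the `run_session` function, the word table and the phonetic keys are hypothetical, exact key comparison stands in for speech recognition, and the sketch assumes that each new component rebuilds the list from the full table.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical table: word -> simplified phonetic key.
TABLE: Dict[str, str] = {
    "cat": "kat", "car": "kar", "card": "kard",
    "cot": "kot", "cup": "kup", "dog": "dog",
}

Event = Tuple[str, str]  # (kind, value); kind is "component", "speech" or "select"


def run_session(
    table: Dict[str, str], events: Iterable[Event]
) -> Tuple[Optional[str], List[List[str]]]:
    """Replay user events; return the selection and every list displayed."""
    prefix = ""
    candidates = sorted(table)
    shown: List[List[str]] = []
    for kind, value in events:
        if kind == "component":            # steps 304 to 312
            prefix += value
            candidates = [w for w in sorted(table) if w.startswith(prefix)]
            shown.append(candidates)
        elif kind == "speech":             # steps 320 to 328
            matched = [w for w in candidates if table[w] == value]
            if matched:                    # unrecognized speech changes nothing
                candidates = matched
                shown.append(candidates)
        elif kind == "select" and value in candidates:  # step 332
            return value, shown
    return None, shown


selected, shown = run_session(TABLE, [
    ("component", "c"),   # first letter
    ("speech", "zzz"),    # not recognized; list unchanged
    ("component", "a"),   # second letter
    ("speech", "kar"),    # recognized; list narrowed
    ("select", "car"),
])
print(selected, [len(s) for s in shown])
```

In this run the displayed list shrinks from five entries to three to one across successive iterations, after which the user selects the remaining entry.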
With reference now to
Because Chinese characters are formed from eight basic strokes, and because there are many thousands of Chinese characters in use, specifying two strokes of a desired character will typically result in the generation of a long list of candidate characters. A partial list 406a of candidate characters 408a-d that begin with the strokes 404 specified in the present example is illustrated in
Furthermore, even if the speech recognition software 128 is unable to discern the desired character from the spoken sound with reference to the list of candidate characters generated in response to one or more manually entered strokes, it should be able to narrow the list of candidate characters. For example, the speech recognition software 128 may not be able to discern between the second 408b (“wo”) and third 408c (“ngo”) characters based on the user's speech input while the list of candidate characters shown in
Although certain examples of embodiments of the present invention described herein have discussed using manual entry through keys in a keypad of one or more components of a desired word or character, and/or the selection of a desired word or character, embodiments of the present invention are not so limited. For example, manual entry may be performed by making selections from a touch screen display, or by writing a desired component in a writing area of a touch screen display. As a further example, the initial (or later) selection of a component or components of a word or character need not be performed through manual entry. For instance, a user may voice the name of the desired component to generate a list of words or characters that can then be narrowed by voicing the desired word or character. In addition, embodiments of the present invention have application in connection with the selection and/or entry of text in any language where the “alphabet” or set of component parts of words or symbols is beyond what can be easily represented on a normal communication or computing device keyboard.
The foregoing discussion of the invention has been presented for purposes of illustration and description. Further, the description is not intended to limit the invention to the form disclosed herein. Consequently, variations and modifications commensurate with the above teachings, within the skill or knowledge of the relevant art, are within the scope of the present invention. The embodiments described hereinabove are further intended to explain the best mode presently known of practicing the invention and to enable others skilled in the art to utilize the invention in such or in other embodiments and with the various modifications required by their particular application or use of the invention. It is intended that the appended claims be construed to include alternative embodiments to the extent permitted by the prior art.