This application is directed, in general, to devices, systems and methods for controlling operation of electronic devices.
Various electronic devices include a keypad for data entry. The keypad may be used in some contexts, such as telephone dialing, to enter a single alphanumeric character, e.g. a digit, corresponding to each key. In other contexts the keys may be associated with two or more alphanumeric characters. For example, on the familiar telephone keypad the “number 2” key is associated with “A”, “B”, “C” and “2”. With a key modifier, the key may also be associated with “a”, “b” and “c”. Data entry sometimes includes first pressing the key of interest, and then pressing it again one or more times to select the desired alphanumeric character. Such data entry may be cumbersome and unreliable for some users of such devices.
One embodiment provides an electronic device configured to receive data from a keypad key, wherein the key is associated with first and second alphanumeric characters. The device includes a keypad interface and a data entry processor. The keypad interface is configured to determine the first and second alphanumeric characters when the key is pressed. The data entry processor is configured to select the first alphanumeric character from among the first and second alphanumeric characters when a speech recognizer determines that a spoken entry identifies the first alphanumeric character.
Another embodiment provides a system for entering data into an electronic device. The system includes a receiver, a data discriminator, a speech recognizer and a character transmitter. The receiver is configured to receive keypad entry data from the electronic device. The data discriminator is configured to determine a pressed key from among at least a first key and a second key of the keypad. The speech recognizer is configured to receive a spoken entry that corresponds to a first or a second alphanumeric character associated with the pressed key. The character transmitter is configured to transmit to the electronic device a signal indicating which of the first and second alphanumeric characters is designated by the spoken entry.
Yet another embodiment provides a method, e.g. for forming a keypad-operated electronic device. The method includes configuring a keypad interface to determine that a keypad key has been pressed. A speech recognizer is provided that is configured to process a spoken entry including a spoken equivalent of a first alphanumeric character associated with the key. A data entry processor is coupled to the speech recognizer. The data entry processor is configured to select the first alphanumeric character from among a plurality of alphanumeric characters associated with the key when the speech recognizer determines that the spoken entry identifies the first alphanumeric character.
Reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
Various embodiments described herein provide devices, systems and methods for improving data entry into an electronic device that employs a keypad for data entry. As hand-held electronic devices have become smaller, and include a greater number of features, the complexity of data entry into such devices has increased. Such data sometimes includes, e.g. phone numbers, email messages, text messages and address information. Difficulty entering such data increases the time needed to accurately enter the data, and sometimes causes user frustration.
Some strategies for easing the burden of data entry are possible, but each is deficient in one or more ways. For example, some cellular phones employ a method of multiple key presses, such as first pressing the key of interest, and then pressing it again one or more times to select the desired alphanumeric character. Not only is this system cumbersome, but for users who have large fingers, it may be difficult or nearly impossible to reliably press a single key. Speech recognition may be possible in theory, but typically requires complex algorithms, more powerful processing hardware, greater memory, and a relatively quiet ambient environment.
The inventors have recognized that data entry to an electronic device may be improved by combining key entry with targeted speech recognition. In various embodiments of the invention, a key may first be pressed. The key is assigned to an alphanumeric character, and associated with one or more other alphanumeric characters. After a user presses the key, the user may speak the assigned or other associated alphanumeric characters. The electronic device or a server in communication with the device may then determine the spoken character, constraining a character search to the assigned and associated characters. The search may therefore be faster and/or require fewer hardware and/or computational resources. Moreover, by constraining the character search, the determination of the selected character is expected to be significantly more robust to background noise that might otherwise mask the spoken character. When the selected, e.g. spoken, character is determined, the device may then register the character in memory.
Herein, the term “alphanumeric character” may be shortened to “character” without loss of generality. Herein, the word “associated” in the context of alphanumeric characters means either: 1) characters assigned to a single key of a keypad, or 2) characters assigned to keys that are the immediate neighbors of a pressed key. Thus, as described further below with reference to
Various embodiments of the disclosure are now presented with reference to the figures. These figures may include various functional modules, and the discussion may include reference to these modules and describe various module functions and relationships between the modules. Those skilled in the art will recognize that the boundaries between such modules are merely illustrative and alternative embodiments may merge modules or impose an alternative decomposition of functionality of modules. For example, the modules discussed herein may be decomposed into sub-modules to be executed as multiple computational processes and, optionally, on multiple electronic devices, e.g. integrated circuits. Moreover, alternative embodiments may combine multiple instances of a particular module or sub-module. Furthermore, those skilled in the art will recognize that the functions described in the example embodiments are for illustration only. Operations may be combined or the functionality of the functions may be distributed in additional functions in accordance with the invention.
Turning to
Each of the keys “2”-“9” is associated with a number of characters. For example, each of these keys has a primary assigned character, e.g. “2”. . . “9”. In addition, each includes a number of secondary characters. For example, the secondary characters assigned to the “2” key are “A”, “B” and “C”. Conventionally these characters may be entered into various data fields by the aforementioned technique of multiple key presses. In some cases, the lower case versions of the illustrated secondary characters may also be entered using the multiple key press method.
The device 300 includes a keypad 310, e.g. the keypad 100, a keypad interface 320, a speech-to-text (STT) interface 330, a transducer 340 and a data entry processor 350. The transducer 340 may include, e.g. a conventional microphone element and an analog-to-digital converter (ADC). The keypad interface 320, STT interface 330 and data entry processor 350 may be implemented by a processor and memory as well understood by those skilled in the pertinent art. Embodiments of the invention are not limited to any particular implementation, which may include without limitation, e.g. a commercial or proprietary integrated circuit, state machine, programmable logic, microcontroller or digital signal processor (DSP).
The keypad 310 has a set of characters that may be produced by appropriate selection of keys. For example, the complete set may include a . . . z, A . . . Z, 0 . . . 9 and some punctuation characters. The keypad interface 320 detects a key press on the keypad 310. The keypad interface 320 is configured to select from the character set a subset of characters that includes the primary character assigned to the pressed key, as well as any secondary characters. Thus, for example, when the “5” key is pressed, the keypad interface 320 may report the character subset {5, j, k, l, J, K, L} to the STT interface 330.
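The subset selection described above may be illustrated by the following minimal Python sketch. The table name KEY_LETTERS and the function name character_subset are hypothetical and are introduced only for illustration; the letter assignments follow the familiar telephone keypad layout, and the ordering of the subset is an assumption.

```python
# Hypothetical table of secondary characters for the keys "2" through "9"
# of a familiar telephone keypad.
KEY_LETTERS = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}


def character_subset(key: str) -> list:
    """Return the primary character of the pressed key, followed by the
    lower case and then the upper case secondary characters."""
    letters = KEY_LETTERS.get(key, "")  # keys without letters yield ""
    return [key] + list(letters.lower()) + list(letters)


print(character_subset("5"))  # ['5', 'j', 'k', 'l', 'J', 'K', 'L']
```

A key with no secondary characters, e.g. “0” in this sketch, yields a subset containing only its primary character.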
After pressing the key, a user of the device 300 may then speak one of the characters associated with the pressed key. Continuing the previous example, after pressing the “5” key, the user may speak “j” (pronounced “jay”). The STT interface 330 receives the character subset from the keypad interface 320, and the spoken character from the transducer 340. The STT interface 330 then uses a speech recognition algorithm to determine the spoken character.
As appreciated by those skilled in the pertinent art, speech recognition may include an algorithm that implements a computational model such as the hidden Markov model (HMM). The HMM may include a Viterbi algorithm that may determine a most likely fit between an acoustic signature and a corresponding word.
Unlike a conventional speech recognition algorithm, the speech recognition algorithm of the STT interface 330 is configured to select a character from among the character subset provided by the keypad interface 320. Thus, not only is the universe of possible characters constrained relative to the full character set, but also the STT interface 330 need only detect and fit to a small number of sounds. For instance, in English many of the letters of the alphabet are spoken as a long “E” sound (International Phonetic Alphabet symbol i:) with a unique leading consonant. Because of the limited number of unique sounds available in the full character set, and the further reduction of the number of sounds in the character subset, the complexity of the STT interface 330 may be significantly reduced relative to a conventionally configured speech recognition algorithm. Thus the STT interface 330 may be implemented using significantly fewer computational and hardware resources than are needed for a conventional speech recognition algorithm.
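The effect of constraining the search may be illustrated by the following Python sketch. It assumes, purely for illustration, that some acoustic front end has already produced a likelihood score for each candidate character; the scores shown are invented example values, and the function name constrained_match is hypothetical.

```python
from typing import Dict, List, Optional


def constrained_match(scores: Dict[str, float], subset: List[str]) -> Optional[str]:
    """Return the best-scoring character, considering only characters
    in the subset associated with the pressed key."""
    candidates = {c: s for c, s in scores.items() if c in subset}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Invented example scores for a noisy utterance of "bee". The similar
# sounding "dee" scores slightly higher across the full character set.
scores = {"b": 0.41, "d": 0.44, "e": 0.10, "2": 0.05}

unconstrained = max(scores, key=scores.get)                    # picks "d"
constrained = constrained_match(scores, ["2", "a", "b", "c"])  # picks "b"
print(unconstrained, constrained)
```

In this example the similar sounding “d” is never considered once the “2” key has been pressed, because it is not among the characters associated with that key.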
In some embodiments the STT interface 330 may be configured to additionally recognize a small number of modifier keywords. For example, pressing the “2” key and speaking “bee” might indicate a lower case “b” by default. The user might press the “2” key and speak “upper bee” to indicate an upper case “B” is desired. The STT interface 330 may be configured to recognize the word “upper” and modify the selected character accordingly. Alternatively, the STT interface 330 may default to select an upper case character, and select the lower case equivalent only when the user speaks “lower”. Thus, a spoken entry may include in various embodiments a modifier keyword and a character to be modified. Those skilled in the pertinent art will appreciate this strategy may be implemented in many different ways without departing from the scope of the disclosure.
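One of the many possible ways to handle the modifier keyword may be sketched in Python as follows. The sketch assumes that the spoken entry has already been reduced to a sequence of recognized words; the table SPOKEN_NAMES, the function parse_spoken_entry and the default_upper parameter are hypothetical names used only for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical spoken names for the characters associated with the "2" key.
SPOKEN_NAMES = {"two": "2", "ay": "a", "bee": "b", "see": "c"}


def parse_spoken_entry(
    words: List[str],
    spoken_names: Dict[str, str],
    default_upper: bool = False,
) -> Optional[str]:
    """Return the selected character, applying an optional leading
    "upper" or "lower" modifier keyword; return None if nothing matches."""
    tokens = list(words)
    modifier = None
    if tokens and tokens[0] in ("upper", "lower"):
        modifier = tokens.pop(0)
    if len(tokens) != 1 or tokens[0] not in spoken_names:
        return None
    character = spoken_names[tokens[0]]
    upper = default_upper if modifier is None else (modifier == "upper")
    # Digits are unaffected by the case conversion.
    return character.upper() if upper else character.lower()


print(parse_spoken_entry(["bee"], SPOKEN_NAMES))           # b
print(parse_spoken_entry(["upper", "bee"], SPOKEN_NAMES))  # B
```

Setting default_upper to True corresponds to the alternative in which the upper case character is selected unless the user speaks “lower”.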
The data entry processor 350 receives the selected character from the STT interface 330 after the STT interface 330 has identified the character specified by the combination of the key press and the spoken character. The data entry processor 350 interfaces with other portions of the device 300 as necessary to effect the character entry, e.g. to a data memory or display memory (not shown).
In the step 420 the keypad interface 320 determines which key is pressed. In a step 430 the keypad interface determines the character subset that is associated with the pressed key. In a step 440 the keypad interface passes the character subset to the STT 330. The STT 330 is configured to match received spoken characters only to characters in the character subset.
In a step 450 the transducer 340 receives a spoken entry and creates a digital representation of the received character. In a step 460 the STT 330 attempts to match the received spoken character to one of the characters in the character subset associated with the pressed key. The matching may include determining if the received spoken entry includes a modifier keyword, such as “upper” as previously described. Thus the STT 330 may include a limited parsing routine to determine the appropriate action to take upon receipt of the modifier keyword. If a match is determined to exist with sufficient confidence, the method 400 advances to a step 470 from which the matching character is reported to the data entry processor 350. If no match is found the method 400 returns to the step 450 to receive another spoken character. The method 400 may optionally, in a step not shown, include a counter to determine if a number of match attempts exceeds a predetermined maximum. If so, the method 400 may return to the step 410 to restart the character identification procedure.
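The matching loop of the steps 450-470, together with the optional attempt counter, may be sketched in Python as follows. The recognizer is represented by a stub, and the names identify_character, stub_recognize, min_confidence and max_attempts, as well as the threshold value, are assumptions made only for illustration.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def identify_character(
    subset: List[str],
    spoken_entries: Iterable[object],
    recognize: Callable[[object, List[str]], Tuple[Optional[str], float]],
    min_confidence: float = 0.6,
    max_attempts: int = 3,
) -> Optional[str]:
    """Try successive spoken entries until one matches a character in the
    subset with sufficient confidence, or the attempt limit is reached."""
    attempts = 0
    for entry in spoken_entries:                          # step 450
        if attempts >= max_attempts:
            break                                         # caller may restart at step 410
        attempts += 1
        character, confidence = recognize(entry, subset)  # step 460
        if character in subset and confidence >= min_confidence:
            return character                              # step 470
    return None


def stub_recognize(entry, subset):
    """Stand-in recognizer: each entry is already a (character, confidence) pair."""
    character, confidence = entry
    return (character, confidence) if character in subset else (None, 0.0)


subset_for_5 = ["5", "j", "k", "l"]
print(identify_character(subset_for_5, [("j", 0.3), ("j", 0.9)], stub_recognize))  # j
```

A return value of None indicates that no match was found within the permitted number of attempts, at which point the character identification procedure may be restarted.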
The device 510 may share various features described with respect to the device 300, e.g. a keypad, processor and memory (not shown). The device 510 also includes a transmitter 515 configured to communicate with the server 520 via the connection 525.
The server 520 includes a receiver 530, character discriminator 540, STT 550 and transmitter 560. The discriminator 540 and STT 550 may be implemented by, e.g. a controller or microprocessor in combination with a memory for storing program instructions and transient data.
The device 510 may be configured to transmit to the server 520 the identity of a pressed key. The key may be identified by any method consistent with the nature of the connection 525. For example, when the device 510 is a phone the key may be identified within the voice band, e.g. by DTMF signaling, or out of band by a control signal channel. Other types of electronic devices may, e.g. report the pressed key via a sequence of internet data packets. The receiver 530 receives the signal from the device 510 indicating the pressed key.
The user of the device 510 may then speak the desired character associated with the pressed key. The device 510 conveys the spoken character to the receiver 530 via the connection 525, e.g. by cellular connection or internet. The receiver 530 passes the identity of the pressed key and the spoken character to the discriminator 540. The discriminator 540 operates analogously to the keypad interface 320 to determine a subset of characters that may be associated with the pressed key, and passes the subset to the STT 550.
The STT 550 also receives the spoken character from the receiver 530. The STT 550 operates analogously to the STT 330 to determine from the spoken character which of the characters associated with the pressed key is selected by the user. The STT 550 passes the identified character to the character transmitter 560. The character transmitter 560 transmits the selected character to the device 510, e.g. via an out of band signal or an internet message. The device 510 may then register the selected character by storing the character in memory and/or displaying the character.
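The flow through the server 520 may be sketched in Python as follows. The message shapes KeypadEntry and CharacterReply, the functions discriminate and handle_entry, and the stub recognizer are all hypothetical; the disclosure does not prescribe any particular message format or programming interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical table of secondary characters for a telephone keypad.
KEY_LETTERS = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}


@dataclass
class KeypadEntry:
    """What the receiver obtains from the device (assumed shape)."""
    pressed_key: str
    spoken_entry: bytes  # digitized audio of the spoken character


@dataclass
class CharacterReply:
    """What the character transmitter sends back (assumed shape)."""
    character: Optional[str]


def discriminate(pressed_key: str) -> List[str]:
    """Discriminator: determine the character subset for the pressed key."""
    letters = KEY_LETTERS.get(pressed_key, "")
    return [pressed_key] + list(letters.lower())


def handle_entry(
    entry: KeypadEntry,
    recognize: Callable[[bytes, List[str]], Optional[str]],
) -> CharacterReply:
    """Receiver -> discriminator -> STT -> character transmitter."""
    subset = discriminate(entry.pressed_key)
    character = recognize(entry.spoken_entry, subset)
    return CharacterReply(character if character in subset else None)


# Stand-in recognizer: treats the audio bytes as the already-recognized letter.
stub_recognize = lambda audio, subset: audio.decode("ascii")

print(handle_entry(KeypadEntry("2", b"b"), stub_recognize))
```

In this sketch a reply carrying no character indicates that the spoken entry did not match any character associated with the pressed key.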
Turning to
In a step 610, a keypad interface is configured to determine that a keypad key, e.g. a key of the keypad 310, has been pressed. In a step 620 a speech recognizer is configured to process a spoken entry including a spoken equivalent of a first alphanumeric character associated with the key. For example, the “2” key of the keypad 310 may be associated with “2”, “A”, “B”, or “C”, and the spoken entry may include the spoken equivalent of one of these characters. In a step 630 a data entry processor is configured to select the first alphanumeric character from among a plurality of alphanumeric characters associated with the key, e.g. “2”, “A”, “B”, or “C”, when the speech recognizer determines that the spoken entry identifies the first alphanumeric character.
In some embodiments the method 600 further includes a step 640, in which the speech recognizer is configured to constrain possible alphanumeric character matches to only alphanumeric characters associated with the pressed key.
In some of the above-described embodiments, the speech recognizer is collocated with a server remote from the electronic device.
In some of the above-described embodiments, the keypad is a telephone keypad.
In some of the above-described embodiments, the electronic device and the server are configured to communicate via a cellular communication link.
Those skilled in the art to which this application relates will appreciate that other and further additions, deletions, substitutions and modifications may be made to the described embodiments.