The invention relates generally to text editing, and more particularly, to a method and a system for selection of text for editing.
Speech recognition is a process of analyzing speech input to determine its content. Speech recognition systems are now widely used in many devices for controlling the functions of those devices. For example, a mobile phone user may speak the name of the person he or she wants to call. A processor in the mobile phone analyzes the user's speech using a speech recognition technique and dials that person's number.
Speech recognition is also used widely for dictation purposes. In a typical dictation application, a user provides speech input to a speech recognition system. The speech recognition system identifies the speech input by using acoustic models. The identified speech input is subsequently converted into recognized text and displayed to the user.
Speech recognition systems typically perform at well below 100% accuracy. Therefore, speech recognition systems normally also provide error correction for the recognized text. A typical error correction method includes proof-reading the recognized text, selecting a wrongly recognized word, and correcting the selected word. The user may correct the selected word by re-dictating it. The system may also generate an alternate word list for the selected word, and the user corrects the selected word by choosing the correct word from that list.
The wrongly recognized word may be selected using a mouse or another input pointing device. However, this is not convenient when the dictation function is used in devices which lack a pointing device, for example, mobile phones.
It is also possible to select the wrongly recognized word using voice. For example, the user may issue a voice command “edit word DATA”. The system then looks for the most recent occurrence of the word DATA and selects it. However, the selection of the wrongly recognized word using voice is prone to errors. Also, even when both modes of word selection using an input pointing device or a voice command are provided, the switching between these two modes of word selection is not convenient.
Therefore, it is desirable to have an improved and accurate way of selecting the wrongly recognized word or text unit for editing.
In an embodiment, a method for selection of text for editing is provided. The method includes inputting text to an apparatus and generating a label for at least one unit of the text as the text is being input to the apparatus. Accordingly, a user is able to select the at least one text unit for editing by selecting the corresponding label of the text unit.
The embodiments of the invention will be better understood in view of the following drawings and the detailed description.
In an alternative embodiment, the text may be directly provided to the data unit 102 in electronic form for processing. The text may be a Short Message Service (SMS) message received in a mobile phone which a user wishes to edit and retransmit. The text may also be pre-existing text received electronically by a device, for example a Personal Computer (PC) or a Personal Digital Assistant (PDA). Therefore, in this alternative embodiment, the SR unit 101 may be omitted.
A label unit 103 generates a label for one or more units of the text (text unit). The label for a text unit may be a unique number, character, word or symbol. Each label corresponds to one text unit. Accordingly, the user is able to select each text unit by selecting its corresponding label. A text unit may be a character, a word, a phrase, a sentence, a line of the text or any other suitable unit. The text unit may be defined by the user using a definition unit 104 in an embodiment. It is possible to define the text unit to be a word by default in one embodiment. In another embodiment, a line of the text may be defined as a primary text unit, and a word may be defined as a secondary text unit.
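The behavior of the label unit described above can be sketched as follows. This is an illustrative sketch only, not the actual implementation of the label unit 103; the function name and the choice of numeric labels are assumptions made for the example.

```python
# Hypothetical sketch of the label unit: split the text into units
# (words by default, or lines) and assign each unit a unique numeric
# label, so that label N corresponds to exactly one text unit.
def generate_labels(text, unit="word"):
    if unit == "word":
        units = text.split()
    elif unit == "line":
        units = text.splitlines()
    else:
        raise ValueError("unsupported text unit: " + unit)
    # Labels 1, 2, 3, ... each map to exactly one text unit.
    return {label: u for label, u in enumerate(units, start=1)}
```

For example, `generate_labels("the quick brown")` yields a mapping in which label 1 corresponds to "the", label 2 to "quick", and label 3 to "brown", so each word can be selected by its number.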
The system 100 may include a dictionary unit 105 in one embodiment. The dictionary unit 105 compares the text with a dictionary to determine if the text is correct. The dictionary unit 105 may be a separate unit, or included as part of the SR unit 101. In an embodiment, the label unit 103 generates labels only for text units which have been identified as wrong by the dictionary unit 105.
The system 100 further includes a display unit 106 for displaying the text and its corresponding labels on a display screen. In the embodiment where the system includes the dictionary unit 105, only the text units identified as wrong by the dictionary unit 105 are displayed with a label by the display unit 106. The display unit 106 may be a monitor in an embodiment.
When the user dictates to the system 100, he or she is able to see the text units and their corresponding labels being displayed. When the user spots an error in any one of the displayed text units, he or she selects the corresponding label through an input unit 107 of the system 100. The input unit 107 may include a speech recognition system in one embodiment. In this embodiment, the user selects the desired label by dictating the corresponding label. Accordingly, a speech input is provided by the user through dictation to the speech recognition system in the input unit 107 and is recognized. Based on the recognized speech input, the corresponding label is selected. In an alternative embodiment, the input unit 107 may be a keyboard and the user selects the label by pressing one or more corresponding keys on the keyboard. The system 100 identifies the text unit corresponding to the label selected by the user, and allows the user to edit the text unit, for example, by re-dictating the text for the text unit.
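The label-to-text-unit resolution performed by the system can be sketched as below. This is a minimal sketch under the assumption that labels are 1-based numbers assigned to the text units in order; the function name and replacement mechanism are illustrative, not the patented implementation.

```python
# Illustrative sketch: resolve a label selected by the user (e.g. a
# dictated number recognized as the string "3", or a typed key) to its
# text unit, and replace that unit with the re-dictated text.
def edit_by_label(units, selected_label, replacement):
    # Assumption: labels are 1-based numbers assigned in order, so
    # label N maps to units[N - 1].
    index = int(selected_label) - 1
    if not 0 <= index < len(units):
        raise IndexError("no text unit with label " + str(selected_label))
    units[index] = replacement
    return " ".join(units)
```

For example, if the recognized text units are `["the", "quick", "broom"]` and the user selects label "3" and re-dictates "brown", the corrected text becomes "the quick brown".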
The program memory 204 stores data and programs such as the operating system 210, the SR unit 101, the data unit 102, the label unit 103, the definition unit 104 and the dictionary unit 105 of the system 100. The I/O unit 202 provides an input and output interface between the computer system 200 and I/O devices such as the display 205 and keyboard 206. The sound card 203 converts analog speech input captured by the microphone 207 into digital speech samples. The digital speech samples are received by the SR unit 101 as speech input. Subsequent processing of the speech input is similar to the processing by the system 100 as already described above.
It should be noted that the computer system 200 described above is only one possible implementation of the system 100. The system 100 may be implemented in other devices, such as a mobile phone, in other embodiments.
Step 302 includes generating a label for at least one unit of text as the text is being input to the apparatus. In an embodiment, a label is generated automatically for every text unit received by the apparatus. The label generated is unique and associated with the corresponding text unit. Accordingly, a user can select a text unit simply by selecting the label associated with the text unit.
To illustrate the method of selection of text for editing according to the embodiment, a detailed example of the method is described.
Step 401 includes providing speech input. The speech input may be provided by a user dictating to the microphone 207 connected to the computer system 200. The sound card 203 receives and converts the analog speech input into digital speech input for further processing by the computer system 200. Step 402 includes allowing the user to decide whether to select a speech recognition system for processing the speech input. The computer system 200 may include several speech recognition systems in the SR unit 101. The computer system 200 may display the available speech recognition systems to the user on the display 205, and the user selects the desired speech recognition system using the keyboard 206 at Step 403. Alternatively, the computer system 200 always uses a default speech recognition system unless the user chooses another speech recognition system to be used. In an embodiment, the computer system 200 may include only one speech recognition system. Accordingly, Step 402 and Step 403 may be omitted in this embodiment.
Step 404 includes converting the speech input into text. The conversion from speech to text is usually done by the speech recognition system after the speech input has been recognized. The converted text becomes the text input for the data unit 102. Step 405 includes asking the user whether he or she wants to define a text unit. A text unit may be defined as a character, a word, a phrase, a sentence, a line of the text or any other suitable unit. If the user wants to define the text unit, he or she defines the text unit at Step 406. In an embodiment, the text unit is defined as a word by default. In an alternative embodiment, the user may define a line as a primary text unit and a word as a secondary text unit. The text unit (primary and/or secondary) definitions made by the user at Step 406 may be set as default, and hence, Step 405 and Step 406 may be omitted for subsequent processing. The user proceeds to Step 406 only if he or she wants to change the definitions of the text unit.
Step 407 includes selecting the dictionary mode. In a default mode, the dictionary mode is not selected, and the label unit 103 of the computer system 200 proceeds to generate the labels for each text unit in Step 408. The labels for the text units may be numbers (for example 1, 2, 3, . . . ), characters (for example a, b, c, . . . ), symbols (for example @, #, $, . . . ) or words, or any labels that can be accurately recognized by the speech recognition system. In an alternative embodiment when the dictionary mode is selected at Step 407, the label unit 103 generates the labels only for text units which are identified as wrong by the dictionary unit 105 in Step 409.
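The dictionary-mode labelling of Step 409 can be sketched as below. This is a hedged sketch assuming a simple set-based dictionary lookup; the actual dictionary unit 105 may use any suitable dictionary representation, and the function name is an assumption for the example.

```python
# Sketch of dictionary-mode labelling (Step 409): only text units
# missing from the dictionary are considered wrong, and only those
# units receive a label.
def label_wrong_units(units, dictionary):
    labels = {}
    next_label = 1
    for position, unit in enumerate(units):
        if unit.lower() not in dictionary:
            # Map the position of the wrong unit to its label.
            labels[position] = next_label
            next_label += 1
    return labels
```

For example, with units `["the", "qick", "fox"]` and a dictionary containing "the", "quick" and "fox", only the misrecognized unit "qick" at position 1 receives a label, so the display remains uncluttered.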
Step 410 includes displaying the text units and the generated labels. If the dictionary mode was selected at Step 407, all the text units and the labels for those text units identified as wrong by the dictionary unit 105 are displayed. If the dictionary mode was not selected, all the text units and their corresponding labels are displayed. Each generated label may be displayed adjacent to its corresponding text unit in an embodiment. In alternative embodiments, each generated label may be displayed above or below its corresponding text unit.
Step 411 includes choosing a mode for selecting the text unit for editing. In a default mode, speech selection mode is chosen. It is also possible for the user to choose a keyboard selection mode at Step 411. In the default speech selection mode at Step 413, the user selects the desired text unit by dictating the corresponding label of the desired text unit. In the keyboard selection mode at Step 412, the user selects the desired text unit by pressing one or more keys of the keyboard which corresponds to the label of the text unit.
In the embodiment when a line is defined as the primary text unit and a word as the secondary text unit, primary labels for the lines and secondary labels for the words in each line are generated. The labels for each line of the text are displayed. When one of the lines is selected, the secondary labels for the selected line are also displayed. It should be noted that the primary and the secondary text units may be defined differently in other embodiments. For example, the primary text unit may be defined as a paragraph, and the secondary text unit may be defined as a line in another embodiment.
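The two-level labelling described above can be sketched as follows. This is an illustrative sketch assuming a line is the primary text unit and a word the secondary text unit; the function names are assumptions made for the example.

```python
# Sketch of two-level labelling: each line receives a primary label;
# once a line is selected, the words of that line receive secondary
# labels so the user can narrow the selection to a single word.
def primary_labels(text):
    return {i + 1: line for i, line in enumerate(text.splitlines())}

def secondary_labels(line):
    return {i + 1: word for i, word in enumerate(line.split())}
```

For example, for the text "one two\nthree four", the lines receive primary labels 1 and 2; selecting line 2 then yields secondary labels 1 and 2 for "three" and "four".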
Once the desired text unit is selected at Step 412 or Step 413, the user edits the selected text unit at Step 414. In an embodiment, the user edits the selected text unit by re-dictation. The user may also edit the selected text unit by entering the desired text unit using the keyboard or choosing from a list of alternative text units in other embodiments.
Although the present invention has been described in accordance with the embodiments as shown, one of ordinary skill in the art will readily recognize that there could be variations to the embodiments and those variations would be within the spirit and scope of the present invention. Accordingly, many modifications may be made by one of ordinary skill in the art without departing from the spirit and scope of the appended claims.
Filing Document | Filing Date | Country | Kind | 371c Date
---|---|---|---|---
PCT/IN05/00349 | 10/31/2005 | WO | 00 | 3/18/2008