Embodiments relate to a computing device having controls for navigating through dictated text.
A user typically interacts with a computer running a software program or application via a user interface (for example, a graphical user interface (GUI)). The user may use a touchpad, keyboard, mouse, or other input device to enter commands, selections, and other input. However, reading, navigating, and selecting particular portions of text and other elements in a graphical user interface may not be possible when a user has impaired vision or when it is impossible or impractical to view the graphical user interface (for example, when the user is driving or there is glare from the sun).
Thus, while graphical user interfaces are useful, there are times when an audio interface that narrates or dictates text is beneficial. Narration-based applications have been developed as a mechanism for providing an audio interface to applications designed for user interaction via a graphical user interface. In cases where a user cannot interact with the screen of a computing device (for example, a smart phone) and wishes to compose material (for example, an email), navigating through dictated text is difficult.
Embodiments of devices, methods, and systems provided herein provide a selection mechanism to facilitate navigation of dictated text. In one example, a pre-existing selection mechanism (for example, volume or microphone controls) is re-configured (or remapped) to navigate through dictated text and to select portions of the dictated text.
Some embodiments of a device, method, and system provided herein automatically modify the volume or microphone controls to permit a user to navigate through dictated text and select the dictated text for modification or replacement.
One embodiment provides a computing device. The computing device includes a housing, a selection mechanism included in the housing, a microphone to receive dictated text, a display device on which the dictated text is displayed, and an electronic processor. The electronic processor is configured to execute instructions to determine that the computing device is in at least one of a voice-recognition state and a playback state; modify a function associated with the selection mechanism based on determining that the computing device is in at least one of the voice-recognition state and the playback state; perform a first function using the selection mechanism, wherein the first function includes moving a cursor associated with the dictated text to a new position and generating an audio output associated with the new position of the cursor; and perform a second function, different from the first function, in response to selection of the selection mechanism when dictated text is not displayed on the display device.
Another embodiment provides a method for controlling navigation through dictated text displayed on a computing device. The method includes determining, with an electronic processor, that the computing device is in at least one of a voice-recognition state and a playback state. The method also includes modifying, with the electronic processor, a function associated with a selection mechanism when the computing device is in the voice-recognition state. The method also includes performing a first function using the selection mechanism, wherein the first function includes moving a cursor associated with the dictated text to a new position and generating an audio output associated with the new position of the cursor. The method further includes performing a second function, different from the first function, in response to selection of the selection mechanism when dictated text is not displayed on the display.
Yet another embodiment provides a controller for dictated text navigation. The controller includes a selection mechanism communicatively coupled to a display and an electronic processor. The electronic processor is configured to execute instructions to modify a function associated with the selection mechanism based on determining that the controller is in at least one of a voice-recognition state and a playback state; perform a first function using the selection mechanism, wherein the first function includes moving a cursor associated with the dictated text to a new position and generating an audio output associated with the new position of the cursor; and perform a second function, different from the first function, in response to selection of the selection mechanism when dictated text is not displayed on the display.
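By way of illustration only, the following Kotlin sketch models the state-dependent behavior summarized above: when the device is in the voice-recognition or playback state, a press of the selection mechanism moves the cursor through the dictated text and narrates its new position (the first function); otherwise, the same press performs the default volume control (the second function). The names used below (DeviceState, SelectionKey, DictationController) are hypothetical and do not correspond to any particular embodiment.

```kotlin
// Hypothetical, illustrative sketch only; not an actual device API.
enum class DeviceState { IDLE, VOICE_RECOGNITION, PLAYBACK }
enum class SelectionKey { VOLUME_UP, VOLUME_DOWN }

class DictationController {
    var state: DeviceState = DeviceState.IDLE
    var volume = 5                                  // default volume level
    private val words = mutableListOf<String>()     // dictated text, one entry per word
    private var cursor = 0                          // cursor position within the dictated text

    fun appendDictation(text: String) {
        words.addAll(text.split(" "))
    }

    fun onKey(key: SelectionKey) {
        if (state == DeviceState.VOICE_RECOGNITION || state == DeviceState.PLAYBACK) {
            // First function: move the cursor and narrate its new position.
            cursor = when (key) {
                SelectionKey.VOLUME_UP -> (cursor + 1).coerceAtMost(maxOf(words.lastIndex, 0))
                SelectionKey.VOLUME_DOWN -> (cursor - 1).coerceAtLeast(0)
            }
            narrate("Cursor at word ${cursor + 1}: ${words.getOrElse(cursor) { "" }}")
        } else {
            // Second function: default volume control when no dictated text is displayed.
            volume = when (key) {
                SelectionKey.VOLUME_UP -> (volume + 1).coerceAtMost(10)
                SelectionKey.VOLUME_DOWN -> (volume - 1).coerceAtLeast(0)
            }
        }
    }

    private fun narrate(message: String) = println("[audio] $message")
}
```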
The accompanying figures, where like reference numerals refer to identical or functionally similar elements throughout the separate views, together with the detailed description below, are incorporated in and form part of the specification, and serve to further illustrate embodiments of concepts that include the claimed invention, and explain various principles and advantages of those embodiments.
Skilled artisans will appreciate that elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. For example, the dimensions of some of the elements in the figures may be exaggerated relative to other elements to help to improve understanding of embodiments provided herein.
The apparatus and method components have been represented where appropriate by conventional symbols in the drawings, showing only those specific details that are pertinent to understanding the embodiments so as not to obscure the disclosure with details that will be readily apparent to those of ordinary skill in the art having the benefit of the description herein.
Before any embodiments are explained in detail, it is to be understood that the invention is not limited in its application to the details of construction and the arrangement of components set forth in the following description or illustrated in the accompanying drawings. The invention is capable of other embodiments and of being practiced or of being carried out in various ways.
The data storage 210 may include a tangible, machine-readable medium storing machine-readable data and information. For example, the data storage 210 may store a database.
The bus 220 or one or more other component interconnections communicatively couples or connects the components of the computing device 100 to one another. The bus 220 may be, for example, one or more buses or other wired or wireless connections. The bus 220 may have additional elements, which are omitted for simplicity, such as controllers, buffers (for example, caches), drivers, repeaters and receivers, or other similar components, to enable communications. The bus 220 may also include address, control, data connections, or a combination of the foregoing to enable appropriate communications among the aforementioned components.
The communication interface 212 provides the computing device 100 a communication gateway with an external network (for example, a wireless network, the internet, etc.). The communication interface 212 may include, for example, an Ethernet card or adapter or a wireless local area network (WLAN) card or adapter (for example, IEEE standard 802.11a/b/g/n). The communication interface 212 may include address, control, and/or data connections to enable appropriate communications on the external network.
In one example, the electronic processor 202 is configured to execute instructions to determine, maintain, or change between one of two states: a voice-recognition state (for example, when a dictation is being recorded) and a playback state (for example, when a recorded dictation is being played back). In one example, the electronic processor 202 enters the voice-recognition state when the microphone 105 has been activated by a voice that is recognized by the electronic processor 202. The electronic processor 202 may transition to the playback state when audio playback has been activated. In one embodiment, audio playback may be activated using audio playback controls associated with a software program 208. The electronic processor 202 may also be configured to execute instructions to modify a function associated with the input device 104 based on a determination of whether the electronic processor 202 is in either the voice-recognition state or the playback state. In one example, when a user selects a program (for example, a dictation application) within the computing device 100 to perform dictation of textual information, the electronic processor 202 remaps (or changes) the default function (for example, volume control) associated with the input device 104 to a function that provides navigation control. In one example, the remapping of the volume control to the navigation control enables the user of the computing device 100 to navigate through the dictated text by selecting, highlighting, and/or replacing portions of the dictated text. The computing device 100 may also provide an onscreen button (for example, a button shown on a touch screen display) that can be activated to begin dictation and/or replace highlighted text.
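The state handling described in this example may be sketched as follows (reusing the hypothetical DeviceState enumeration from the earlier sketch); the event hooks onMicrophoneActivated, onPlaybackActivated, and onDictationClosed are illustrative stand-ins for whatever signals a given device actually exposes.

```kotlin
// Hypothetical state transitions; the event hooks are illustrative, not an actual device API.
class StateTracker {
    var state = DeviceState.IDLE
        private set

    // Entered when the microphone is activated by a voice recognized by the processor.
    fun onMicrophoneActivated(voiceRecognized: Boolean) {
        if (voiceRecognized) state = DeviceState.VOICE_RECOGNITION
    }

    // Entered when audio playback of a recorded dictation is activated.
    fun onPlaybackActivated() {
        state = DeviceState.PLAYBACK
    }

    // Returning to the idle state restores the default key functions (for example, volume control).
    fun onDictationClosed() {
        state = DeviceState.IDLE
    }
}
```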
In another example, the function of the touch-sensitive button 103 may be modified from controlling a microphone to allowing the user to navigate through the dictated text using the touch-sensitive button 103. Upon modification, the input device may also be used to select portions of the dictated text that need to be modified or replaced. In one example, the electronic processor 202 is configured to execute instructions to move a cursor associated with the dictated text to a new position and generate an audio output narrating the new position of the cursor. In another example, the electronic processor 202 is configured to execute instructions to perform a volume control or microphone control function when the dictated text is not displayed on the display 102. The electronic processor 202 may be configured to receive and interpret audio instructions received using the microphone 105 to replace a selected portion of the dictated text with newly dictated text.
In one example, the input device 104 is configured to select a portion of the dictated text and replace it with new text received using the microphone 105. The input device 104 may select a particular portion of the dictated text by navigating a cursor in either a forward or a backward direction to reach that portion of the dictated text. In one embodiment, the input device 104 may be operated by an external device (for example, the volume controls on a pair of headphones) that is communicatively coupled (using Bluetooth connectivity) to the computing device 100. In one example, when the input device 104 is controlled using Bluetooth-enabled headphones, the Volume Up button is pressed to highlight the next word in relation to the position of a cursor. Similarly, the Volume Down button may be pressed to highlight the previous word in relation to the position of the cursor.
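Continuing the illustrative sketch, the highlighting behavior described in this example could be modeled as a cursor over the dictated words, with the Volume Up button highlighting the next word and the Volume Down button highlighting the previous word; the class and method names below are assumptions made for illustration only.

```kotlin
// Hypothetical word-highlighting behavior for the volume buttons of Bluetooth-enabled headphones.
class DictationHighlighter(text: String) {
    private val words = text.split(" ").toMutableList()
    private var cursor = 0                 // index of the currently highlighted word

    // Volume Up: highlight the next word relative to the cursor.
    fun highlightNext(): String {
        cursor = (cursor + 1).coerceAtMost(words.lastIndex)
        return words[cursor]
    }

    // Volume Down: highlight the previous word relative to the cursor.
    fun highlightPrevious(): String {
        cursor = (cursor - 1).coerceAtLeast(0)
        return words[cursor]
    }

    // Replace the highlighted word with newly dictated text received from the microphone.
    fun replaceHighlighted(newText: String) {
        words[cursor] = newText
    }

    fun text() = words.joinToString(" ")
}

fun main() {
    val highlighter = DictationHighlighter("who is interested in going out to lunch today")
    highlighter.highlightNext()            // highlights "is"
    highlighter.replaceHighlighted("was")  // replaces the highlighted word
    println(highlighter.text())            // who was interested in going out to lunch today
}
```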
The various buttons (for example, the touch-sensitive button 103 and/or the volume control button 402) associated with the computing device 100 may be remapped as follows:
The range of highlighting actions may include the following:
At block 620, the method 600 includes determining, with the electronic processor 202, whether the computing device 100 and, more particularly, the electronic processor 202 is in at least one of a voice-recognition state and a playback state. In the voice-recognition state, the computing device 100 is configured to receive dictated text and present the dictated text to the visual user interface 112 to be displayed on the display 102.
At block 640, the method 600 includes modifying, with the electronic processor 202, a function associated with a selection mechanism (for example, the input device 104) when the computing device 100 is in the voice-recognition state.
At block 660, the method 600 includes performing a first function using the selection mechanism. The first function includes moving a cursor associated with the dictated text to a new position and generating an audio output associated with the new position of the cursor. In one example, the first function includes replacing a selected portion of the dictated text with newly dictated text at the new position of the cursor. The first function may also include replacing a word at the new position of the cursor with a new word received using the microphone 105.
At block 680, the method 600 includes performing a second function, different from the first function, in response to selection of the input device 104 when dictated text is not displayed on the display 102 of the computing device 100. In one example, the second function includes controlling the volume of the audio output using the input device 104.
In one example, the method 600 includes receiving instructions using the microphone 105 to replace the selected portion of the dictated text with the newly dictated text. In another embodiment, the method 600 includes navigating the cursor in at least one of a forward direction and a backward direction to select a portion of the dictated text using the input device 104.
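For illustration, the blocks of the method 600 may be exercised in order using the hypothetical classes from the sketch following the summary above; none of the calls below reflect an actual device API.

```kotlin
// Hypothetical walk-through of blocks 620-680 using the illustrative DictationController sketch.
fun main() {
    val controller = DictationController()

    // Block 620: the device enters the voice-recognition state when dictation begins.
    controller.state = DeviceState.VOICE_RECOGNITION
    controller.appendDictation("Meet for lunch today")

    // Blocks 640 and 660: the selection mechanism is remapped; a key press now moves the
    // cursor and narrates its new position (the first function).
    controller.onKey(SelectionKey.VOLUME_UP)

    // Block 680: with no dictated text displayed, the same key controls volume (the second function).
    controller.state = DeviceState.IDLE
    controller.onKey(SelectionKey.VOLUME_UP)
}
```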
In some embodiments, the narration controller 312 vocalizes the graphical and textual information associated with items 704, 706, 708, 710, 712, 714, 716, 718, 720, 722, and 724 in response to an input command (for example, using the input device 104) received from a user. In one example, the input command includes an audio command that may be received using the microphone 105.
One example of outputting implicit audio narration is provided below.
Example A
Timestamp: Friday, October 28th, 2016
Sender: Frank, <frank@example.com>
Receiver: you, Carol Smith <carol@example.com>, Jim <jim@example.com>, Arnold <Arnold@example.com>, Bob <bob@example.com>
Subject: Meet for lunch today?
Message body: Hey all, who is interested in going out to lunch today?
The narration information generated from the various fields associated with the email shown above in Example A is as follows:
Time: On Friday (assuming the time stamp is within the last 7 days)
Sender: Frank
Verb: asked
Direct object: none
Subject: “Meet for lunch today”
The implicit audio narration information that may be generated for the above email is given below:
On Friday, Frank asked, “Meet for lunch today?”
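By way of illustration, the narration fields above could be composed into the spoken sentence as sketched below; the field names and formatting rules are assumptions made for illustration and are not the actual narration logic.

```kotlin
// Hypothetical composition of implicit narration from extracted email fields.
data class NarrationInfo(
    val time: String,          // for example, "On Friday" when the timestamp is within the last 7 days
    val sender: String,        // for example, "Frank"
    val verb: String,          // for example, "asked"
    val directObject: String?, // none in this example
    val subject: String        // for example, "Meet for lunch today?"
)

fun composeNarration(info: NarrationInfo): String {
    val obj = info.directObject?.let { " $it" } ?: ""
    return "${info.time}, ${info.sender} ${info.verb}$obj, \"${info.subject}\""
}

fun main() {
    val info = NarrationInfo("On Friday", "Frank", "asked", null, "Meet for lunch today?")
    println(composeNarration(info))   // On Friday, Frank asked, "Meet for lunch today?"
}
```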
In one example, the input device 104 may be used to move a cursor within the implicit narration information "On Friday, Frank asked, 'Meet for lunch today?'" to select a portion of the implicit narration information for replay.
In some embodiments, software described herein may be executed by a server, and a user may access and interact with the software application using a portable communication device. Also, in some embodiments, functionality provided by the software application as described above may be distributed between a software application executed by a user's portable communication device and a software application executed by another electronic processor or device (for example, a server) external to the portable communication device. For example, a user can execute a software application (for example, a mobile application) installed on his or her smart device, which may be configured to communicate with another software application installed on a server.
In the foregoing specification, specific embodiments have been described. However, one of ordinary skill in the art appreciates that various modifications and changes may be made without departing from the scope of the invention as set forth in the claims below. Accordingly, the specification and figures are to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of present teachings.
Moreover, in this document, relational terms such as first and second, top and bottom, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. The terms "comprises," "comprising," "has," "having," "includes," "including," "contains," "containing" or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises, has, includes, contains a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. An element preceded by "comprises . . . a," "has . . . a," "includes . . . a," or "contains . . . a" does not, without more constraints, preclude the existence of additional identical elements in the process, method, article, or apparatus that comprises, has, includes, contains the element. The terms "a" and "an" are defined as one or more unless explicitly stated otherwise herein. A device or structure that is "configured" in a certain way is configured in at least that way, but may also be configured in ways that are not listed.