In a variety of situations, audio/visual (A/V) devices can be used in human-to-human, human-to-machine, and machine-to-human interactions. Example A/V devices include audio headsets, augmented reality/virtual reality (AR/VR) headsets, voice over Internet Protocol (VoIP) devices, video conference devices, etc.
Various examples will be described below by referring to the following figures.
Reference is made in the following detailed description to accompanying drawings, which form a part hereof, wherein like numerals may designate like corresponding and/or analogous parts throughout. It will be appreciated that the figures have not necessarily been drawn to scale, such as for simplicity and/or clarity of illustration.
A/V devices may be usable to transmit audio between users, and may also be used to transmit audio between a user and a computing device. For example, an audio headset may be used by a user to give verbal commands to a digital assistant of a computing device. If an A/V device is used substantially concurrently for various human-to-human, human-to-machine, and/or machine-to-human interactions, there may be challenges in directing the respective audio signals to a desired recipient. For instance, a verbal command intended for a digital assistant of a computing device may also be unintentionally directed to other participants on a teleconference.
At times, therefore, interaction among users of computing devices occurs verbally. For example, electronic voice calls (e.g., VoIP), video conferencing, etc. are frequently used in business and personal communications. At times, interactions between a particular user and a computing device can also occur verbally. For instance, speech recognition tools such as DRAGON DICTATION by NUANCE COMMUNICATIONS, INC., and WINDOWS SPEECH RECOGNITION included in WINDOWS operating systems (by MICROSOFT CORPORATION) allow access to functionality of computing devices without necessarily using peripheral devices, such as a keyboard or a mouse. Furthermore, digital assistants, such as CORTANA from MICROSOFT CORPORATION, SIRI from APPLE, INC., GOOGLE ASSISTANT from GOOGLE, INC., and ALEXA from AMAZON.COM, INC., provide a number of ways for computing devices to interact with users, and users with computing devices, using verbal commands. Nevertheless, verbal interactions are at times complemented by legacy interactive approaches (e.g., keyboard and mouse), such as to interact with a computing device or a digital assistant of an electronic device.
In the context of verbal commands, it may be a challenge to direct verbal commands to a desired recipient. For instance, directing voice commands to an electronic voice call or video conference call versus a digital assistant may present challenges. In one case, while on a video conference call, a user may attempt to invoke CORTANA using a verbal command (e.g., “Hey Cortana”). Signals encoding the voice command may be received both by the program running the video conference call and by the CORTANA program running in the background. Consequently, participants of the video conference call may hear the “Hey Cortana” command before CORTANA mutes audio input (e.g., from the microphone) into the video conference. Voice commands while on a voice or video call may thus be distracting or otherwise undesirable. However, in cases where the user happens to be far away from the keyboard or mouse (or the keyboard and mouse are otherwise unavailable), voice commands may be the most expedient method for accessing CORTANA. There may be a desire, therefore, for a method of directing voice commands to a desired recipient program. There may also be a desire to direct voice commands without necessarily installing an application on the computing device (e.g., a program for handling the direction of audio signals to programs of the computing device). For instance, there may be a desire to limit the applications or programs installed on a computing device.
In the following, transmission of verbal and non-verbal commands using an A/V device is discussed. As used herein, an A/V device is a device that can receive or transmit audio or video signals or that can receive audio or video signals from a user. One example A/V device is a head-mounted A/V device, such as an audio headset that may be used in conjunction with a computing device. As used herein, a computing device refers to a device capable of performing processing, such as in response to executed instructions. Example computing devices include desktop computers, workstations, laptops, notebooks, tablets, mobile devices, and smart televisions, among other things.
It may be possible to instruct computing devices to perform desired functionality or operation by sending a command. Commands refer to instructions that, when received by an appropriate computer-implemented program or operating system, are associated with a desired functionality or operation. In response to the command, the computing device will initiate performance of the desired functionality or operation.
The present discussion distinguishes between verbal and non-verbal commands. Verbal commands refer to commands transmitted to a computing device via sound waves, such as sound waves comprising speech. Non-verbal commands are commands given other than via sound waves. Mouse movements, clicks, keyboard keypresses, and gestures are non-limiting examples of non-verbal commands.
Returning to the challenge of directing audio signals, in one case it may be possible to use a non-verbal command to facilitate direction of subsequent verbal commands to a desired program. For instance, one method for directing non-verbal commands to a computing device using an A/V device may comprise use of an actuator of the A/V device and a processor to convert a first command to a second command. In one case, an A/V device may include a processor to convert signals representing a non-verbal command in one form (e.g., a button press) into signals representing a non-verbal command in a second form (e.g., a keyboard keypress or combination of keypresses). The signals representing the non-verbal command may be such that they may be received and/or interpreted by a computing device without additional software.
To illustrate, for some computing devices running a WINDOWS operating system (e.g., WINDOWS 10), putting CORTANA in listening mode may be done using the keyboard keypress combination (e.g., a shortcut) of the Windows key plus ‘C’. It may be desirable to input this keyboard keypress combination using an A/V device (e.g., a headset) to provide a non-verbal command (e.g., such as to put CORTANA in listening mode) to a computing device. Subsequently, verbal commands may be provided using the A/V device.
Therefore, in one example case, if a user is in an audio or video call, the button press may allow the user to provide a verbal command without that command being heard by other participants in the audio or video call. That is, a non-verbal command may be used to assist in directing a subsequent verbal command to a desired recipient program (e.g., a digital assistant). And by converting a first non-verbal command to a second non-verbal command at the A/V device, additional software for directing verbal commands at the computing device may be avoided. Subsequent to the verbal commands to the recipient program, audio signals may again be provided to the audio or video call.
Processor 105 comprises hardware, such as an integrated circuit (IC) or analog or digital circuitry (e.g., transistors), or a combination of software (e.g., programming such as machine- or processor-executable instructions, commands, or code such as firmware, a device driver, object code, etc.) and hardware. Hardware includes a hardware element with no software elements, such as an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), etc. A combination of hardware and software includes software hosted at hardware (e.g., a software module that is stored at a processor-readable memory such as random access memory (RAM), a hard disk or solid state disk, resistive memory, or optical media such as a digital versatile disc (DVD), and/or executed or interpreted by a processor), or hardware and software hosted at hardware.
In one case, processor 105 may be used to convert non-verbal commands. For instance, processor 105 may be capable of executing instructions, such as may be stored in a memory of A/V device 100 (not shown), to convert a first non-verbal command into a second non-verbal command. In one case, for example, processor 105 may be capable of consulting a mapping of non-verbal commands in order to perform a conversion of non-verbal commands. For instance, a look-up table may be used in order to convert a particular non-verbal command, such as a button press, into a second non-verbal command, such as a keyboard keypress combination. Processor 105 may also enable transmission of the second non-verbal command to a computing device. In one case, processor 105 may transmit the second non-verbal command (e.g., in the form of digital signals) via an interface module (e.g., comprising an output port).
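By way of illustration, the following is a minimal sketch, in C, of how firmware of an A/V device might implement such a look-up table. The type names, the particular actuator events, and the convert_command() function are illustrative assumptions rather than details taken from the present disclosure; the key codes follow standard USB HID keyboard conventions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical identifiers for first non-verbal commands (actuator events). */
typedef enum {
    BUTTON_ASSISTANT = 0,   /* e.g., a dedicated digital-assistant button */
    BUTTON_MUTE      = 1,
} button_event_t;

/* A second non-verbal command expressed as a keyboard keypress:
 * a modifier bitmask plus a key usage code, per USB HID conventions. */
typedef struct {
    uint8_t modifiers;  /* e.g., 0x08 = Left GUI (Windows) key */
    uint8_t keycode;    /* e.g., 0x06 = 'c' in the HID usage table */
} key_combo_t;

/* Look-up table mapping actuator events to keypress combinations. */
static const key_combo_t command_map[] = {
    [BUTTON_ASSISTANT] = { .modifiers = 0x08, .keycode = 0x06 }, /* Win+C */
    [BUTTON_MUTE]      = { .modifiers = 0x00, .keycode = 0x7F }, /* Mute  */
};

/* Convert a first non-verbal command into a second non-verbal command.
 * Returns 0 on success, -1 if the event has no mapping. */
int convert_command(button_event_t ev, key_combo_t *out)
{
    if ((size_t)ev >= sizeof command_map / sizeof command_map[0])
        return -1;
    *out = command_map[ev];
    return 0;
}
```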
Actuator 115 comprises a component capable of enabling generation of signals indicative of a non-verbal command. In one example case, actuator 115 may comprise a button, and a signal may be generated when the button is actuated. For instance, the button may act as a switch, actuation of which may close a circuit and transmit a signal to processor 105. Actuator 115 may comprise other components, such as sliders or toggles, by way of non-limiting example. In another example, actuator 115 may comprise a plurality of actuators.
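As a sketch of how such an actuation might be detected in firmware, the following polls a button with simple debouncing, so that mechanical contact bounce is not reported as repeated actuations. The gpio_read_button() and delay_ms() calls are hypothetical board-support placeholders, not functions from the present disclosure.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical board-support calls; placeholders for the device's hardware. */
extern bool gpio_read_button(void);  /* true while the switch circuit is closed */
extern void delay_ms(uint32_t ms);

/* Report a press only if the contact still reads closed after a short
 * settling delay, filtering out mechanical contact bounce. */
bool button_pressed(void)
{
    if (!gpio_read_button())
        return false;
    delay_ms(5);   /* wait out contact bounce */
    return gpio_read_button();
}
```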
In an example case in which A/V device 100 comprises a headset, operation thereof may comprise actuation of actuator 115. Actuator 115 may comprise a button, and actuation thereof may comprise pressing the button. Responsive to actuation, signals may be transmitted from actuator 115 to processor 105. The transmitted signals may be indicative of a first non-verbal command. The first non-verbal command may comprise, for example, a button press.
Processor 105 may convert the first non-verbal command to a second non-verbal command. For instance, in one case, actuation of actuator 115 may be mapped to a particular keyboard keypress, and processor 105 may transmit signals representative of the keyboard keypress, which may be a second non-verbal command, such as to a computing device. If A/V device 100 is connected to a computing device via a USB connection, then the signals representative of the second non-verbal command may be transmitted via data out 110 to a computing device, as illustrated by signals 120. Using a USB-based mode of transmission, signals may be transmitted between data out 110 and a USB component of the computing device. If A/V device 100 is connected to a computing device via a wireless connection, such as a BLUETOOTH connection, then the signals representative of the second non-verbal command may be transmitted wirelessly via data out 110 to a computing device, such as illustrated by signals 120.
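To picture the USB case concretely, a standard HID “boot keyboard” input report is eight bytes: a modifier bitmask, a reserved byte, and up to six concurrent key usage codes. The following sketch builds and sends the report for the Windows key plus ‘C’; the send_report() function is a placeholder for whichever USB stack the device firmware actually uses.

```c
#include <stdint.h>
#include <string.h>

/* Placeholder for the device's USB stack; assumed, not from the disclosure. */
extern void send_report(const uint8_t *report, uint8_t len);

/* Send Win+C as a standard 8-byte HID boot-keyboard report:
 * byte 0: modifier bitmask, byte 1: reserved, bytes 2-7: key usage codes. */
void send_win_c(void)
{
    uint8_t report[8] = {0};
    report[0] = 0x08;   /* Left GUI (Windows) modifier bit */
    report[2] = 0x06;   /* HID usage code for 'c' */
    send_report(report, sizeof report);

    /* Follow with an all-zero report so the host sees the keys released. */
    memset(report, 0, sizeof report);
    send_report(report, sizeof report);
}
```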
As noted above, conversion of the first non-verbal command to the second non-verbal command may comprise referring to a look-up table or may comprise consulting a user-programmable mapping between non-verbal commands, by way of example. To illustrate, several example mappings could include a mapping of actuation of actuator 115 to a keyboard keypress for putting CORTANA in listening mode (e.g., Windows+C), a keyboard keypress for initiating DRAGON DICTATION “press-to-talk” mode (e.g., the ‘0’ key on the number pad), a keyboard keypress for putting SIRI into listening mode (e.g., command+space), etc. It should be noted, however, that the present disclosure is not limited to mappings to keyboard keypresses. Indeed, potential mappings could include actuator-to-mouse clicks or gestures, actuator-to-pen swipes or gestures, or actuator-to-touch touches or gestures, by way of non-limiting example. Therefore, as should be appreciated, a number of possible implementations of non-verbal command conversion are contemplated by the present disclosure.
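A user-programmable mapping might be realized by keeping the table in writable memory and exposing an update operation, as in the following hedged sketch. The entry layout and the set_mapping() function are assumptions for illustration, corresponding to the change-the-mapping behavior described later in this disclosure.

```c
#include <stddef.h>
#include <stdint.h>

/* A user-programmable mapping entry: which actuator, and the keypress it
 * should be converted to. Field layout is an illustrative assumption. */
typedef struct {
    uint8_t actuator_id;
    uint8_t modifiers;   /* e.g., 0x08 = Left GUI (Windows) */
    uint8_t keycode;     /* USB HID usage code */
} mapping_entry_t;

/* Defaults corresponding to two of the example mappings above. */
static mapping_entry_t mappings[] = {
    { 0, 0x08, 0x06 },   /* actuator 0 -> Win+C (CORTANA listening mode)    */
    { 1, 0x00, 0x62 },   /* actuator 1 -> keypad '0' (DRAGON press-to-talk) */
};

/* Update a mapping, e.g., in response to configuration signals received
 * from a connected computing device. Returns 0 on success. */
int set_mapping(uint8_t actuator_id, uint8_t modifiers, uint8_t keycode)
{
    for (size_t i = 0; i < sizeof mappings / sizeof mappings[0]; i++) {
        if (mappings[i].actuator_id == actuator_id) {
            mappings[i].modifiers = modifiers;
            mappings[i].keycode   = keycode;
            return 0;
        }
    }
    return -1;   /* no such actuator */
}
```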
Audio out 204 may comprise a speaker capable of converting audio signals into sound waves. For instance, arrow 220b illustrates sound waves exiting A/V device 200, such as generated by audio out 204. In one example case, audio signals, such as from a program running on computing device 225, may be transmitted to A/V device 200 (e.g., represented by signals indicated by arrow 220c) and received by audio out 204. The audio signals may be converted into sound waves by audio out 204, such as by an electrostatic transducer, by way of non-limiting example. To illustrate with an example, audio signals may be transmitted from computing device 225, and audio out 204 may convert the audio signals into sound waves, such as may be audibly perceptible to a user.
Actuator 215 and processor 205 may operate similarly to actuator 115 and processor 105 in FIG. 1.
In operation, one implementation of A/V device 200 may enable transmission of audio signals between A/V device 200 and computing device 225. For instance, A/V device 200 may be used as part of an audio call in which audio signals are transmitted to computing device 225 and audio signals are received from computing device 225. At a point during the audio call, actuator 215 may be actuated, and signals indicative of a first non-verbal command (e.g., a button press) may be transmitted to processor 205. Processor 205 may convert the signals indicative of the first non-verbal command into signals indicative of a second non-verbal command (e.g., a keyboard keypress). For example, processor 205 may map the first non-verbal command to a second non-verbal command, such as a keyboard shortcut keypress combination to launch (or put into a listening mode) a digital assistant on computing device 225. The second non-verbal command may be transmitted to the computing device for handling by a computer-executed program. Subsequently, verbal commands may be given to the digital assistant without necessarily sending the verbal commands to participants of the audio call.
As should be appreciated, a number of possible implementations of A/V device 200 may be realized consistently with the foregoing discussion. For example, in addition to the example of an audio headset usable for audio interactions with a computing device, AR/VR headsets, smart TVs and remotes, smart speakers and smart speaker systems, etc. may operate in a similar fashion consistent with the present disclosure.
In the present disclosure, computing device 325 may have an I/O component, I/O 340, which, similar to I/O 210 in FIG. 2, may enable transmission and receipt of signals (e.g., audio signals and signals indicative of non-verbal commands) between computing device 325 and A/V device 300.
Computing device 325 may operate in relation to A/V device 300 similarly to operation of computing device 225 in relation to A/V device 200. For example, computing device 325 may receive audio signals from and transmit audio signals to A/V device 300. Processor 330 of computing device 325 may be similar in function to processor 205 of FIG. 2.
An actuator of A/V device 300 (e.g., actuator 215 of FIG. 2) may be actuated to generate signals indicative of a first non-verbal command, which may be converted to a second non-verbal command (e.g., a keyboard keypress) and transmitted to computing device 325.
In addition, an audio manager program or application may be running on computing device 325 in order to enable muting particular audio signal lines, such as to and from other programs. For example, the audio manager program may be capable of determining whether any programs or applications running on computing device 325 are receiving or transmitting audio signals. And, in response to actuation of an actuator of A/V device 300, the audio manager program may mute a particular audio signal line. For instance, if an audio call (e.g., VoIP) is conducted using a program of computing device 325, in response to actuation of an actuator of A/V device 300, the audio manager program may mute an audio signal from a microphone of A/V device 300 as transmitted to the program running the audio call. Of course, there may still be a desire to use audio signals from the microphone in other programs (e.g., with a digital assistant running on computing device 325), and thus, muting of one audio signal line may not necessarily mute that line for all programs. For instance, audio signals from the microphone may be desired in order to interact with a digital assistant, but may also be muted as to another program (e.g., a video conference program).
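The selective-muting behavior can be illustrated with a small, self-contained C sketch. The audio_session_t model below is a toy stand-in for whatever per-program session interface the host platform actually provides, and the program names are hypothetical; the point is only that microphone input is muted per program rather than globally.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Toy model of per-program audio sessions; a real audio manager would
 * enumerate these through a platform API. Names are assumptions. */
typedef struct {
    const char *program;      /* e.g., a video conference application */
    bool        capture_muted;
} audio_session_t;

/* Mute microphone input to every program except the named recipient, so a
 * verbal command reaches the digital assistant but not call participants. */
void direct_microphone_to(audio_session_t *sessions, int n,
                          const char *recipient_program)
{
    for (int i = 0; i < n; i++)
        sessions[i].capture_muted =
            strcmp(sessions[i].program, recipient_program) != 0;
}

int main(void)
{
    audio_session_t sessions[] = {
        { "videoconf.exe", false },   /* hypothetical program names */
        { "assistant.exe", false },
    };
    direct_microphone_to(sessions, 2, "assistant.exe");
    for (int i = 0; i < 2; i++)
        printf("%s: capture %s\n", sessions[i].program,
               sessions[i].capture_muted ? "muted" : "open");
    return 0;
}
```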
Another example program running on either processor 330 of FIG. 3 or a processor of A/V device 300 may enable changing the mapping between non-verbal commands, such as responsive to user input (e.g., to map actuation of an actuator to a different keyboard keypress), consistent with the user-programmable mapping discussed above.
In one implementation, computing device 325 may recognize A/V device 300 as two distinct devices to enable transmission of a converted non-verbal command (e.g., such as in response to installation of a driver for A/V device 300). For instance, A/V device 300 may be recognized as both an audio headset and a USB keyboard. Thus, in the case of a USB device, actuation of an actuator of the headset may be converted, and the converted signals may be sent to, and received by, computing device 325 as though from a USB keyboard.
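One common way for a single physical device to enumerate as two devices is as a USB composite device whose configuration exposes both an audio-class interface and a HID keyboard interface. The abbreviated descriptor sketch below shows only the fields relevant to that behavior; the class, subclass, and protocol codes are the standard USB values, while the overall layout is illustrative and omits the endpoint, class-specific, and configuration descriptors a complete device would need.

```c
#include <stdint.h>

/* Standard USB interface descriptor layout (bLength = 9, type 4). */
struct usb_interface_desc {
    uint8_t bLength, bDescriptorType, bInterfaceNumber, bAlternateSetting;
    uint8_t bNumEndpoints, bInterfaceClass, bInterfaceSubClass;
    uint8_t bInterfaceProtocol, iInterface;
};

/* Interface 0: the headset's audio function. */
static const struct usb_interface_desc audio_if = {
    .bLength = 9, .bDescriptorType = 4, .bInterfaceNumber = 0,
    .bInterfaceClass    = 0x01,   /* USB Audio class */
    .bInterfaceSubClass = 0x02,   /* Audio Streaming */
};

/* Interface 1: the same device's HID keyboard function, which carries
 * the converted non-verbal commands (keyboard keypress reports). */
static const struct usb_interface_desc hid_keyboard_if = {
    .bLength = 9, .bDescriptorType = 4, .bInterfaceNumber = 1,
    .bNumEndpoints      = 1,
    .bInterfaceClass    = 0x03,   /* HID class       */
    .bInterfaceSubClass = 0x01,   /* boot interface  */
    .bInterfaceProtocol = 0x01,   /* keyboard        */
};
```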
Turning to FIG. 4, an example method for directing non-verbal commands using an A/V device is illustrated. At block 405, signals indicative of a first non-verbal command (e.g., actuation of an actuator) may be received, such as by a processor of an A/V device (e.g., A/V device 100 of FIG. 1).
Returning to FIG. 4, at block 410, the first non-verbal command is converted into a second non-verbal command, such as by consulting a look-up table or a user-programmable mapping between non-verbal commands, as discussed above.
At block 415, the converted non-verbal command is transmitted to a computing device. As discussed above, the converted non-verbal command may be representative of a keyboard keypress. Signals representative of the converted non-verbal command may be transmitted via a wired or wireless connection with the computing device. In one example case, the signals may be transmitted between an A/V device and a computing device via a USB connection.
Consistent with the foregoing example, at a first block 505 of FIG. 5, signals representing a keyboard keypress may be received from an A/V device. In one example case, the signals may be indicative of a Windows+C keyboard keypress combination. In response to the received signals, an audio management program may mute audio signals to and/or from a program, such as shown at block 510.
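As one hedged illustration of how a host-side program might observe such a keypress, the Win32 sketch below registers Win+C as a global hotkey and reacts when it arrives. In practice this particular combination is typically claimed by the operating system for CORTANA, so registration may fail and a real audio manager might watch a different shortcut or use a keyboard hook instead; the mute_conference_audio() stub merely stands in for the muting of block 510.

```c
#include <stdio.h>
#include <windows.h>

/* Stub standing in for the muting of block 510; a real audio manager
 * would adjust per-program audio sessions here. */
static void mute_conference_audio(void)
{
    printf("muting microphone input to the conference program\n");
}

int main(void)
{
    /* Ask Windows to post WM_HOTKEY to this thread when Win+C is pressed.
     * If the combination is already owned (e.g., by CORTANA), this fails. */
    if (!RegisterHotKey(NULL, 1, MOD_WIN, 'C')) {
        fprintf(stderr, "Win+C is already registered by another program\n");
        return 1;
    }

    MSG msg;
    while (GetMessage(&msg, NULL, 0, 0) > 0) {
        if (msg.message == WM_HOTKEY)
            mute_conference_audio();
    }
    return 0;
}
```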
As discussed above, therefore, an A/V device (e.g., A/V device 100 in FIG. 1, A/V device 200 in FIG. 2, or A/V device 300 in FIG. 3) may be used to convert a first non-verbal command (e.g., actuation of an actuator) into a second non-verbal command (e.g., a keyboard keypress) and to transmit the second non-verbal command to a computing device, such as to facilitate directing subsequent verbal commands to a desired recipient program.
For instance, one implementation of an A/V headset device includes an actuator to transmit signals corresponding to a first non-verbal command to a processor of the A/V headset device. In response to the signals corresponding to the first non-verbal command, the processor is to convert the first non-verbal command to a second non-verbal command. The processor is also to transmit the second non-verbal command to a computing device.
At times, the first non-verbal command comprises actuation of the actuator. In some cases, the second non-verbal command may comprise a keyboard keypress. For instance, the keyboard keypress may comprise Windows+C.
In some cases, the A/V headset device may be connected to the computing device via a wired connection, such as a USB connection. In other cases, the A/V headset device may be connected to the computing device via a wireless connection, such as a BLUETOOTH connection.
In another implementation, an A/V headset device comprises an actuator and a processor. The processor is to, in response to actuation of the actuator, convert a first signal corresponding to a first non-verbal command to a second signal corresponding to a second non-verbal command. The second signal represents a keyboard keypress in this case.
In one case, conversion of the first signal to the second signal is based on a mapping of the keyboard keypress to actuation of the actuator. In one case, the processor is further to change the mapping responsive to signals from a computing device connected to the A/V headset device.
In yet another implementation, a non-transitory computer readable medium comprises instructions that when executed by a processor of a computing device are to cause the computing device to receive a signal corresponding to a keyboard keypress from an A/V device, and mute audio output signals, audio input signals, or a combination thereof of a computer executable program running on the computing device.
In one case, the instructions are also to cause the computing device to determine default audio output devices, audio input devices, or a combination thereof of the computing device. In another case, the muting of audio output signals, audio input signals, or the combination thereof is based on the determined default audio output devices, audio input devices, or the combination thereof. In yet another case, the instructions are to cause the computing device to receive signals corresponding to a mapping of an actuator of the A/V device to the keyboard keypress, and transmit signals corresponding to an updated mapping to the A/V device.
In the preceding description, various aspects of claimed subject matter have been described. For purposes of explanation, specifics, such as amounts, systems and/or configurations, as examples, were set forth. In other instances, well-known features were omitted and/or simplified so as not to obscure claimed subject matter. While certain features have been illustrated and/or described herein, many modifications, substitutions, changes and/or equivalents will be apparent to those skilled in the art. It is, therefore, to be understood that the appended claims are intended to cover all modifications and/or changes as fall within claimed subject matter.
Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/US2017/050461 | 9/7/2017 | WO | 00