The present invention relates generally to computer systems, and specifically to a multi-modal input system for a computer system.
As the range of activities accomplished with a computer increases, new and innovative ways to provide an interface with a computer are often developed to complement the changes in computer functionality and packaging. For example, touch sensitive screens can allow a user to provide inputs to a computer without a mouse and/or a keyboard, such that desk area is not needed to operate the computer. Examples of touch sensitive screens include pressure sensitive membranes, beam break techniques with circumferential light sources and sensors, and acoustic ranging techniques. However, these types of computer interfaces can only provide information to the computer regarding the touch event itself, and thus can be limited in application. Traditional computer input devices can also be time-consuming to operate, particularly in computing applications that require rapid responses to changes in feedback information provided to one or more users via a display system. Furthermore, large computing environments can require inputs from disparate sources and/or concurrent control.
One example includes a computer system. The system includes a plurality of ports that each receive signals corresponding to an interface input associated with user physical interaction provided via an interface device in one of a plurality of disparate input modes. A multi-modal input system maps an interface input associated with one of the ports, provided in a given one of the disparate input modes, into a computer input command, maps an interface input associated with another of the ports, provided in another one of the disparate input modes, into another computer input command, and aggregates the computer input commands into a multi-modal event command. A processor executes a single predetermined function associated with the computer system in response to the multi-modal event command. Thus, the processor is configured to execute the single predetermined function in response to user physical interaction provided in at least two of the disparate input modes.
Another example includes a method for providing input to a computer system. The method includes converting a first physical input action provided by a user in a first input mode via a first interface device into a first interface input based on a first application programming interface (API) associated with the first interface device and mapping the first interface input to a first computer input command associated with a native schema of the computer system. The method also includes converting a second physical input action provided by the user in a second input mode via a second interface device into a second interface input based on a second API associated with the second interface device and mapping the second interface input to a second computer input command associated with the native schema of the computer system. The method further includes aggregating the first computer input command and the second computer input command into a multi-modal event command and executing a single predetermined function associated with the computer system in response to the multi-modal event command.
Another example includes a computer system. The system includes a plurality of ports that are each coupled to one of a respective plurality of interface devices configured to receive user physical interaction provided in one of a respective plurality of disparate input modes. The system also includes a multi-modal input system. The multi-modal input system includes a deconfliction engine configured to convert the user physical interaction associated with each of the plurality of interface devices into a respective plurality of interface inputs via a plurality of APIs associated with the respective plurality of interface devices, and to map the plurality of interface inputs into a respective plurality of computer input commands associated with a native schema of the computer system. The multi-modal input system also includes a multi-modal command aggregation controller configured to aggregate the plurality of computer input commands into a multi-modal event command based on comparing a timer value associated with a modality timer with a predetermined threshold timer value. The system further includes a processor configured to execute a single predetermined function associated with the computer system in response to the multi-modal event command, such that the processor executes the single predetermined function in response to user physical interaction provided in at least two of the plurality of disparate input modes.
The present invention relates generally to computer systems, and specifically to a multi-modal input system for a computer system. The multi-modal input system can provide a more intuitive manner of providing inputs to a computer system using a combination of different input modes. As described herein, the term “input mode” refers to a specific manner of providing inputs to a computer through human interaction with the computer system. Examples of input modes include traditional computer inputs, such as keyboard, mouse, and touch-screen inputs, as well as non-traditional computer inputs, such as voice inputs, gesture inputs, head and/or eye movement, laser inputs, radio-frequency inputs, or a variety of other ways of providing an input to a computer system. The multi-modal input system can be configured to recognize human (i.e., “user”) interaction (e.g., a physical input action) with an interface device to provide a respective interface input. The interface input can be generated, for example, from an application programming interface (API) that is preprogrammed to translate specific human interactive actions into respective specific inputs corresponding to specific functions. The interface input can then be translated into a computer input command associated with a native schema of the computer system via command mapping adapters. As described herein, the term “native schema” corresponds to machine language understood by the computer system, such that the computer input commands are understood by the software and/or firmware of the computer system to implement specific respective functions. Thus, the interface inputs can implement the specific functions of the computer system by mapping the human interaction into the interface input via the respective API and by mapping the interface input into the computer input command understood by the computer system via the command mapping adapters.
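Purely as an illustration of this two-stage translation, and not as part of the described system, the following Python sketch maps a recognized physical action to an interface input via a preprogrammed API table and then maps the interface input to a native-schema command via a mapping adapter; all names and command codes are hypothetical assumptions:

    # Stage 1: a per-device API translates recognized actions into interface inputs.
    VOICE_API = {"zoom in": "VOICE_ZOOM_IN", "pan left": "VOICE_PAN_LEFT"}
    # Stage 2: a command mapping adapter translates interface inputs into
    # native-schema computer input commands.
    VOICE_ADAPTER = {"VOICE_ZOOM_IN": 0x01, "VOICE_PAN_LEFT": 0x02}

    def translate(recognized_action):
        interface_input = VOICE_API.get(recognized_action)
        if interface_input is None:
            return None  # action not preprogrammed in the API
        return VOICE_ADAPTER.get(interface_input)

    print(translate("zoom in"))  # -> 1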
The multi-modal input system can thus generate a multi-modal event command that is an aggregation of two or more computer input commands that are provided to the computer system in different input modes, with the multi-modal event command corresponding to implementation of a separate respective function for the computer system. As an example, the multi-modal event command can be generated to implement a specific function for the computer system that cannot quickly or easily be performed using a single computer input command corresponding to a single input mode. For example, the multi-modal event command can correspond to an aggregation of a voice input and a gesture input to provide a given command to the computer system to implement a specific function. As an example, the multi-modal event command can be a discrete multi-modal event command corresponding to a single function implementation. As another example, the multi-modal event command can correspond to an activation command to initiate a sustained input event, such that additional computer input commands can be provided via one or more of the input modes for the duration of the sustained input event, with each such computer input command corresponding to a single function implementation. Accordingly, the multi-modal input system can be configured to facilitate rapid and intuitive inputs to the computer system, such as for a computer system that controls a very large number of parameters (e.g., a federated mission management system).
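As a hedged illustration of such aggregation (the binding of command pairs to functions is implementation-specific; all names below are hypothetical), a voice command and a gesture command can together select a single function:

    def zoom_to(location):
        # Stand-in for the single predetermined function executed in response
        # to the multi-modal event command.
        print("zooming to", location)

    # One pair of disparate-mode commands binds to one function.
    MULTI_MODAL_BINDINGS = {("VOICE_ZOOM_IN", "GESTURE_POINT"): zoom_to}

    def dispatch(voice_cmd, gesture_cmd, gesture_payload):
        handler = MULTI_MODAL_BINDINGS.get((voice_cmd, gesture_cmd))
        if handler is not None:
            handler(gesture_payload)

    dispatch("VOICE_ZOOM_IN", "GESTURE_POINT", (34.05, -118.24))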
The computer input system 10 includes a plurality N of interface devices 18 that are plugged into a plurality N of ports 20 (“P1” through “PN”), where N is a positive integer. The interface devices 18 can each correspond to a device, collection of devices, station, or other type of hardware that is configured to provide a signal or signals in response to user physical interaction, and that can each be plugged into or otherwise coupled (e.g., wired or wirelessly) to separate respective ports of the multi-modal input system 22. As described herein, the terms “user physical interaction” and “user physical input action” are interchangeable. As an example, each of the interface devices 18 can correspond to a different input mode, and thus can each provide a separate manner of providing a signal or signals in response to user physical interaction. Examples of the input modes that can be employed by the interface devices 18 can include traditional computer inputs, such as keyboard, mouse, and touch-screen inputs that provide signals in response to movement of digits of the user in contact with hardware. Other types of input modes that can be employed by the interface devices 18 can include non-traditional computer inputs, such as voice inputs, gesture inputs, head and/or eye movement, laser inputs, radio-frequency inputs, or a variety of other ways of providing an input.
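For illustration only, a port table of the kind described above might associate each port with one device and one input mode; the device names below are assumptions, not taken from the description:

    PORTS = {
        "P1": {"device": "microphone", "mode": "voice"},
        "P2": {"device": "stereo_cameras", "mode": "gesture"},
        "PN": {"device": "keyboard_and_mouse", "mode": "pc"},
    }

    def mode_of(port):
        # Each port couples one interface device supplying one input mode.
        return PORTS[port]["mode"]

    print(mode_of("P2"))  # -> gesture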
The diagram 50 demonstrates an audial input device 52 corresponding to a microphone that is responsive to audial commands provided from the user. As an example, the audial commands can correspond to specific predetermined voice strings, such as one or more words spoken by the user into the audial input device 52. As another example, the audial commands can be sound effects provided by the user via the mouth (e.g., a “shush” sound) or via the body (e.g., a “clap” or “click” sound using one or more of the hands of the user). Thus, the audial input device 52 can provide signals corresponding to human interaction in the form of sound.
The diagram 50 also demonstrates a gesture input device 54 corresponding to a gesture recognition system that is responsive to gesture commands provided from the user. As an example, the gesture input device 54 can be configured to recognize hand-gestures provided by the user, such as gestures provided via the user's naked hand provided over a retroreflective background screen (e.g., in a touchless manner) based on a set of stereo cameras and light sources (e.g., infrared light). As another example, the user can provide the gestures using a sensor-glove or other powered input device (e.g., a stylus). As yet another example, the gesture input device 54 can be associated with other input modes, such as head-movement, shoulder shrugging, leg-movement, or other body motion. Thus, the gesture input device 54 can provide signals corresponding to human interaction in the form of hand gestures and/or body-language.
The diagram 50 also includes a controller input device 56 corresponding to a controller device that is responsive to hand manipulation provided from one or both hands of the user. The controller input device 56 can correspond to a multi-input device that includes both analog and digital controls that a user can manipulate via fingers and hands, such as including buttons, a flight-stick, a joystick, a touchpad, or any other input component on a controller. As an example, the controller input device 56 can be specifically designed for a given input application, such as a piloting controller that emulates a piloting controller of an actual aircraft. As another example, the controller input device 56 can correspond to any of a variety of third-party, off-the-shelf controllers that can be adapted for any of a variety of input purposes, such as console game-system controllers. Thus, the controller input device 56 can provide signals corresponding to human interaction in the form of button pressing and/or analog movements of a joystick or touchpad.
The diagram 50 also includes a personal computer (PC) input device 58 corresponding to any of a variety of typical PC interface devices that are responsive to hand manipulation provided from one or both hands of the user. The PC input device 58 can correspond to a keyboard, a mouse, or any other PC input device. As described herein, the PC input device 58 corresponds to a single input mode, regardless of the inclusion of multiple different types of input devices. Thus, the PC input device 58 can provide signals corresponding to human interaction in the form of button pressing of a keyboard and/or analog movements of a mouse.
The diagram 50 also includes a touch input device 60 corresponding to any of a variety of touchscreen interfaces that are responsive to hand manipulation provided from one or both hands of the user. The touch input device 60 can correspond to a touch-sensitive display screen that is arranged to display visual content and receive touch inputs. For example, the touch inputs can be provided via capacitive sensing, break-beam sensing, pressure sensing, or a variety of other touch-sensitive implementation methods. Thus, the touch input device 60 can provide signals corresponding to human interaction in the form of pressing a touch-sensitive screen.
The diagram 50 also includes a transmitter (“XMITTER”) input device 62 corresponding to a signal transmission device, such as a device that can be handheld by the user. For example, the transmitter input device 62 can include a laser-pointer and/or a radio-frequency (RF) transmitter device that can be tuned to be received by the computer system 12. For example, the laser-pointer can be directed at a specific photo-sensitive input of the computer system, or the RF transmitter device can be activated at a specific frequency, to provide an input to the computer system. Thus, the transmitter input device 62 can provide signals corresponding to human interaction in the form of button pressing to activate an optical or an RF signal that is received by the computer.
Referring back to the previous example, the interface APIs 26 are preprogrammed to translate the signal(s) corresponding to the specific human interactive actions from the respective interface devices 18 into respective interface inputs corresponding to specific functions associated with the computer system 12. Additionally, the deconfliction engine 24 can reject signal(s) that result from spurious actions provided through a respective interface device 18. For example, the deconfliction engine 24 can determine if signals from a gesture interface device (e.g., the gesture input device 54) or a voice interface device (e.g., the audial input device 52) of the interface devices 18 correspond to predetermined interface inputs via the interface APIs 26. Thus, the deconfliction engine 24 can be configured to discern unintended actions that can be provided via one or more of the interface devices 18 from intended physical input actions provided by the user via the interface APIs 26.
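One plausible sketch of such deconfliction is the following; it assumes a recognition-confidence criterion that is not stated in the description, and the threshold value and input names are hypothetical:

    PREDETERMINED_INPUTS = {"VOICE_ZOOM_IN", "GESTURE_POINT"}
    CONFIDENCE_THRESHOLD = 0.8   # assumed tuning parameter, not from the text

    def deconflict(candidate, confidence):
        # Accept only recognized, sufficiently confident actions; reject the
        # rest as spurious (unintended) actions.
        if candidate in PREDETERMINED_INPUTS and confidence >= CONFIDENCE_THRESHOLD:
            return candidate
        return None

    print(deconflict("VOICE_ZOOM_IN", 0.95))  # -> VOICE_ZOOM_IN
    print(deconflict("cough", 0.99))          # -> None (spurious)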
The memory 28 also includes a set of command mapping adapters 30 that are accessible by the deconfliction engine 24 to translate interface inputs into respective computer input commands associated with a native schema of the computer system 12. The command mapping adapters 30 can access a command repository 32 that is stored in the memory 28 and that includes the computer input commands associated with the native schema of the computer system 12.
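A minimal sketch of this second translation stage, assuming a simple dictionary-based repository and per-mode adapters (all names and command codes are hypothetical), might look like the following:

    # The repository holds native-schema command codes; each adapter maps one
    # mode's interface inputs onto repository entries.
    COMMAND_REPOSITORY = {"zoom_in": 0x01, "select_target": 0x02}

    ADAPTERS = {
        "voice": {"VOICE_ZOOM_IN": "zoom_in"},
        "gesture": {"GESTURE_POINT": "select_target"},
    }

    def map_to_native(mode, interface_input):
        key = ADAPTERS.get(mode, {}).get(interface_input)
        return COMMAND_REPOSITORY.get(key) if key else None

    print(map_to_native("voice", "VOICE_ZOOM_IN"))  # -> 1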
The multi-modal input system 22 also includes a multi-modal command aggregation controller 34. The multi-modal command aggregation controller 34 is configured to aggregate a plurality of the computer input commands provided via the command mapping adapters 30 into a multi-modal event command. As described herein, the term “multi-modal event command” describes a single computer input that is generated based on a combination of multiple interface inputs provided via a respective multiple different input modes. Therefore, the multi-modal event command is configured to implement a predetermined function associated with the computer system 12. As an example, the multi-modal event command can be a discrete multi-modal event, such that the multi-modal event command implements a single discrete command to the computer system 12 to implement a respective single function. As another example, the multi-modal event command can be an activation command to initiate a sustained input event, such that additional computer input commands provided via the interface devices 18 can implement respective functions with respect to the computer system 12 during the sustained input event, as described in greater detail herein.
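As an illustrative, entirely hypothetical representation of these two kinds of multi-modal event command, the following sketch models them as distinct types:

    from dataclasses import dataclass

    @dataclass
    class DiscreteEvent:
        # A discrete multi-modal event command: implements one function, once.
        function_id: int

    @dataclass
    class ActivationEvent:
        # An activation command: initiates a sustained input event during which
        # further commands implement functions one-for-one.
        session_id: int

    print(DiscreteEvent(function_id=1), ActivationEvent(session_id=7))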
The multi-modal input system 22 can also include an application interface layer 36 that is configured to provide the multi-modal event command, and thus the corresponding computer input commands, to the computer system 12 to implement the associated predetermined function.
In addition, the multi-modal input system 22 includes an API interface device 38. The API interface device 38 is configured as a programming interface to facilitate additional interface APIs 26, command mapping adapters 30, and computer input commands for the command repository 32. Therefore, a user can implement the API interface device 38 as a computer terminal, a graphical user interface (GUI) on a website, or as a plug-in port to install the additional interface APIs 26, command mapping adapters 30, and/or computer input commands for the command repository 32 to be stored in the memory 28. Accordingly, the multi-modal input system 22 can be scalable and customizable to allow for the addition of new and useful interface devices 18, or new and useful ways of converting human interaction into interface inputs using the interface devices 18.
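A minimal sketch of such a registration interface, assuming simple dictionary-backed storage (the function and names below are hypothetical, not the described API interface device 38 itself), is the following:

    REGISTERED_APIS = {}
    REGISTERED_ADAPTERS = {}

    def register_mode(mode, api, adapter):
        # Install a translation API and a mapping adapter for a new input mode.
        REGISTERED_APIS[mode] = api
        REGISTERED_ADAPTERS[mode] = adapter

    # Example: adding a hypothetical laser-pointer mode after deployment.
    register_mode("laser",
                  api={"dwell": "LASER_DWELL"},
                  adapter={"LASER_DWELL": 0x10})
    print(REGISTERED_ADAPTERS["laser"])  # -> {'LASER_DWELL': 16}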
The diagram 100 includes a plurality X of computer input commands 102 having been provided to the multi-modal command aggregation controller 34 from the deconfliction engine 24, where X is a positive integer. As an example, X can be two, such that the computer input commands 102 correspond to a combination of two interface inputs having been provided via a respective two of the interface devices 18 and converted to two respective computer input commands via the interface APIs 26 and the command mapping adapters 30, respectively. For example, a first of the computer input commands 102 can be a computer input command corresponding to a voice interface input generated at the audial input device 52 and a second of the computer input commands 102 can be a computer input command corresponding to a gesture interface input generated at the gesture input device 54.
The computer input commands 102 are provided to the multi-modal command aggregation controller 34. In the example described herein, the multi-modal command aggregation controller 34 includes a modality timer 104 that is configured to compare a timer value with a predetermined threshold time to determine whether computer input commands 102 received within the predetermined threshold time correspond to a multi-modal event command.
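One plausible sketch of such timer-based aggregation, assuming the window starts at the first received command and that two commands suffice to form a candidate event (the threshold value below is an arbitrary assumption), is the following:

    import time

    THRESHOLD_S = 1.5        # assumed predetermined threshold time (arbitrary)
    _pending = []
    _window_start = None

    def on_command(cmd):
        # Group commands arriving within the threshold window into one
        # candidate multi-modal event; restart the window when it expires.
        global _window_start
        now = time.monotonic()
        if _window_start is None or now - _window_start > THRESHOLD_S:
            _pending.clear()
            _window_start = now
        _pending.append(cmd)
        if len(_pending) >= 2:
            event = tuple(_pending)
            _pending.clear()
            _window_start = None
            return event     # candidate multi-modal event command
        return None

    on_command("VOICE_ZOOM_IN")
    print(on_command("GESTURE_POINT"))  # -> ('VOICE_ZOOM_IN', 'GESTURE_POINT')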
As an example, the multi-modal event command can correspond to a discrete multi-modal event command 106 corresponding to initiation of a single computer function of the computer system 12. Therefore, in response to executing a discrete multi-modal event command 106, the multi-modal command aggregation controller 34 can await a next set of computer input commands 102 to determine if the next set of computer input commands 102 correspond to another multi-modal event command (e.g., another discrete multi-modal event command 106), such as via the modality timer 104. Thus, in response to receiving a plurality of computer input commands 102 within the predetermined threshold time, the multi-modal command aggregation controller 34 can determine that the plurality of computer input commands 102 correspond to the discrete multi-modal event command 106 (e.g., via the application interface layer 36), and can provide the discrete multi-modal event command 106 to implement a single computer function of the computer system 12.
As another example, the multi-modal event command can correspond to an activation command 108. The activation command 108 can correspond to activation of a sustained input event that facilitates rapid single computer functions in response to discrete computer input commands 110. The discrete computer input commands 110, though depicted as separate elements in the diagram 100, can correspond to computer input commands that are provided via the interface devices 18 and converted in the same manner as the computer input commands 102, but that are provided individually in a single input mode during the sustained input event.
Thus, in response to receiving a plurality of computer input commands 102 within the predetermined threshold time, the multi-modal command aggregation controller 34 can determine that the plurality of computer input commands 102 correspond to the activation command 108 (e.g., via the application interface layer 36). In response, the multi-modal command aggregation controller 34 can initiate the sustained input event. Accordingly, during the sustained input event, in response to receiving a single discrete computer input command 110, the multi-modal command aggregation controller 34 can translate the single discrete computer input command 110 into a sustained input event command 112 (e.g., via the application interface layer 36) to implement a single computer function of the computer system 12. The multi-modal command aggregation controller 34 can thus continue to translate the discrete computer input commands 110 into the sustained input event commands 112 on a one-for-one basis during the sustained input event until the sustained input event is terminated. As an example, during the sustained input event, the multi-modal command aggregation controller 34 can be configured to translate only the discrete computer input commands 110 into the sustained input event commands 112, such that the multi-modal command aggregation controller 34 can ignore attempts to provide a multi-modal event command that would be translated to a discrete multi-modal event command 106. Alternatively, as another example, during the sustained input event, the multi-modal command aggregation controller 34 can limit the sustained input event commands 112 that can be provided, and can still receive a plurality of computer input commands 102 that can correspond to a multi-modal event command that is translated to a discrete multi-modal event command 106.
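A hedged sketch of this sustained-input-event behavior, modeled as a small state machine with hypothetical activation and termination command names, might look like the following:

    class SustainedSession:
        # Hypothetical state machine for a sustained input event.
        def __init__(self):
            self.active = False

        def handle(self, command):
            if command == "ACTIVATE":      # stands in for the activation command
                self.active = True
                return None
            if command == "TERMINATE":     # ends the sustained input event
                self.active = False
                return None
            if self.active:
                # One-for-one translation into a sustained input event command.
                return ("SUSTAINED_EVENT", command)
            return None                    # ignored outside the sustained event

    s = SustainedSession()
    s.handle("ACTIVATE")
    print(s.handle("PAN_LEFT"))  # -> ('SUSTAINED_EVENT', 'PAN_LEFT')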
The diagram 100 thus demonstrates the manner in which the multi-modal command aggregation controller 34 can aggregate the computer input commands 102 into multi-modal event commands and can translate the discrete computer input commands 110 into the sustained input event commands 112.
In another example, a computer system 156 can be associated with a plurality of input systems 158 that each include one or more interface devices 160.
The computer system 156 can be configured substantially similar to the computer system 12 described previously.
As an example, at least one of the input systems 158 can include a plurality of interface devices 160 having a respective plurality of input modes. Therefore, a user of one of the input systems 158 can provide a multi-modal event command from the respective input system 158 based on providing interaction with a plurality of interface devices 160 of different input modes. As another example, multiple users can collaborate to generate a multi-modal event command via a respective plurality of interface devices 160 associated with a plurality of input systems 158. For example, a first user can provide a physical input action via one of the interface devices 160 of a respective one of the input systems 158, and a second user can provide a physical input action via another one of the interface devices 160 of a respective other one of the input systems 158. The computer system 156 can receive the separate respective physical input actions from the interface device(s) 160 of the multiple input systems 158 to generate a single multi-modal event command. Similarly, the multi-modal event command that is generated can correspond to an activation command to initiate a sustained input event to provide the capability for the user(s) to provide discrete computer input command(s) via the input systems 158, either individually, selectively, or collectively. Therefore, the input systems 158 can facilitate generation of multi-modal event commands via multiple input systems 158 that each have one or more interface devices 160 for multiple users in a collaborative control environment.
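Purely as an illustration of this collaborative case, a sketch might tag each command with its source input system and form a single event only when commands arrive from different systems in different modes; the identifiers below are hypothetical:

    def collaborate(commands):
        # commands: list of (input_system_id, input_mode, command) tuples.
        systems = {c[0] for c in commands}
        modes = {c[1] for c in commands}
        if len(systems) >= 2 and len(modes) >= 2:
            return tuple(c[2] for c in commands)  # single multi-modal event
        return None                               # not a collaborative event

    print(collaborate([("station_a", "voice", "VOICE_ZOOM_IN"),
                       ("station_b", "gesture", "GESTURE_POINT")]))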
In view of the foregoing structural and functional features described above, a methodology in accordance with various aspects of the present invention will be better appreciated with reference to the method described herein.
What have been described above are examples of the present invention. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the present invention, but one of ordinary skill in the art will recognize that many further combinations and permutations of the present invention are possible. Accordingly, the present invention is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims.