1. Field of the Invention
The present invention relates to a gesture-based laser pointer system for user-application interfaces.
2. Description of Related Art
Public speaking and making presentations to an audience are stressful tasks, even for the most skilled public speakers. It is important for a person making a presentation to focus completely on the audience in order to convey his or her message effectively. The stress levels for presenters are multiplied when they also have to manage a presentation, such as a PowerPoint presentation, in addition to making a persuasive pitch to their audience.
A presenter making a PowerPoint presentation faces the problem of talking to the audience and navigating through the presentation at the same time. In many cases, it requires two people to make a presentation: one who gives the speech and another who controls the presentation slides. It is difficult to make a seamless presentation when the presenter has to coordinate the content of his speech with the slide on the screen.
It is extremely distracting for the presenter to multitask while already performing the difficult task of public speaking in front of a group of people. A presentation system is desired in which the presenter can make a seamless presentation, without having to click on a mouse to change the displayed image on the screen while also talking to the audience.
The presenter is given a tool that is useful for highlighting locations on the screen in real time, along with the power to navigate the presentation with only one hand-held device. A laser pointer is combined with a gesture-based input system that serves as an input device for delivering commands to a host computer, enabling presenters to make a seamless presentation. The laser pointer is used to highlight content on a screen and, in addition, serves as a mount for a motion sensor used for the gesture-based input.
The laser pointer includes a laser and a motion sensor comprising at least one small sensor, such as a micro-electromechanical sensor (MEMS), along with a signal accumulation unit connected to the sensor on the laser pointer. The signal accumulation unit includes logic for packaging data from the motion sensor to produce packaged data. The signal accumulation unit also includes a communication port for communicating with a host computer, by which the packaged data is sent to the host. The host computer includes resources that, in cooperation with the processing at the signal accumulation unit, interpret the gesture input data and generate a resulting input signal. The input signal is then delivered by the host to a target system using an appropriate computer-generated message. Representative target systems include such programs as business presentation software, software managing audiovisual equipment, and so on.
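Purely as an illustrative sketch, the packaging logic in the signal accumulation unit might frame each motion-sensor sample into a fixed-size packet before transmission; the packet layout, names and port object below are assumptions for the example, not part of the specification.

```python
import struct
import time

# Hypothetical packet layout: a start marker, a millisecond timestamp,
# and six floats for the three linear and three angular axes.
PACKET_FORMAT = "<BIffffff"  # little-endian: marker, timestamp, 6 axes
START_BYTE = 0xA5

def package_sample(ax, ay, az, gx, gy, gz):
    """Frame one motion-sensor sample as a fixed-size packet."""
    timestamp_ms = int(time.monotonic() * 1000) & 0xFFFFFFFF
    return struct.pack(PACKET_FORMAT, START_BYTE, timestamp_ms,
                       ax, ay, az, gx, gy, gz)

# The packaged data would then be written to the communication port,
# e.g. a Bluetooth serial link to the host (port object assumed):
# port.write(package_sample(0.01, -0.02, 9.81, 0.0, 0.1, 0.0))
```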
As described herein, a laser pointer device with a gesture input system produces commands used by presentation software. A library of gestures is described whose gestures are interpreted as commands for the presentation program, including, for example, commands for advancing a presentation to the next page or for returning to a previous page. These gestures are easy to execute using a laser pointer device, and can address problems associated with the dexterity required for controlling the presentation equipment while also delivering the presentation as described above.
The motion sensor can be implemented using one or more MEMS utilized to produce data in one or more spaces, where a space includes at least two dimensions sampled over time, including displacement, velocity and acceleration for translation in linear space, and displacement, velocity and acceleration for rotation in angular space. Multiple-space analysis, using gesture data in more than one space from multiple sensors mounted at different locations, and/or in more than one space from one or more sensors mounted at a single location, improves the power of the recognition system significantly, enabling the interpretation of complex gestures. Multiple-space analysis interprets various laser pointer movements to perform specific actions, in addition to next page or previous page commands, such as scrolling a slide from side to side or up and down, zooming in on a feature of a page, flipping a page on screen, or highlighting other features of a presentation.
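As a minimal sketch of how the three spaces relate along a single axis, acceleration samples can be numerically integrated into velocity and displacement samples; the sample values and time step below are illustrative assumptions.

```python
def derive_spaces(accel_samples, dt):
    """Given acceleration samples along one axis (linear or angular),
    derive the corresponding velocity and displacement samples by
    simple rectangular numerical integration."""
    velocity, displacement = [], []
    v = d = 0.0
    for a in accel_samples:
        v += a * dt          # acceleration space -> velocity space
        d += v * dt          # velocity space -> displacement space
        velocity.append(v)
        displacement.append(d)
    return velocity, displacement

# A gesture trace can then be analyzed in all three spaces per axis:
linear_accel_x = [0.0, 0.5, 1.0, 0.5, 0.0]   # illustrative samples (m/s^2)
vel_x, disp_x = derive_spaces(linear_accel_x, dt=0.01)
```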
A host computer system is described that includes an interface for communication with the signal accumulation unit carried by the user, and resources for interpreting the data in multiple spaces. The resources include, in addition to data processing hardware, a database of gesture specifications, including one or more specifications of gestures in multiple spaces, and programs for comparing input data to the specifications in the database. The resources in the host also include communication resources for composing a message containing the results of interpreting the gesture data, and for sending the message to a target where the data is utilized as an input command or as input data.
The presenter is able to navigate a presentation program using a gesture-sensing laser pointer in real time (i.e. without interrupting the presentation by stopping to find a switch on the projector or computer), improving the interactivity with the audience. Also, the gesture-sensing laser pointer gives the presenter better control over the mood and pace of the presentation.
Other aspects and advantages of the present invention are provided in the drawings, the detailed description and the claims, which follow.
As used in the description of the present invention, a laser pointer is broadly construed to include any pointing device that emits a collimated or highly focused beam of visible light, and is not limited to a device whose beam is created by a laser. Also, contemporary presentations typically comprise a series of frames or slides, such as those created by PowerPoint, sold by Microsoft Corporation. Each slide can include still images or animation, or incorporate video, for informing or entertaining the audience. As used herein, however, any form of presentation is contemplated for use with the present invention.
Because of the very small size and low weight of the sensors and supporting circuitry, the sensor units may be attached to, or mounted on or within, the laser pointer. The pointer could be a laser pointer or a similar device used to assist in presentations.
Representative sensor units include inertial sensors and gyroscopes capable of sensing up to six degrees of freedom of motion, including translation along the x-, y- and z-axes, and rotation about the x-, y- and z-axes. The motion can be interpreted by breaking down the sensor data into displacement, velocity and acceleration spaces for both translation and rotation. Many sensors, sensing many axes and types of motion, can provide substantial information to be used for enhancing the quality of presentations: flipping the page or slide with one gesture, moving up and down the screen with another gesture, controlling video functions such as volume, rewind and fast forward, and distinguishing between gestures. In addition, a single sensor can provide input in both linear and angular acceleration space, velocity space and displacement space, giving rich input data practically unavailable in prior art vision-based systems.
For the purposes of this specification, a micro-electromechanical sensor (MEMS) is any one of a class of sensors consisting of a unit that is small and light enough to be attached to a laser pointer. Such sensors can be defined as die-level components of first-level packaging, and include pressure sensors, accelerometers, gyroscopes, microphones, etc. A typical MEMS includes an element that interacts with the environment, having a width or length on the order of 1 millimeter, and can be packaged with supporting circuitry such as an analog-to-digital converter, a signal processor and a communication port.
Representative MEMS suitable for the gesture-based laser pointer described herein include two-axis accelerometers. For a given application, two such accelerometer sensors can be mounted at a single location to sense three axes of linear acceleration. Other representative MEMS for the gesture-based systems described herein include gyroscopes, including piezoelectric vibrating gyroscopes.
The host machine 10 and the signal accumulation unit 18 comprise data processing resources which provide for interpretation of the gesture data received from the sensors located on the laser pointer. In some embodiments, the signal accumulation unit 18 performs more interpretation processing than in others, so that the host machine 10 performs different amounts of interpretation processing depending on the complementary processing at the signal accumulation unit 18. The interpreted gesture data is processed by the host to produce a specific signal. The host machine 10 determines the specific signal from the interpreted gesture data, determines the target for that specific signal, and issues the resulting signal to the target. The target may comprise a display screen formed by a projector, or a computer program running on the host machine 10 or on other systems operating in the environment of the user, with which the user is interacting via the gesture language. Thus, the gesture data is delivered from the user to the host machine and on to the environment, and is used for controlling the projector screen in the environment, including translating the gesture language into signals controlling audiovisual devices.
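One way to realize this complementary division of labor is a simple message protocol in which the signal accumulation unit sends either raw samples for host-side interpretation or an already-recognized gesture identifier; the message types, and the interpreter and dispatcher objects below, are hypothetical placeholders for the sketch.

```python
# Hypothetical message types for the device-to-host protocol.
MSG_RAW_SAMPLE = 0x01  # host performs the interpretation processing
MSG_GESTURE_ID = 0x02  # device already interpreted the gesture

def handle_message(msg_type, payload, interpreter, dispatcher):
    """Route one incoming message from the signal accumulation unit."""
    if msg_type == MSG_RAW_SAMPLE:
        # Host-side interpretation: accumulate samples until a
        # complete gesture is recognized from them.
        gesture = interpreter.feed(payload)
        if gesture is not None:
            dispatcher.issue(gesture)
    elif msg_type == MSG_GESTURE_ID:
        # Device-side interpretation: the host only dispatches the
        # resulting signal to the target (e.g. the presentation).
        dispatcher.issue(payload)
```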
The host machine 10 also includes resources which act as a feedback provider to the user. This results in an interaction loop in which the user provides a gesture signal to the host machine, which interprets the signal and produces a response. For example, the user makes a gesture with the laser pointer, using the MEMS with gesture-sensing capability located on the laser pointer, to go to the next slide or page in the presentation. The signal accumulation unit interprets gesture data commands from the user such as 'go to next page' or 'go to previous page.' The translated message is then wirelessly sent to the computer, where it is interpreted as the 'go to next page' or 'go to previous page' command and then executed by PowerPoint, or another similar application, to update the displayed image. This enables the user to move smoothly through his or her presentation without having to worry about managing the presentation loaded on the computer while effectively communicating his or her message to the audience at the same time.
The host machine 10 can include a map database including the specifications of gestures to be used with the laser pointer, and a mapping of the gestures to specific signals. A pre-specified gesture in the database can be defined as a movement of the laser pointer from the left side to the right side, which can be associated with the function of skipping ahead to the next slide in the presentation. Similar gestures can be pre-defined and associated with particular functions to be performed using the laser pointer. The host machine 10 may include a computer program that provides an interactive learning process, by which the user is presented with the specifications of a specific gesture on the laser pointer, and then makes the gesture with the laser pointer in an attempt to match the presented specifications. This provides a learning loop in which the computer enables a user to learn a library of gestures for interaction with the computer system.
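A minimal sketch of such a map database follows, assuming each gesture specification is summarized by its dominant displacement axis and direction; the specification format, command names and threshold are illustrative assumptions.

```python
# Illustrative gesture map: each pre-specified gesture is described by
# the axis on which most displacement occurs and the sign of that
# displacement, and is mapped to a presentation function.
GESTURE_MAP = {
    ("x", +1): "next_page",      # left-to-right sweep
    ("x", -1): "previous_page",  # right-to-left sweep
    ("y", +1): "scroll_up",
    ("y", -1): "scroll_down",
}

def classify(disp_x, disp_y, threshold=0.10):
    """Match a displacement-space trace against the gesture map."""
    if abs(disp_x) < threshold and abs(disp_y) < threshold:
        return None  # movement too small to be a deliberate gesture
    if abs(disp_x) >= abs(disp_y):
        return GESTURE_MAP[("x", 1 if disp_x > 0 else -1)]
    return GESTURE_MAP[("y", 1 if disp_y > 0 else -1)]
```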
The host machine 10 can include an interactive program by which a user defines the specifications of the gestures to be utilized. A specific gesture with the laser pointer can be defined to be interpreted as highlighting a document or emphasizing a word, or for other similar presentation functions.
A system as described herein can be implemented using sensors that describe motion of the sensor in space, including providing gesture data concerning up to 6 degrees of freedom: 3 degrees of freedom in translation in linear space provided by an accelerometer, and 3 degrees of freedom in rotation in angular space provided by a gyroscope. It is also possible, theoretically, to describe the displacement of an object in space using accelerometers for all 6 degrees of freedom, or using gyroscopes for all 6 degrees of freedom. Using the multiple spaces provided by sensing up to 6 degrees of freedom can enable a system to distinguish between complex gestures reliably and quickly. The gesture data produced during movement of the sensors, located on the laser pointer, through a given gesture can be analyzed in terms of displacement, velocity and acceleration in both linear and angular spaces.
For example, if the MEMS-based sensors detect specific gestures made using the laser pointer, the presentation page can be moved up and down. If a video is being displayed on the display screen, then specific gestures can be used on the laser input device to skip the video forward or backward or increase or decrease the volume of the video.
If the user rotates the laser pointer in space with near-constant angular speed in the time domain, then the motion will appear as a fixed spot in angular velocity space. The motion will also appear as a fixed spot at (0,0,0) in angular acceleration space, i.e. it has zero angular acceleration across the time domain.
For another example, if the user draws a straight line in space with the laser pointer, with constant linear speed in the time domain, then the motion will appear as a fixed spot in linear velocity space. The motion will also appear as a fixed spot at (0,0,0) in linear acceleration space, i.e. it has zero linear acceleration across the time domain.
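These fixed-spot properties can be checked numerically with finite differences; the sampling interval and rotation rate below are illustrative assumptions.

```python
import math

dt = 0.01
# Laser pointer rotated at a constant 1.0 rad/s about one axis:
angles = [1.0 * i * dt for i in range(100)]           # displacement space
omega  = [(angles[i + 1] - angles[i]) / dt            # velocity space
          for i in range(len(angles) - 1)]
alpha  = [(omega[i + 1] - omega[i]) / dt              # acceleration space
          for i in range(len(omega) - 1)]

# Every velocity sample sits at the same point (1.0 rad/s) and every
# acceleration sample sits at zero: a fixed spot in each space.
assert all(math.isclose(w, 1.0) for w in omega)
assert all(math.isclose(a, 0.0, abs_tol=1e-9) for a in alpha)
```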
The presentation program and optional gesture analysis processes use data processing resources including logic implemented as computer programs stored in memory 101 for an exemplary system. In alternatives, the logic can be implemented using computer programs in local or distributed machines, and can be implemented in part using dedicated hardware or other data processing resources. The logic in a representative gesture analysis system includes resources for interpretation of gesture data and for delivery of messages carrying signals that result from the interpretation, as well as resources for gesture language learning and self-learning processes. The presentation process can be a program such as PowerPoint, with pre-specified application program interfaces for accepting commands, such as next page, previous page, zoom, pan and so on, from other programs and input devices, such as the gesture-sensing laser pointer described herein. Presentation programs also support video clips or movies, for which commands are accepted for fast forward, reverse, pause, and volume up/down controls that can be produced using the gesture-sensing laser pointer.
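For instance, on a Windows host, interpreted commands might be forwarded to PowerPoint through its COM automation interface. The following is a hedged sketch using the pywin32 package, assuming a slide show is already running; it is one possible way to drive the presentation program, not the only one.

```python
# Dispatch interpreted gesture commands to a running PowerPoint slide
# show via COM automation on Windows (requires the pywin32 package).
import win32com.client

def dispatch_command(command):
    app = win32com.client.Dispatch("PowerPoint.Application")
    view = app.ActivePresentation.SlideShowWindow.View
    if command == "next_page":
        view.Next()       # advance to the next slide
    elif command == "previous_page":
        view.Previous()   # return to the previous slide

dispatch_command("next_page")
```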
The data store 102 is typically used for storing a machine-readable gesture dictionary, including definitions of gestures made with the laser pointer, and other data-intensive libraries. Large-scale memory is used to store multiple gesture dictionaries, for example, and other large-scale data resources.
During the wait state, input from the sensors is gathered, filtered and analyzed to determine whether valid gesture input signals are received. The input signals can be delineated using mechanical or audio signals, or recognized as a result of specific gesture commands, or the like. The input data can be further formatted for interpretation of displacement, velocity and acceleration along various linear and angular axes as mentioned above. The resulting data is then compared with information in a gesture or component motion database. If a match is discovered, then an output command byte is produced and delivered to the host computer as a gesture language/instruction command at the system output.
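A skeleton of this wait-state pipeline might look like the following; the sensor, gesture_db and output objects, the filter coefficients and the fixed window size are placeholders assumed for the sketch.

```python
def wait_state_loop(sensor, gesture_db, output, window_size=64):
    """Gather samples, filter them, and compare the resulting trace
    with the gesture database; emit a command byte on a match."""
    window = []
    filtered = 0.0
    while True:
        sample = sensor.read()                    # gather
        if sample is None:
            continue
        filtered = 0.9 * filtered + 0.1 * sample  # simple low-pass filter
        window.append(filtered)
        if len(window) < window_size:             # wait for a full gesture
            continue
        command_byte = gesture_db.lookup(window)  # compare with database
        if command_byte is not None:
            output.write(bytes([command_byte]))   # gesture language output
        window.clear()
```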
After the gesture or component motion has been interpreted and delivered to the host system, the host system can apply further processing to identify the intended input signal, such as for gestures that comprise a sequence of component motions. In the case that the gesture is fully identified in the signal accumulation unit, the host sends a message to a target process, which executes a command indicated by the signal, or processes the data indicated by the signal appropriately.
The MEMS sensor units are ultra-light and very small, so that they can be easily attached to the laser pointer. This technology makes it possible to shift between pages or slides in a presentation, and to control video outputs, with a single gesture of the hand holding the laser pointer. Also, sophisticated gestures can be utilized through sensing displacement, velocity and acceleration in both linear and angular spaces. The system is capable of learning user-defined gestures for a customized user language and commands.
Another embodiment of this system includes a kit in which the laser pointer system is coupled with a computer program stored on a machine-readable medium such as a DVD, CD, floppy disk or similar storage device. The computer program in the kit manages the communication with the signal accumulation unit located on the pointer, using a Bluetooth driver and command translator. This software program can be loaded onto a computer in order to enable the computer to translate the message sent to it by the signal accumulation unit located on the laser pointer, and to update the presentation in accordance with the specific gesture provided by the user and interpreted by the signal accumulation unit.
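A minimal sketch of such a host-side translator follows, assuming the Bluetooth link is exposed as a serial port and that the signal accumulation unit sends one command byte per recognized gesture; the port name, baud rate and command codes are assumptions for the example.

```python
# Host-side receiver using the pyserial package over a Bluetooth
# serial link (port name and command codes are illustrative).
import serial

COMMANDS = {0x01: "next_page", 0x02: "previous_page"}

with serial.Serial("COM5", baudrate=115200, timeout=1) as link:
    while True:
        byte = link.read(1)
        if byte and byte[0] in COMMANDS:
            command = COMMANDS[byte[0]]
            # Forward to the presentation program, e.g. via the COM
            # automation sketch shown earlier.
            print("received gesture command:", command)
```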
The computer program 64 includes drivers used to adapt the host computer to the system requirements of running the computer program. The computer program 64 further includes a database which contains pre-specified motions and their associated functions. The computer program can include logic 62 to compare the data that it receives from the laser pointer MEMS and the signal accumulation unit with the pre-specified motions in the database. The program then triggers the appropriate function when it finds a match between a pre-specified gesture in the database and the gesture received from the laser pointer.
After the program has found a match between the database components and the received gesture, logic 61, such as a Bluetooth-compatible driver, is applied to interpret the packaged data, to produce a resulting signal, and to send the signal to the presentation program being executed on the host computer. In embodiments in which the gesture-sensing laser pointer packages signals indicating components of gestures, the program 64 includes logic 62 to compare the data from the gesture-sensing laser pointer to signature files for specific gestures, and to produce commands for, and send the commands to, the presentation program. For instance, the presenter could have moved the laser pointer from left to right, indicating that he wants to load the next slide on the display screen. The resulting signal would comprise a "next page" command forwarded to the presentation program, which executes the next page process.
Some examples of the presentation program using the resulting signal are discussed. A user giving a standard Microsoft PowerPoint presentation can use this technology by holding the laser pointer in his or her hand while presenting. Assuming that the presentation is being displayed to an audience on a projector screen or other kind of display screen, the user can casually talk to the audience, flick his hand to the left or right, and move to the next slide. The transition is smoother and more natural than if the presenter had to communicate with another person who moves the slides for him, or if the presenter had to break the sequence of his presentation to go and click on the computer to move the slide. Another example is a situation where the presenter wants to show the audience a video and to skip over unwanted parts of the video. Once again, the presenter can use the laser pointer to issue the rewind and fast forward commands to the presentation program without having to click on the computer and disrupt the flow of the presentation.
A library of commands, with corresponding gestures and techniques for sensing the gestures, is provided in the following table. Of course, the gestures listed can be mapped to a variety of commands different from those listed in this table. For example, a gesture can be mapped to volume up and volume down commands for presentations that include audio. All of the presentation commands can be programmable.
While the present invention is disclosed by reference to the preferred embodiments and examples detailed above, it is to be understood that these examples are intended in an illustrative rather than in a limiting sense. It is contemplated that modifications and combinations will readily occur to those skilled in the art, which modifications and combinations will be within the spirit of the invention.