The present invention relates in general to gesture recognition and, more particularly, to a user interface, apparatus, and method for gesture recognition in an electronic system.
As the range of activities accomplished with a computer increases, new and innovative ways to provide an interface between user and machine are often developed to provide a more natural user experience. For example, a touch-sensitive screen can allow a user to provide inputs to a computer without a mouse and/or a keyboard, so that no desk area is needed to operate the computer. Gesture recognition is also receiving more and more attention due to its potential use in sign language recognition, multimodal human-computer interaction, virtual reality, and robot control.
Gesture recognition is a rapidly developing area of computing that allows a device to recognize certain hand gestures of a user so that certain functions of the device can be performed based on those gestures. Gesture recognition systems based on computer vision have been proposed to facilitate a more 'natural', efficient, and effective user-machine interface. In computer vision, in order to improve the accuracy of gesture recognition, it is necessary to display the related captured video from the camera on the screen. Such video helps indicate to the user whether his gesture is likely to be recognized correctly and whether he needs to adjust his position. However, displaying the captured video from the camera usually interferes with the user's viewing of the current program on the screen. Therefore, it is necessary to find a way that minimizes the disturbance to the program currently displayed on the screen while, at the same time, maintaining high recognition accuracy.
On the other hand, more and more compound gestures (such as grab and drop) have recently been applied in user interfaces (UIs). These compound gestures usually include multiple sub-gestures and are more difficult to recognize than simple gestures. Patent application US20100050133, "Compound Gesture Recognition", of H. Keith Nishihara et al., filed on Aug. 22, 2008, proposes a method that employs multiple cameras and attempts to detect and translate different sub-gestures into different inputs for different devices. However, the cost and deployment of multiple cameras limit the application of this method in the home.
Therefore, it is important to study compound gesture recognition in user interface systems.
The invention concerns a user interface in a gesture recognition system comprising: a display window adapted to indicate a following sub-gesture of at least one gesture command, according to at least one sub-gesture previously performed by a user and received by the gesture recognition system.
The invention also concerns an apparatus comprising: a gesture predicting unit adapted to predict one or more possible commands to the apparatus based on one or more sub-gestures previously performed by a user; and a display adapted to indicate the one or more possible commands.
The invention also concerns a method for gesture recognition comprising: predicting one or more possible commands to an apparatus based on one or more sub-gestures previously performed by a user; and indicating the one or more possible commands.
These and other aspects, features and advantages of the present invention will become apparent from the following description of an embodiment in connection with the accompanying drawings:
It should be understood that the drawing(s) is for purposes of illustrating the concepts of the disclosure and is not necessarily the only possible configuration for illustrating the disclosure.
In the following detailed description, a user interface, apparatus and method for gesture recognition are set forth in order to provide a thorough understanding of the present invention. However, it will be recognized by one skilled in the art that the present invention may be practiced without these specific details or with equivalents thereof. In other instances, well known methods, procedures, components and circuits have not been described in detail as not to unnecessarily obscure aspects of the present invention.
A user can provide simulated inputs to a computer, TV, or other electronic device. It is to be understood that the simulated inputs can be provided by a compound gesture, a single gesture, or even any body gesture performed by the user. For example, the user could provide gestures that include predefined motion in a gesture recognition environment. The user provides the gesture inputs, for example, with one or both hands; a wand, stylus, or pointing stick; or a variety of other devices with which the user can gesture. The simulated inputs could be, for example, simulated mouse inputs, such as to establish a reference to the displayed visual content and to execute a command on the portions of the visual content to which the reference refers.
The user in front of the display screen 102 can provide simulated inputs to the gesture recognition system 100 by an input object. In the embodiment, the input object is demonstrated as a user's hand, such that the simulated inputs can be provided through hand gestures. It is to be understood that the use of a hand to provide simulated inputs via hand gestures is only one example implementation of the gesture recognition system 100. In addition, in the example of performing gestures via a user's hand as the input object to provide simulated inputs, the user's hand could incorporate a glove and/or fingertip and knuckle sensors or could be a user's naked hand.
In the embodiment of
The gesture recognition unit 106, gesture predictor 105, display controller 104, and gesture database 107 could reside, for example, within a computer (not shown) or in embedded processors, so as to process the respective images associated with the input object to generate a control instruction indicated in a display window 103 of the display screen 102.
According to the embodiment, both single and compound gesture inputs by users can be recognized. A compound gesture is a gesture in which multiple sub-gestures are employed to provide multiple related device inputs. For example, a first sub-gesture can be a reference gesture that refers to a portion of the visual content, and a second sub-gesture can be an execution gesture performed immediately after the first sub-gesture, such as to execute a command on the portion of the visual content to which the first sub-gesture refers. A single gesture includes just one sub-gesture, and its command is executed immediately after that sub-gesture is identified.
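The distinction above can be sketched in code: a compound command is an ordered sequence of sub-gestures consumed one at a time, firing only when the execution sub-gesture completes the sequence. This is a minimal illustrative sketch, not taken from the patent; the class and gesture names are hypothetical.

```python
# Illustrative sketch: a compound gesture command modeled as an ordered
# list of sub-gestures. All names here are hypothetical.

class CompoundCommand:
    """A command triggered by a fixed sequence of sub-gestures."""

    def __init__(self, name, sub_gestures):
        self.name = name
        self.sub_gestures = list(sub_gestures)
        self._position = 0  # index of the next expected sub-gesture

    def feed(self, sub_gesture):
        """Consume one recognized sub-gesture.

        Returns True when the full sequence has been matched, i.e. the
        execution sub-gesture immediately follows the reference one.
        """
        if sub_gesture == self.sub_gestures[self._position]:
            self._position += 1
            if self._position == len(self.sub_gestures):
                self._position = 0  # reset for the next occurrence
                return True
        else:
            self._position = 0  # sequence broken; start over
        return False


cmd = CompoundCommand("grab and drop to left", ["grab", "drop left"])
assert cmd.feed("grab") is False      # reference sub-gesture only
assert cmd.feed("drop left") is True  # execution sub-gesture completes it
```

A single gesture is simply the degenerate case of a one-element sequence, which fires as soon as its only sub-gesture is identified.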
As shown in
Returning to
When the gesture images obtained by the camera 101 are recognized by the gesture recognition unit 106, the recognition result, for example a predefined sub-gesture, is input to the gesture predictor 105. Then, by looking up the gesture database 107 based on the recognition result, the gesture predictor 105 predicts one or more possible gesture commands, and the following sub-gesture of each possible gesture command is shown as an indication in a display window 103. For example, when the first sub-gesture "Grab" is recognized, by looking up the database 107, the predictor can conclude that there are three possible candidates for this compound gesture: "grab and drop to left", "grab and drop to right", and "only grab".
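The prediction step described above amounts to a prefix lookup: every command whose sub-gesture sequence begins with the sub-gestures recognized so far remains a candidate. The following sketch assumes a simple database layout (a dict from command name to sub-gesture sequence); only the three "grab" command names come from the text, the rest is illustrative.

```python
# Hypothetical gesture database layout: command name -> ordered
# sub-gesture sequence. The three command names follow the example in
# the description; the dict structure itself is an assumption.
GESTURE_DATABASE = {
    "grab and drop to left": ["grab", "drop left"],
    "grab and drop to right": ["grab", "drop right"],
    "only grab": ["grab"],
}


def predict(recognized):
    """Return the commands whose sub-gesture sequence begins with the
    sub-gestures already recognized (prefix match)."""
    n = len(recognized)
    return [name for name, seq in GESTURE_DATABASE.items()
            if seq[:n] == recognized]


# After the first sub-gesture "grab", three candidates remain:
assert sorted(predict(["grab"])) == [
    "grab and drop to left", "grab and drop to right", "only grab"]
```

Once a second sub-gesture arrives, the same lookup narrows the candidates to a single command, which can then be executed.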
The database 107 also contains other single and compound gestures, as follows: when the head sub-gesture is "wave right hand", the tail gesture can be "wave right hand", "wave two hands", "raise right hand", or "stand still". For example, the head gesture means turning on the TV set. If the tail gesture is "wave right hand", the TV set plays the program from the set-top box. If the tail gesture is "wave two hands", the TV set plays the program from the media server. If the tail gesture is "raise right hand", the TV set plays the program from a DVD (digital video disc). If the tail gesture is "stand still", the TV set does not play any program. Although the invention is explained by taking the compound gesture "grab and drop" and two-step sub-gestures as an example, this should not be considered as limiting the invention.
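The head/tail pairs above form a simple lookup table from (head sub-gesture, tail sub-gesture) to a device action. The mapping below follows the TV example in the text; the table layout and the action strings themselves are assumptions made for illustration.

```python
# Sketch of the (head, tail) -> action lookup described in the text.
# The action strings are hypothetical stand-ins for device commands.
TAIL_ACTIONS = {
    ("wave right hand", "wave right hand"): "play from set-top box",
    ("wave right hand", "wave two hands"): "play from media server",
    ("wave right hand", "raise right hand"): "play from DVD",
    ("wave right hand", "stand still"): "play nothing",
}


def resolve(head, tail):
    """Map a completed head/tail gesture pair to a device action."""
    return TAIL_ACTIONS.get((head, tail), "unknown command")


assert resolve("wave right hand", "raise right hand") == "play from DVD"
```

New compound gestures can be supported by adding rows to this table, without changing the recognition or prediction logic.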
According to the embodiment, the display window 103, presenting a user interface of the gesture recognition system 100, is used to indicate the following sub-gesture of the one or more possible commands obtained by the gesture predictor 105, along with information on how to perform the following gesture to complete each possible command.
The display window 103 on the display screen 102 is controlled by the display controller 104. The display controller 104 provides indications or instructions on how to perform the following sub-gesture of each compound gesture predicted by the gesture predictor 105, according to the predefined gestures listed in the database 107, and these indications or instructions are shown in the display window 103 as hints together with information on the commands. For example, the display screen 102 could highlight a region of the screen as the display window to help the user continue his/her following sub-gestures. In this region, several hints, for example dotted lines with arrows or curved dotted lines, are used to show the following sub-gesture of each possible command. The information on the commands includes "grab and drop to left" to guide the user to move the hand left, "grab and drop to right" to guide the user to move the hand right, and "only grab" to guide the user to keep the grab gesture. In addition, an indication of the sub-gesture already received by the gesture recognition system 100 is shown at a location corresponding to the hints in the display window 103. The indication can be the image received by the system or any image representing the sub-gesture. Adobe Flash, Microsoft Silverlight, and JavaFX can all be used by the display controller to implement this kind of indication in the display window 103. In addition, the hints are not limited to the above and can be implemented as any other indications devised by one skilled in the art, provided that the hints help users follow one of them to complete the gesture command.
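As a text-only sketch of the display controller's hint logic: for each predicted candidate command, pair the already-received sub-gesture with a hint for its following sub-gesture. The hint strings below are assumptions standing in for the arrows and dotted lines drawn on screen; the function name is hypothetical.

```python
# Minimal sketch of hint generation for the display window. The hint
# texts mirror the guidance described in the text ("move hand left",
# etc.); the rendering itself (arrows, dotted lines) is abstracted
# away as plain strings.
HINTS = {
    "grab and drop to left": "move hand left",
    "grab and drop to right": "move hand right",
    "only grab": "keep the grab gesture",
}


def render_hints(received, candidates):
    """Return one hint line per candidate command, each showing the
    received sub-gesture next to the instruction for the next one."""
    return [f"[{received}] {name}: {HINTS[name]}" for name in candidates]


assert render_hints("grab", ["only grab"]) == [
    "[grab] only grab: keep the grab gesture"]
```

A real implementation would draw these hints graphically (e.g. with JavaFX shapes, as the text suggests) rather than as strings, but the mapping from predicted commands to on-screen guidance is the same.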
As shown in
Although the embodiment is described based on the first and second sub-gestures, further sub-gesture recognition and the hints for the following sub-gestures of commands shown in the user interface are also applicable in the embodiment of the invention. If no further sub-gesture is received or the gesture command is finished, the display window disappears from the screen.
The foregoing merely illustrates the embodiment of the invention and it will thus be appreciated that those skilled in the art will be able to devise numerous alternative arrangements which, although not explicitly described herein, embody the principles of the invention and are within its spirit and scope.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/CN10/02206 | 12/30/2010 | WO | 00 | 6/28/2013 |