1. Field of the Invention
The present invention generally relates to a gesture recognition system, and more particularly to a gesture recognition system capable of being performed in a complex scene.
2. Description of Related Art
A natural user interface, or NUI, is a user interface that is invisible and requires no artificial control devices such as a keyboard or mouse. Instead, the interaction between humans and machines is achieved, for example, through hand postures or gestures. Kinect by Microsoft is one example of a vision-based gesture recognition system that uses postures and/or gestures to facilitate interaction between a user and a computer.
Conventional vision-based gesture recognition systems are liable to make erroneous judgments in object recognition owing to surrounding lighting and background objects. After features are extracted from a recognized object (a hand in this case), classification is performed against a training set, from which a gesture is recognized. Conventional classification methods either require large amounts of training data or make erroneous judgments because of unclear features.
For the foregoing reasons, a need has thus arisen to propose a novel gesture recognition system that is capable of recognizing postures and/or gestures more accurately and quickly.
In view of the foregoing, it is an object of the embodiment of the present invention to provide a robust gesture recognition system that may perform properly in a complex scene and reduce the complexity of posture classification.
According to one embodiment, a gesture recognition system includes a candidate node detection unit, a posture recognition unit, a multiple hands tracking unit and a gesture recognition unit. The candidate node detection unit receives an input image in order to generate a candidate node. The posture recognition unit recognizes a posture according to the candidate node. The multiple hands tracking unit tracks multiple hands by pairing between successive input images. The gesture recognition unit obtains a motion accumulation amount according to tracking paths from the multiple hands tracking unit, thereby recognizing a gesture.
Specifically speaking, the color reliability map is generated according to skin color of a captured input image. In the color reliability map, a higher value is assigned to a pixel that is more like the skin color.
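By way of illustration only, the color reliability map may be sketched as follows. The reference skin chroma `SKIN_CB_CR`, the fall-off distance `MAX_CHROMA_DIST` and the linear fall-off itself are assumptions made for this sketch and are not specified in the present description; only the property that a more skin-like pixel receives a higher value is taken from the text.

```python
import math

# Assumed reference skin tone in (Cb, Cr) chroma coordinates (hypothetical value).
SKIN_CB_CR = (110.0, 150.0)
# Assumed chroma distance at which the reliability falls to zero (hypothetical value).
MAX_CHROMA_DIST = 40.0


def rgb_to_cbcr(r, g, b):
    """Convert 8-bit RGB to full-range Cb and Cr chroma components (JFIF/BT.601)."""
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


def color_reliability(pixel):
    """Return a value in [0, 1]; higher means the pixel is more like the skin color."""
    cb, cr = rgb_to_cbcr(*pixel)
    dist = math.hypot(cb - SKIN_CB_CR[0], cr - SKIN_CB_CR[1])
    return max(0.0, 1.0 - dist / MAX_CHROMA_DIST)


def color_reliability_map(image):
    """Apply color_reliability to every (r, g, b) pixel of a row-major image."""
    return [[color_reliability(p) for p in row] for row in image]
```

Working in chroma coordinates rather than raw RGB is one common way to reduce sensitivity to brightness; other skin color models could be substituted without changing the rest of the flow.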
The depth reliability map is generated according to hand depth of the input image. In the depth reliability map, a higher value is assigned to a pixel that is within a hand depth range. In one exemplary embodiment, a face is first recognized by a face recognition technique, and the hand depth range is then determined with respect to depth of the recognized face.
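A minimal sketch of the depth reliability map is given below, assuming that depth values are expressed in millimeters and that the hand is held in front of the face. The offsets `near_offset_mm` and `far_offset_mm` are hypothetical values chosen for illustration; the description states only that the hand depth range is determined with respect to the depth of the recognized face.

```python
def hand_depth_range(face_depth_mm, near_offset_mm=600.0, far_offset_mm=100.0):
    """Derive an assumed hand depth range located in front of the face."""
    return (face_depth_mm - near_offset_mm, face_depth_mm - far_offset_mm)


def depth_reliability_map(depth_image, face_depth_mm):
    """Assign 1.0 to pixels inside the hand depth range and 0.0 elsewhere."""
    lo, hi = hand_depth_range(face_depth_mm)
    return [[1.0 if lo <= d <= hi else 0.0 for d in row] for row in depth_image]
```

A binary map is used here for simplicity; a graded value that decreases toward the edges of the range would equally satisfy the requirement that in-range pixels receive a higher value.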
The motion reliability map is generated according to motion of a sequence of input images. In the motion reliability map, a higher value is assigned to a pixel that has more motion, for example, as measured by the sum of absolute differences (SAD) between two input images.
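The SAD-based motion measure may be sketched as follows for two grayscale frames of equal size. The window `radius` and the normalization by the largest SAD value are assumptions of this sketch, not details given in the description.

```python
def sad_motion_map(prev, curr, radius=1):
    """Per-pixel sum of absolute differences over a small window, scaled to [0, 1]."""
    h, w = len(curr), len(curr[0])
    sad = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            # Accumulate |curr - prev| over the window, clipped at the image border.
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += abs(curr[yy][xx] - prev[yy][xx])
            sad[y][x] = total
    peak = max(max(row) for row in sad)
    if peak == 0:
        # No change between the frames: every pixel gets zero motion reliability.
        return [[0.0] * w for _ in range(h)]
    return [[v / peak for v in row] for row in sad]
```

Summing over a window rather than a single pixel makes the measure less sensitive to isolated sensor noise, at the cost of slightly blurring the motion boundary.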
In step 112 (i.e., natural user scenario analysis), weightings of the extracted color, depth and motion are determined with respect to operation status, such as the initial state, motion or whether the hand is close to the face. Table 1 shows some exemplary weightings:
Finally, in step 113, the color reliability map, the depth reliability map and the motion reliability map are combined with the respective weightings given in step 112, thereby generating a hybrid reliability map, which provides a detected candidate node.
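The combination of step 113 may be sketched as a per-pixel weighted sum of three reliability maps of equal size. The weight values passed in are supplied by the caller and are hypothetical; selecting the highest-valued pixel as the candidate node is likewise an assumption of this sketch, since the description states only that the hybrid reliability map provides the detected candidate node.

```python
def hybrid_reliability_map(color_map, depth_map, motion_map, weights):
    """Combine three reliability maps using (color, depth, motion) weightings."""
    w_c, w_d, w_m = weights
    return [
        [w_c * c + w_d * d + w_m * m for c, d, m in zip(c_row, d_row, m_row)]
        for c_row, d_row, m_row in zip(color_map, depth_map, motion_map)
    ]


def detect_candidate_node(hybrid):
    """Return the (row, column) of the highest hybrid reliability (assumed rule)."""
    best_pos, best_val = (0, 0), hybrid[0][0]
    for y, row in enumerate(hybrid):
        for x, value in enumerate(row):
            if value > best_val:
                best_pos, best_val = (y, x), value
    return best_pos
```

For example, an operation status in which the hand is close to the face might call for a lower color weighting, because the face is also skin-colored, and a higher depth or motion weighting.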
In step 122 (i.e., high accuracy finger recognition), a distance curve is generated by recording relative distances between the center of the segmented palm and the perimeter (or boundary) of the segmented palm.
In step 123 (i.e., hierarchical posture recognition), a variety of recognized postures are classified to facilitate subsequent processing.
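The distance curve of step 122 may be sketched as follows, assuming the palm center and a list of boundary points are already available from segmentation. Ordering the samples by angle around the center and counting peaks above `min_height` as finger candidates are assumptions made for illustration; the description specifies only that the distances between the center and the boundary are recorded.

```python
import math


def distance_curve(center, boundary):
    """Distances from the palm center to each boundary point, ordered by angle."""
    cx, cy = center
    samples = []
    for x, y in boundary:
        angle = math.atan2(y - cy, x - cx)
        samples.append((angle, math.hypot(x - cx, y - cy)))
    samples.sort()  # walk the boundary in order of increasing angle
    return [dist for _, dist in samples]


def count_peaks(curve, min_height):
    """Count local maxima of a circular curve that reach min_height (assumed rule)."""
    n = len(curve)
    count = 0
    for i in range(n):
        prev_val = curve[i - 1]        # index -1 wraps to the last sample
        next_val = curve[(i + 1) % n]  # wrap past the end to the first sample
        if curve[i] >= min_height and curve[i] > prev_val and curve[i] >= next_val:
            count += 1
    return count
```

An extended finger appears as a pronounced peak in such a curve, because fingertip boundary points lie farther from the palm center than the rest of the boundary.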
In the multiple hands tracking unit 13 of
In the gesture recognition unit 14 of
Although specific embodiments have been illustrated and described, it will be appreciated by those skilled in the art that various modifications may be made without departing from the scope of the present invention, which is intended to be limited solely by the appended claims.