The present invention relates to a 3D hand gesture image recognition method and a system thereof. Particularly, the present invention relates to a 3D hand gesture image recognition method and system provided with at least one light field capturing unit or a plurality of light field capturing units.
Taiwanese Patent Publication No. M382675, entitled “Hand Gesture Recognition-based Monitoring Camera Control Device,” discloses a control device for outputting commands to turn left, right, upward or downward and to zoom in or out to operate a monitoring camera. A hand gesture-capturing camera is provided to recognize a hand gesture of users for turning left, right, upward or downward and zooming in or out, without the need of operating a mouse or a control lever.
Another Taiwanese Patent Publication No. I298461, entitled “Hand Gesture Recognition System and Method Thereof,” discloses a hand gesture recognition system. A laptop computer includes an image-capturing device with which to directly capture a predetermined hand gesture of a user for conveniently executing an associated application program or an option of functions stored in the laptop computer.
Another Taiwanese Patent Publication No. I395145, entitled “Hand Gesture Recognition System and Method Thereof,” discloses a hand gesture recognition system. The hand gesture recognition system includes a camera device provided to take an image of natural hand gestures, a processor provided to retrieve edges of skin portions from the image and to thereby classify the edges into edge pieces in different degrees, a calculator engine with parallel computing units (PCUs) and a hand gesture database of predetermined templates in different degrees provided to search for the templates most similar to the edge pieces, means for selecting an optimum template among the most similar templates with the PCUs, and a display terminal provided to display an image of the selected optimum template without using any marker.
Another Taiwanese Patent Publication No. I431538, entitled “Image Based Motion Gesture Recognition Method and System Thereof,” discloses a hand gesture recognition method which includes: receiving a plurality of hand image frames; executing first hand posture detection in the received image frames to obtain a first hand posture; determining whether the first hand posture matches a predetermined start posture; executing hand movement tracking with hand locations in the received image frames to obtain a hand motion gesture if the first hand posture matches; and, during the hand movement tracking, further executing second hand posture detection in the received image frames to obtain a second hand posture and terminating the hand movement tracking if the second hand posture matches a predetermined end posture.
Another Taiwanese Patent Publication No. I444907, entitled “Method of Using Singular Value Decomposition for Processing Hand Gesture Images with Complex Background and a System Thereof,” discloses a hand gesture image processing method and a system thereof. The method of using singular value decomposition for processing hand gesture images with complex background includes: decomposing an original image in a singular value decomposition manner to obtain an enhanced image; removing dark background from the enhanced image to obtain a skin-like region; and removing residual background from the skin-like region. The hand gesture image processing system includes an input unit provided to input the original image, a calculating unit provided to remove dark background from the enhanced image and an output unit provided to output a skin color image.
Another Taiwanese Patent Publication No. I444908, entitled “Hand Gesture Image Recognition Method and System Using Image Orientation Alignment,” discloses a hand gesture image alignment method and a system thereof. A hand gesture image alignment method includes: decomposing a skin color image in a singular value decomposition manner to obtain an enhanced image; calculating a global centroid in the skin color image; selecting a region of interest (ROI) in the skin color image; selecting a sub-region in the ROI; calculating a local centroid in the sub-region; and calculating an alignment angle. The hand gesture image alignment system includes an input unit provided to input the skin color image, a calculating unit provided to select the ROI and the sub-region to calculate the global centroid and the local centroid, thereby calculating the alignment angle, and an output unit provided to output the alignment angle.
Another Taiwanese Patent Publication No. I444909, entitled “Hand Gesture Image Recognition Method and System Using Singular Value Decomposition for Light Compensation,” discloses a hand gesture image compensation method and a system thereof. A hand gesture image recognition method using singular value decomposition for light compensation includes: inputting a hand gesture image; processing the hand gesture image by singular value decomposition; calculating a light compensation coefficient by a light compensation method; compensating light on the hand gesture image by the light compensation coefficient to obtain a light-compensated image. The hand gesture image compensation system includes an input unit provided to input the original image, a calculating unit provided to calculate the light compensation coefficient, thereby processing the image to obtain the light-compensated image, and an output unit provided to output the light-compensated image.
Another U.S. Pat. No. 7,702,130, entitled “User Interface Apparatus Using Hand Gesture Recognition and Method Thereof,” discloses a user interface apparatus and method thereof. The user interface apparatus can control a telematics terminal safely and comfortably while driving, by recognizing a hand gesture image received through a camera in the telematics terminal as a corresponding control signal. The user interface apparatus includes: an input receiving block for receiving a command registration request signal and a command selection signal; a hand gesture recognizing block for storing the hand gesture image in connection with a specific command, and transforming the hand gesture image into the corresponding command by recognizing the hand gesture image from the image obtained in the image obtaining block; and a command performing block for performing an operation corresponding to a command transformed in the hand gesture recognizing block.
Another U.S. Pat. No. 7,680,295, entitled “Hand-gesture Based Interface Apparatus,” discloses a hand-gesture based interface apparatus. The interface corresponds to an individual person without being restricted to a particular place within a room, by performing gesture recognition while identifying the individual person. A stereo camera picks up an image of a user, and based on the image pickup output, an image processor transmits a color image within a visual field and a distance image to an information integrated recognition device. The information integrated recognition device identifies an individual by the face of the user, senses the position, and recognizes a significant gesture based on a hand sign of the user. The information integrated recognition device executes a command corresponding to the identified user and performs operations of all devices to be operated in the room (such as a TV set, an air conditioner, an electric fan, illumination, acoustic condition, and window opening/closing).
Another U.S. Pat. No. 6,215,890, entitled “Hand Gesture Recognizing Device,” discloses a hand gesture recognizing device. The hand gesture recognizing device can correctly recognize hand gestures at high speed without requiring users to be equipped with tools. A gesture of a user is stereoscopically filmed by a photographing device and then stored in an image storage device. A feature image extracting device transforms colors of the stereoscopic image data read from the image storage device in accordance with color transformation tables created by a color transformation table creating device, and disassembles and outputs the feature image of the user in corresponding channels. A spatial position calculating device calculates spatial positions of feature parts of the user by utilizing parallax of the feature image outputted from the feature image extracting device. A region dividing device defines the space around the user with spatial region codes. A hand gesture detecting device detects how the hands of the user move in relation to the spatial region codes. A category is detected first on the basis of the detected hand gesture, and then a sign language word in that category is specified.
Another U.S. Pat. No. 6,002,808, entitled “Hand Gesture Control System,” discloses a hand gesture control system. The system is provided for rapidly recognizing hand gestures for the control of computer graphics, in which image moment calculations are utilized to determine an overall equivalent rectangle corresponding to hand position, orientation and size, with size in one embodiment correlating to the width of the hand. In a further example, a hole generated through the utilization of the touching of the forefinger with the thumb provides a special trigger gesture recognized through the corresponding hole in the binary representation of the hand. In a further example, image moments of images of other objects are detected for controlling or directing onscreen images.
Another U.S. Pat. No. 5,594,469, entitled “Hand Gesture Machine Control System,” discloses a hand gesture machine control system. A system for controlling machines having displays from a distance includes hand gesture detection in which the hand gesture causes movement of an on-screen hand icon over an on-screen machine control icon, with the hand icon moving the machine control icon in accordance with sensed hand movements to effectuate machine control. In an example, TV control by hand signals includes detecting a single hand gesture and providing a hand icon on the screen along with icons representing TV controls such as volume, channel, color, density, etc., in which a television camera detects the hand in a noisy background through correlation techniques based on values of local image orientation. In order to trigger the system into operation, a trigger gesture such as the “how” sign is distinguished from the background through the utilization of orientation angle differences. From correlation values based on correlating local orientations between a mask defining a particular hand and the later acquired image of the hand, normalized correlation scores for each pixel are obtained, with the correlation peak being detected and then thresholded to eliminate false alarms.
However, there is a need to improve the conventional hand gesture image recognition methods and systems for accurately recognizing hand gestures. The above-mentioned patent and patent application publications are incorporated herein by reference for purposes including, but not limited to, indicating the background of the present invention and illustrating the situation of the art.
As is described in greater detail below, the present invention provides a 3D hand gesture image recognition method and system thereof. A light field capturing unit is operated to capture a hand gesture action to thereby obtain at least one 3D hand gesture image. The 3D hand gesture image is projected to a predetermined space to obtain at least one set of eigenvectors which are compared with a plurality of samples to classify and recognize a signal of the 3D hand gesture image in such a way as to improve the reliability of conventional hand gesture image recognition methods.
The primary objective of this invention is to provide a 3D hand gesture image recognition method and system thereof. A light field capturing unit is operated to capture a hand gesture action to thereby obtain at least one 3D hand gesture image. The 3D hand gesture image is projected to a predetermined space to obtain at least one set of eigenvectors which are compared with a plurality of samples to classify and recognize a signal of the 3D hand gesture image. Advantageously, the 3D hand gesture image recognition system and method of the present invention is successful in enhancing the reliability of hand gesture image recognition and increasing recognition rates.
The 3D hand gesture image recognition method in accordance with an aspect of the present invention includes:
operating a light field capturing unit to capture a hand gesture action to thereby obtain at least one 3D hand gesture image;
projecting the at least one 3D hand gesture image to a predetermined space to obtain at least one set of eigenvectors; and
comparing the eigenvectors with a plurality of samples to classify and recognize a signal of the 3D hand gesture image.
In a separate aspect of the present invention, the 3D hand gesture image includes 2D plane information and depth information.
In a further separate aspect of the present invention, the 3D hand gesture image is a 3D contour image, a 3D solid RGB image or a combination thereof.
In yet a further separate aspect of the present invention, the 3D solid RGB image is further projected to the predetermined space by a projection color space method, thereby obtaining R channel image information, G channel image information and B channel image information.
In yet a further separate aspect of the present invention, the 3D hand gesture image is projected to the predetermined space by principal component analysis.
In yet a further separate aspect of the present invention, the eigenvectors are compared with the plurality of samples by a k-nearest neighbor method to classify and recognize the signal of the 3D hand gesture image.
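By way of illustration only, the projection step described above can be sketched as follows; the array names, the flattening of the 2D plane and depth information into one vector, and the use of a precomputed eigenvector basis are assumptions for this sketch, not limitations of the invention.

```python
import numpy as np

def project_to_eigenspace(gesture_image, mean_image, eig_basis):
    """Illustrative sketch of the projection step: flatten a captured 3D
    hand gesture image (2D plane plus depth information), center it on the
    mean training image, and project it onto a PCA eigenvector basis to
    obtain a set of eigenvector coefficients."""
    x = np.asarray(gesture_image, dtype=float).ravel()
    m = np.asarray(mean_image, dtype=float).ravel()
    # Each column of eig_basis is one principal-component eigenvector.
    return eig_basis.T @ (x - m)
```

The resulting coefficient vector is the "set of eigenvectors" that is subsequently compared against the stored samples.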
The 3D hand gesture image recognition method in accordance with an aspect of the present invention includes:
operating a light field capturing unit to capture a series of hand gesture actions to thereby obtain a first 3D hand gesture image and a second 3D hand gesture image;
projecting the first 3D hand gesture image and the second 3D hand gesture image to a predetermined space to obtain a first set of first eigenvectors and a second set of second eigenvectors;
comparing the first eigenvectors and the second eigenvectors with a plurality of samples to classify and recognize a first signal of the first 3D hand gesture image and a second signal of the second 3D hand gesture image; and
identifying the second signal of the second 3D hand gesture image with the first signal of the first 3D hand gesture image.
In a separate aspect of the present invention, the first 3D hand gesture image and the second 3D hand gesture image include 2D plane information and depth information.
In a further separate aspect of the present invention, the first 3D hand gesture image and the second 3D hand gesture image are 3D contour images, 3D solid RGB images or a combination thereof.
In yet a further separate aspect of the present invention, the 3D solid RGB image is further projected to the predetermined space by a projection color space method, thereby obtaining R channel image information, G channel image information and B channel image information.
In yet a further separate aspect of the present invention, the first 3D hand gesture image and the second 3D hand gesture image are projected to the predetermined space by principal component analysis.
In yet a further separate aspect of the present invention, the first eigenvectors and the second eigenvectors are compared with the plurality of samples by a k-nearest neighbor method to classify and recognize the first signal of the first 3D hand gesture image and the second signal of the second 3D hand gesture image.
The 3D hand gesture image recognition system in accordance with an aspect of the present invention includes:
a first light field capturing unit provided to capture a hand gesture action to thereby obtain a first 3D hand gesture image;
a calculation unit connected with the first light field capturing unit and provided to project the first 3D hand gesture image to a predetermined space to obtain a first set of first eigenvectors, and further to compare the first eigenvectors with a plurality of samples to classify and recognize a first signal of the first 3D hand gesture image; and
an output unit connected with the calculation unit and provided to output the first signal of the first 3D hand gesture image to a predetermined hand-gesture control device.
In a separate aspect of the present invention, the first 3D hand gesture image includes 2D plane information and depth information.
In a further separate aspect of the present invention, the first 3D hand gesture image is a 3D contour image, a 3D solid RGB image or a combination thereof.
In yet a further separate aspect of the present invention, the 3D solid RGB image is further projected to the predetermined space by a projection color space method, thereby obtaining R channel image information, G channel image information and B channel image information.
In yet a further separate aspect of the present invention, the first 3D hand gesture image is projected to the predetermined space by principal component analysis.
In yet a further separate aspect of the present invention, the first eigenvectors are compared with the plurality of samples by a k-nearest neighbor method to classify and recognize the first signal of the first 3D hand gesture image.
In yet a further separate aspect of the present invention, a second light field capturing unit is provided to capture the hand gesture action to thereby obtain a second 3D hand gesture image which is further projected, classified and recognized to obtain a second signal of the second 3D hand gesture image.
In yet a further separate aspect of the present invention, the second signal of the second 3D hand gesture image is identified with the first signal of the first 3D hand gesture image.
Further scope of the applicability of the present invention will become apparent from the detailed description given hereinafter. However, it should be understood that the detailed description and specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various modifications and changes within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.
The present invention will become more fully understood from the detailed description given hereinbelow and the accompanying drawings which are given by way of illustration only, and thus are not limitative of the present invention, and wherein:
It is noted that the 3D hand gesture image recognition method and system thereof in accordance with the preferred embodiment of the present invention are applicable to various apparatuses, including computer systems, electric appliance control systems (e.g., IoT (Internet of Things) systems), automatic control systems, medical service systems or security systems, which are not limitative of the present invention.
With continued reference to
Referring again to
In order to retain a degree of data variance, the PCA method is applied to reduce the dimensions of high-dimensional data. However, the 1D PCA method must first convert each training image into a one-dimensional vector. By way of example, in calculating covariance matrices, an (m×n)-sized image generates an (m×n)×(m×n) matrix, which requires a great deal of time for calculating eigenvectors. Accordingly, the original covariance matrix can be reduced to the form

Ci = (1/L) Σ (Xtr − X̄)T(Xtr − X̄)
where Ci is a reduced covariance matrix, L is a number of training samples, Xtr is a training image and X̄ is a mean image of the training samples. The reduced covariance matrix is then decomposed by singular value decomposition (SVD) as

Ci = UiΣiViT
where Ui and Vi are orthogonal matrices and Σi is an eigenvalue matrix of SVD.
The eigenvalue matrix Σi is the same as the SVD eigenvalue matrix Σ decomposed from the original covariance matrix C. The eigenvector matrix U is then calculated from the centered training images (Xtr − X̄).
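A minimal numerical sketch of the reduced covariance calculation and its SVD, assuming the standard 2D PCA form in which each m×n training image contributes (Xtr − X̄)T(Xtr − X̄) to an n×n matrix:

```python
import numpy as np

def reduced_covariance(train_images):
    """Compute the reduced covariance matrix Ci of L training images of
    size m-by-n, then decompose it by SVD (Ci = Ui @ diag(Si) @ Vi^T).
    The result is an n-by-n matrix rather than an (m*n)-by-(m*n) one."""
    X = np.asarray(train_images, dtype=float)   # shape (L, m, n)
    X_bar = X.mean(axis=0)                      # mean training image
    centered = X - X_bar                        # Xtr - X_bar per sample
    # Average of (Xtr - X_bar)^T (Xtr - X_bar) over the L samples.
    C = np.einsum('lmi,lmj->ij', centered, centered) / len(X)
    U, S, Vt = np.linalg.svd(C)
    return C, U, S, Vt
```

Because the reduced matrix is only n×n, the eigenvector calculation is far cheaper than on the full (m×n)×(m×n) covariance matrix described above.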
Referring again to
where Sk is a similarity matrix, k is a preset number of nearest-neighbor training samples, N is a maximum number of eigenvectors, Fte is an eigenvector of the test samples and Ftr is an eigenvector of the training samples. According to the value of k, a set of the k most similar training samples is selected to judge which signal type of the predetermined training samples the test data most closely resembles, thereby predicting the hand gesture classification.
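The k-nearest neighbor comparison can be sketched as follows; the use of Euclidean distance between eigenvector sets and a majority vote over the k nearest samples are assumptions of this sketch, since the exact form of the similarity matrix Sk is not reproduced here:

```python
import numpy as np

def knn_classify(test_feat, train_feats, train_labels, k=3):
    """Sketch of the k-nearest neighbor comparison: rank training samples
    by distance between their eigenvector sets and the test sample's,
    select the k most similar, and vote on the signal type."""
    dists = np.linalg.norm(np.asarray(train_feats) - np.asarray(test_feat),
                           axis=1)
    nearest = np.argsort(dists)[:k]        # k most similar training samples
    votes = np.asarray(train_labels)[nearest]
    values, counts = np.unique(votes, return_counts=True)
    return values[np.argmax(counts)]       # predicted hand gesture signal
```

With k = 1 this reduces to assigning the test gesture the signal type of its single most similar training sample.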
In another embodiment, in order to retain complete depth information of the 3D solid RGB image, the 3D solid RGB image is further projected to the predetermined space by a projection color space (PCS) method, thereby obtaining R channel image information, G channel image information and B channel image information.
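A hedged sketch of separating the channel image information while retaining depth; the 4-channel (R, G, B, depth) input layout and the pairing of each color channel with the depth plane are assumptions for illustration, as the exact projection color space (PCS) method is not detailed in this excerpt:

```python
import numpy as np

def split_rgb_channels(rgbd_image):
    """Separate an H-by-W-by-4 RGB-D hand gesture image into R channel,
    G channel and B channel image information, each stacked with the
    shared depth plane so the depth information is retained per channel."""
    img = np.asarray(rgbd_image, dtype=float)
    r, g, b, depth = (img[..., i] for i in range(4))
    return (np.stack([r, depth], axis=-1),
            np.stack([g, depth], axis=-1),
            np.stack([b, depth], axis=-1))
```

Each of the three outputs could then be projected to the predetermined space separately, so that no channel's depth information is discarded.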
Referring back to
Although the invention has been described in detail with reference to its presently preferred embodiment, it will be understood by one of ordinary skill in the art that various modifications can be made without departing from the spirit and the scope of the invention, as set forth in the appended claims.