The present invention relates generally to interface systems, and specifically to a hand-gesture recognition method.
As the range of activities accomplished with a computer increases, new and innovative ways to provide an interface with a computer are often developed to complement the changes in computer functionality and packaging. For example, touch sensitive screens can allow a user to provide inputs to a computer without a mouse and/or a keyboard, such that desk area is not needed to operate the computer. Examples of touch sensitive screens include pressure sensitive membranes, beam break techniques with circumferential light sources and sensors, and acoustic ranging techniques. However, these types of computer interfaces can only provide information to the computer regarding the touch event, itself, and thus can be limited in application. In addition, such types of interfaces can be limited in the number of touch events that can be handled over a given amount of time, and can be prone to interpret unintended contacts, such as from a shirt cuff or palm, as touch events. Furthermore, touch sensitive screens can be prohibitively expensive and impractical for very large display sizes, such as those used for presentations.
One embodiment of the invention includes a method of providing device inputs. The method includes illuminating hand gestures performed via a bare hand of a user in a foreground of a background surface with at least one infrared (IR) light source. The method also includes generating a first plurality of silhouette images associated with the bare hand based on an IR light contrast between the bare hand and the background surface and generating a second plurality of silhouette images associated with the bare hand based on an IR light contrast between the bare hand and the background surface. The method also includes determining a plurality of three-dimensional features of the bare hand relative to the background surface based on a parallax separation of the bare hand in the first plurality of silhouette images relative to the second plurality of silhouette images. The method also includes determining a provided input gesture based on the plurality of three-dimensional features of the bare hand and comparing the provided input gesture with a plurality of predefined gesture inputs in a gesture library. The method further includes providing at least one device input corresponding to interaction with displayed visual content based on the provided input gesture corresponding to one of the plurality of predefined gesture inputs.
The present invention relates generally to interface systems, and specifically to a hand-gesture recognition method. A user employs one or more digits of his or her bare hand to provide simulated inputs to a computer or other electronic device. The bare hand can be a user's hand that is unadorned by a special glove or sensors that allow gesture recognition. It is to be understood that as described herein, a digit can include a finger or a thumb. It is also to be understood that the simulated inputs can be provided by simple gestures and/or compound gestures using one or two bare hands. For example, the user could provide gestures that include predefined motion using the bare hand in a gesture recognition environment, such as defined by a foreground of a display screen that displays visual content. The simulated inputs could be, for example, any of a variety of inputs that provide interaction with portions of the visual content.
Any of a variety of gesture recognition interface systems can be implemented to recognize the compound gestures. As an example, one or more infrared (IR) light sources can illuminate a gesture recognition environment that is defined by the area of physical space in a foreground of a vertical or horizontal display surface. The IR light sources can illuminate the gesture recognition environment from above, or can be located beneath an IR-diffusive screen to illuminate a gesture recognition environment that is above the screen from beneath. A set of stereo cameras can each generate a plurality of images of the bare hand. The plurality of images can be, for example, based on a reflected light contrast of the IR light reflected back from the bare hand relative to substantially non-reflected light, based on a more highly reflected light from a retroreflective background surface, or from the IR light being illuminated from beneath an IR diffusive screen. The plurality of images of the bare hand from each camera could be, for example, a plurality of matched sets of images of the bare hand, such that each image in the matched set of images corresponds to the bare hand from a different perspective at substantially the same time. A given matched set of images can be employed to determine a location of the bare hand and the plurality of matched sets of images can be employed to determine physical motion of the bare hand.
A controller can be configured to receive the plurality of images to determine three-dimensional location and feature information associated with the bare hand relative to the background surface. For example, the controller could apply an algorithm to determine features of the bare hand, such as extension, bending, and orientation of fingers in three-dimensional space. The controller could then translate the simulated inputs into device inputs based on the three-dimensional location and feature information. For example, the controller could interpret gesture inputs based on motion associated with the bare hand and translate the gesture inputs into inputs to a computer or other device. The controller could also compare the motion associated with the one or more endpoints of the fingers and/or thumb of the bare hand with a plurality of predefined gestures stored in a gesture library, such that a match with a given predefined gesture could correspond to a particular device input.
A user's hand 24 can provide simulated inputs in the foreground of the vertical display surface 20. The first camera 12 and the second camera 14 each receive separate images of the user's hand 24, where each of the separate images received, respectively, by the first camera 12 and the second camera 14 are a matched set (i.e., matched pair). As an example, each of the first camera 12 and the second camera 14 could rapidly take still photograph images at, for example, sixty times per second, such that each still photograph image taken by the first camera 12 is matched to a still photograph image taken by the second camera 14 at substantially the same time. The user's hand 24 can appear to be in a different location in each image of the matched set captured by each of the first camera 12 and the second camera 14, respectively, due to parallax caused by the different mounted locations of each of the first camera 12 and the second camera 14. In the example of
In the example of
The first camera 12 and the second camera 14 can each provide their respective separate images of the user's hand 24 to a controller 26. The controller 26 could reside, for example, within a computer (not shown) for which the gesture recognition interface system 10 is designed to provide a gesture recognition interface. It is to be understood, however, that the hosting of a controller is not limited to a standalone computer, but could be included in embedded processors. The controller 26 can process the respective images associated with the user's hand 24 to generate three-dimensional feature data associated with the user's hand 24.
For example, the first camera 12 and the second camera 14 could each be mounted at predetermined angles relative to the floor 28 beneath the vertical display surface 20. For a given matched pair of images of the user's hand 24, if the predetermined angles of each of the cameras 12 and 14 are equal, then each point of the user's hand 24 in two-dimensional space in a given image from the camera 12 is equidistant from a corresponding point of the user's hand 24 in the respective matched image from the camera 14. As such, the controller 26 could determine the three-dimensional physical location of the user's hand 24 based on a relative parallax separation of the matched set of images of the user's hand 24 at a given time. In addition, using a computer algorithm, the controller 26 could also determine the three-dimensional physical location of features associated with portions of the user's hand 24, such as fingers and fingertips. As an example, the controller 26 can be configured to determine and interpret the gestures that are provided in the gesture recognition environment in any of a variety of ways, such as those described in either of U.S. patent applications entitled “Gesture Recognition Interface System”, Ser. No. 11/485,788, filed Jul. 13, 2006, and “Gesture Recognition Interface System with Vertical Display”, Ser. No. 12/133,836, filed Jun. 5, 2008, each assigned to the same assignee as the Present Application and incorporated herein by reference in its entirety.
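As an illustrative, non-limiting sketch of how a parallax separation can yield a three-dimensional location, the following Python example assumes a simplified rectified pinhole stereo model, which is not necessarily the geometry employed by the controller 26 or by the referenced applications. The function names and the parameters `focal_length_px`, `baseline_m`, `cx`, and `cy` are hypothetical and are introduced only for this example.

```python
def depth_from_parallax(x_first, x_second, focal_length_px, baseline_m):
    """Distance of a matched point from the camera baseline.

    Assumes rectified cameras, so the parallax separation is purely
    horizontal: x_first and x_second are the pixel columns of the same
    hand feature in the matched images from the two cameras.
    """
    separation = x_first - x_second
    if separation <= 0:
        raise ValueError("matched points must have a positive parallax separation")
    return focal_length_px * baseline_m / separation


def locate_point(x_first, y_first, x_second, focal_length_px, baseline_m, cx, cy):
    """Three-dimensional (x, y, z) location of one matched point.

    (cx, cy) is the assumed principal point of the first camera in pixels.
    """
    z = depth_from_parallax(x_first, x_second, focal_length_px, baseline_m)
    x = (x_first - cx) * z / focal_length_px
    y = (y_first - cy) * z / focal_length_px
    return (x, y, z)


# Hypothetical fingertip seen at column 420 by one camera and 380 by the other.
fingertip = locate_point(420, 260, 380, focal_length_px=800,
                         baseline_m=0.5, cx=320, cy=240)
```

In this simplified model a larger parallax separation corresponds to a feature that is closer to the cameras, which is why the separation of matched points can serve as a depth cue.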
The gesture recognition interface system 10 can also include a projector 30. The projector 30 can provide visual content with which the user can interact and provide inputs. In the example of
As an example, the controller 26 can determine gesture inputs that are performed by a user using the user's hand 24 and can translate the gesture inputs into any of a variety of device inputs. For example, the controller 26 could interpret pointing at the vertical display surface 20 by the user's hand 24, such as with an extended index finger or a flat, open hand, to establish a reference on the visual content that is displayed on the vertical display surface 20. As another example, selected data can be manipulated in response to the gesture inputs, such as selecting objects, moving objects, scrolling visual data, turning pages in a document, and any of a variety of other device inputs. Additional gesture inputs can be sequentially implemented using the user's hand 24, or sequential device inputs can be implemented using the user's other hand (not shown). Accordingly, a user of the gesture recognition interface system 10 could navigate through a number of computer menus, graphical user interface (GUI) icons, and/or execute programs associated with a computer merely by moving his or her bare hand through the air in the gesture recognition environment 22 and initiating gesture inputs without touching a mouse, keyboard, or any other computer input device.
In the example of
It is to be understood that the gesture recognition interface system 10 is not intended to be limited to the example of
The gesture recognition interface controller 50 receives a plurality of images from each of the first camera 12 and the second camera 14. The first and second cameras 12 and 14 each receive a plurality of images of the user's hand 24. The respective images of the user's hand 24 could be silhouette images that are based on a brightness contrast between the user's hand 24 and a background surface. For example, the brightness contrast can be based on the user's hand 24 being substantially bright based on reflecting IR illumination in the foreground of a substantially dark background surface, such as the floor 28. However, as described in greater detail below with regard to the examples of
The first and second cameras 12 and 14 each input their respective images of a matched pair of images into a digitizer 52. The digitizer 52 produces digitized versions of the images of the user's hand 24. The digitized images of the user's hand 24 are input to an image comparator 54. The image comparator 54 compares each of the digitized images of the user's hand 24 to a previously stored digitized image of the user's hand 24 to generate a binarized image of the user's hand 24. As an example, although the vertical display surface 20 emits the IR illumination in a substantially more uniform manner, there may still be illumination intensity gradients across the vertical display surface 20. Therefore, the comparison by the image comparator 54 allows for an improved quality of the digitized images despite variations in illumination intensity across the vertical display surface 20. As an example, the previously stored digitized image could have been captured during a calibration operation and/or from repeatedly storing the digitized image in memory buffers.
As an example, a background model can be maintained for each of the cameras 12 and 14 without the user's hand 24 being present. The background model images can be used to decide at each pixel whether the images of the user's hand 24 correspond to a binary 1 or 0. For example, at each pixel location, if the image of the user's hand 24 has a value that is approximately less than the corresponding background model image times a threshold scaling value of between 0 and 1, the output value will be a binary 1, thus denoting the presence of the user's hand 24. In this manner, the scaling value can be selected to provide an optimal balance between desirably detecting the user's hand 24 while being substantially insensitive to residual variations in illumination intensity from the top surface of the vertical display surface 20. As a result, edges associated with the user's hand 24 can be detected in each of the images.
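The per-pixel decision against a background model can be sketched as follows. This is a minimal example under the assumption that the hand appears darker than an illuminated background, so that a pixel is marked as a binary 1 when it falls below the scaled background value; the function name `binarize_silhouette` and the default scaling value of 0.6 are hypothetical.

```python
def binarize_silhouette(image, background_model, scale=0.6):
    """Return a binary image: 1 where the hand is assumed present, else 0.

    image and background_model are equally sized lists of rows of pixel
    intensities. Assumption: the hand is darker than the background, so a
    pixel is marked 1 when it falls below the scaled background value.
    """
    if not 0.0 < scale < 1.0:
        raise ValueError("scale must lie strictly between 0 and 1")
    return [
        [1 if pixel < model * scale else 0
         for pixel, model in zip(image_row, model_row)]
        for image_row, model_row in zip(image, background_model)
    ]


# The background model carries a mild illumination gradient (200 down to 180).
background = [[200, 200, 180],
              [200, 190, 180]]
frame = [[198, 60, 175],
         [205, 50, 40]]
mask = binarize_silhouette(frame, background)
```

Because the threshold at each pixel is scaled from that pixel's own background value, a gradual intensity gradient across the surface does not by itself produce false hand pixels.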
The contrast enhanced binarized silhouette images of the user's hand 24 are then each input to an object detection algorithm device 56. The object detection algorithm device 56 can be an integrated circuit (IC) or set of ICs within the gesture recognition interface controller 50, or could be a software routine residing in the gesture recognition interface controller 50. The object detection algorithm device 56 can include any of a variety of detection algorithms for determining a two-dimensional location of the user's hand 24 relative to the vertical display surface 20. As an example, the object detection algorithm device 56 can include a two-dimensional convolution filter, such as a Laplacian of Gaussian convolution filter, that applies a mathematical algorithm to each of the digitized images of the user's hand 24 to determine the location of one or more features of the user's hand 24, such as fingertips, in two-dimensional space, as described in U.S. patent application Ser. No. 11/485,788. As another example, the object detection algorithm device 56 can generate a sequence of mid-points that are symmetrically disposed between edges along elongated portions of the user's hand 24 to form a skeletal representation of the user's hand 24, as described in U.S. patent application Ser. No. 12/133,836. Accordingly, features associated with the user's hand 24 that can be determinative of input gestures can be detected in each of the silhouette images of the user's hand 24.
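As a simplified illustration of mid-points disposed between edges, the following Python sketch scans each row of a binarized silhouette and records the point midway between the left and right edge of every run of hand pixels. It is not the algorithm of the referenced applications; restricting the scan to horizontal rows and the function name `row_midpoints` are assumptions made only for this example.

```python
def row_midpoints(binary_image):
    """Return (x, y) mid-points between the edges of each run of 1s.

    binary_image is a list of rows of 0/1 values. Only horizontal rows are
    scanned, which is a simplification: it suits elongated portions of the
    hand that run roughly vertically in the image.
    """
    points = []
    for y, row in enumerate(binary_image):
        start = None
        # A trailing 0 closes any run that reaches the right border.
        for x, value in enumerate(list(row) + [0]):
            if value and start is None:
                start = x                      # left edge of a run
            elif not value and start is not None:
                right_edge = x - 1             # last pixel of the run
                points.append(((start + right_edge) / 2, y))
                start = None
    return points


silhouette = [[0, 1, 1, 1, 0],
              [0, 0, 1, 0, 0],
              [1, 1, 0, 1, 1]]
skeleton = row_midpoints(silhouette)
```

A row that crosses two separated fingers yields two mid-points, so the resulting sequence of points can branch where the silhouette branches.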
The output data of the object detection algorithm device 56, such as a convolution filtered image or a skeletal representation image, is provided to a calibration data and location resolver 58. The calibration data and location resolver 58 determines a three-dimensional location of the features associated with the user's hand 24 at a given time. As an example, the calibration data and location resolver 58 can be configured to determine a parallax separation between end-points that are determined, for example, by a peak detector from convolution filtered images provided from each of the cameras 12 and 14. As another example, the calibration data and location resolver 58 can be configured to compare the relative two-dimensional locations of the corresponding points in each of the skeletal representation images of the user's hand 24 and to interpolate a three-dimensional location of the points based on a parallax separation of the corresponding points. Accordingly, the calibration data and location resolver 58 can determine the three-dimensional location of the one or more features associated with the user's hand 24 relative to the vertical display surface 20 for each matched pair of images provided by the cameras 12 and 14.
The data output from the calibration data and location resolver 58 is input to a gesture recognition device 60. The gesture recognition device 60 interprets the three-dimensional feature data associated with the user's hand 24 and translates changes in the location data into an input gesture. In the example of
The gesture recognition interface controller 50 also includes a gesture library 64 coupled to the gesture recognition device 60. The gesture library 64 could include a plurality of predefined gestures, with each of the predefined gestures corresponding to a particular device input. For example, the gesture library 64 could include a database of specific arrangements and combinations of fingertip positions and sequential motions that each correspond to a different computer input. Examples of the predefined input gestures are described in greater detail below in the examples of
The gesture recognition device 60, upon analyzing the sequential sets of the three-dimensional feature data in the image buffer 62, could poll the gesture library 64 to determine if the gesture input matches a predefined gesture. Upon determining a match, the gesture recognition device 60 could translate the gesture input into the device input that corresponds to the predefined gesture. The gesture library 64 could be pre-programmed with the appropriate predefined gesture inputs, or it could be dynamically programmable, such that new gestures can be added, along with the corresponding device inputs. For example, a user could activate a “begin gesture sample” operation, perform the new gesture, capture the appropriate images of the new gesture using the first camera 12 and the second camera 14, and input the appropriate device input to which the new gesture corresponds.
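A dynamically programmable gesture library of this kind can be sketched, purely by way of example, as a lookup table keyed on a gesture description. The class name `GestureLibrary`, the representation of a gesture as a set of extended digits plus a motion label, and the device-input strings are all hypothetical simplifications and do not reflect how the gesture library 64 necessarily stores gestures.

```python
class GestureLibrary:
    """Maps a simplified gesture description to a device input."""

    def __init__(self):
        self._entries = {}

    def add(self, extended_digits, motion, device_input):
        """Register a predefined gesture, or add a new one at run time."""
        self._entries[(frozenset(extended_digits), motion)] = device_input

    def poll(self, extended_digits, motion):
        """Return the matching device input, or None when nothing matches."""
        return self._entries.get((frozenset(extended_digits), motion))


library = GestureLibrary()
# Pre-programmed entry (labels are illustrative only).
library.add({"index"}, "contact", "left_click")
# Entry added later, as after a "begin gesture sample" operation.
library.add({"index", "thumb"}, "angle_change", "zoom")

match = library.poll({"index"}, "contact")
no_match = library.poll({"little"}, "contact")
```

Returning `None` for an unmatched gesture lets the caller ignore motion that does not correspond to any predefined gesture rather than forcing a nearest match.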
It is to be understood that a given gesture recognition interface system is not intended to be limited by the example of
The gesture recognition interface system 100 includes a first camera 102 and a second camera 104. Coupled to each of the first camera 102 and the second camera 104, respectively, is a first IR light source 106 and a second IR light source 108. The first camera 102 and the second camera 104 may each include an IR filter, such that the respective camera may pass IR light and substantially filter other light spectrums. The first IR light source 106 and the second IR light source 108 each illuminate a horizontal background surface 110 which can be retroreflective. As such, IR light from the first IR light source 106 can be reflected substantially directly back to the first camera 102 and IR light from the second IR light source 108 can be reflected substantially directly back to the second camera 104. Accordingly, an object that is placed above the background surface 110 may reflect a significantly lesser amount of IR light back to each of the first camera 102 and the second camera 104, respectively. Therefore, such an object can appear to each of the first camera 102 and the second camera 104 as a silhouette image, such that it can appear as a substantially darker object in the foreground of a highly illuminated background surface 110. It is to be understood that the background surface 110 may not be completely retroreflective, but may include a Lambertian factor to facilitate viewing by users at various angles relative to the background surface 110.
A user's bare hand 112 can provide simulated inputs over the background surface 110. In the example of
The first camera 102 and the second camera 104 can each provide their respective separate silhouette images of the user's hand 112 to a controller (not shown), similar to as described above in the example of
The gesture recognition interface system 100 can also include a projector 114 configured to project image data. The projector 114 can provide an output interface, such as, for example, computer monitor data, for which the user can interact and provide inputs using the user's hand 112. In the example of
It is to be understood that the gesture recognition interface system 100 is not intended to be limited to the example of
The gesture recognition simulation system 150 includes a three-dimensional display system 158, demonstrated in the example of
A user's hand 164 can be used to provide hand-gesture inputs over the retroreflective surface 156. To provide the interaction between the user's hand 164 and the given functional component 162, an associated controller (not shown) can detect a three-dimensional physical location of one or more features of the user's hand 164, such as described above in the example of
As an example, a user of the gesture recognition simulation system 150 could perform a reference sub-gesture with the user's hand 164 to refer to one of the functional components 162, demonstrated in the example of
The gesture recognition simulation system 150 is demonstrated as yet another example of the use of hand-gesture inputs in providing device inputs to a computer. It is to be understood that the gesture recognition simulation system 150 is not intended to be limited to the example of
The examples of
The hand gestures 200 also include gestures 204 and 206 in which the user's hand 24 is demonstrated as a flat, open hand. Specifically, the gesture 204 demonstrates the flat, open hand with all fingers extended and joined together and the thumb retracted alongside the hand 24, and the gesture 206 demonstrates the flat, open hand with all fingers and the thumb extended and separated. As an example, each of the gestures 204 and 206 can also be implemented to point to portions of displayed visual content or portions of the background surface corresponding to the displayed visual content. Similar to the gesture 202, the controller 26 can be configured to provide a visual reference on the corresponding portion of the displayed visual content in response to the gestures 204 and 206. As an example, the controller 26 could provide less specific reference on the visual content, such as to switch a focus between two open windows on the displayed visual content, or to display a translucent “spot-light” image on the displayed visual content, in response to either of the gestures 204 and 206.
The hand gestures 200 also include gestures 208, 210, 212, and 214 that involve contact of the user's hand 24 with the background surface or display surface. Each of the gestures 208, 210, 212, and 214 demonstrates contact with the background surface or display surface with either a single digit while other, non-contacting digits are extended or with multiple digits. Specifically, the gesture 208 demonstrates contact with the extended index finger with the non-contacting thumb also extended and the gesture 210 demonstrates contact with the index finger with all other non-contacting fingers and the thumb also extended. In addition, the gesture 212 demonstrates contact with the extended thumb with all non-contacting fingers also extended and the gesture 214 demonstrates contact with the extended index and middle fingers with the rest of the fingers and thumb retracted. As an example, the controller 26 can provide a left mouse-click input in response to contact of a single extended finger with the background surface or display surface, such as resulting from contacting the visual content while maintaining the gesture 202. However, the controller 26 can provide other interactive device inputs or other mouse-click inputs in response to one or more of the gestures 208, 210, 212, and 214. For example, the gesture 208 can correspond to a right mouse-click input or selective erasing of visual data on a virtual dry-erase board. As another example, the gestures 210, 212, and/or 214 can each have other associated device inputs corresponding to the portion of the visual content that is contacted by the user's hand 24, such as a middle mouse-button click or assigning other distinct actions or meanings.
The hand gestures 200 also include a gesture 216 in which the user's hand 24 is demonstrated as a flat, open hand. In the example of
The hand gestures 200 further include a gesture 218 in which the user's hand 24 is demonstrated as having an extended index finger and an extended thumb. As an example, similar to the gesture 202, the extended index finger in the gesture 218 could point to a portion of displayed visual content or the background surface. The gesture 218 can also include the user moving the thumb in a further extending or further retracting position, such that the user changes the angle θ between the extended index finger and thumb. The changing of the angle θ could correspond to a zoom command, such that the controller 26 is configured to change a zoom associated with the selected portion of the visual content. For example, the user could increase the angle θ to zoom-out to make the selected portion of the visual content smaller and could decrease the angle θ to zoom-in to make the selected portion of the visual content larger, or vice-versa.
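As a non-limiting sketch of how a change in the angle θ could be turned into a zoom command, the following Python example computes the angle between two three-dimensional direction vectors and maps the change in that angle to a multiplicative zoom factor. The exponential mapping, the `gain_per_degree` parameter, and the `widen_zooms_out` flag (which selects between the two directions described above) are assumptions made for illustration.

```python
import math


def angle_between(index_dir, thumb_dir):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(a * b for a, b in zip(index_dir, thumb_dir))
    norm = (math.sqrt(sum(a * a for a in index_dir))
            * math.sqrt(sum(b * b for b in thumb_dir)))
    if norm == 0:
        raise ValueError("direction vectors must be non-zero")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


def zoom_factor(theta_start, theta_now, gain_per_degree=0.01, widen_zooms_out=True):
    """Multiplicative zoom for a change in the index-thumb angle.

    A factor below 1 makes the selected content smaller (zoom-out) and a
    factor above 1 makes it larger (zoom-in).
    """
    delta = theta_now - theta_start
    sign = -1.0 if widen_zooms_out else 1.0
    return math.exp(sign * gain_per_degree * delta)


theta = angle_between((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
factor = zoom_factor(30.0, 60.0)
```

An exponential mapping is chosen here so that widening and then narrowing the angle by the same amount returns the content to its original size.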
The hand gestures 250 also include gestures 258 and 260 in which the user's hand 24 begins positioned in the flat, open hand with all fingers extended and joined together and the thumb retracted alongside the hand 24. While maintaining the user's hand 24 in the flat, open hand position, the user then rotates the wrist approximately 180 degrees. The gesture 258 could also include a slight lateral motion of the user's hand 24 during the rotation, such as to simulate a stationary hinge on one side of the user's hand 24. In response to the gesture 258, the controller 26 can be configured to page-up or page-down in a selected document in the visual content, such that the gesture 258 intuitively simulates the turning of a page in a book. Therefore, the orientation of the palm of the user's hand 24, as well as the direction of rotation, can be indicative of whether the user is paging forward or paging backward. The gesture 260 is demonstrated as substantially the same as the gesture 258, except that the gesture 260 also includes a forward motion of the user's hand 24, such that the gesture 260 can be an alternative to the gesture 258 or can result in a similar function. As an example, the gesture 260 can correspond to the paging forward or backward by multiple pages instead of one page in the document.
In addition, the gestures 258 and 260 can also include subsequent gestures that can provide different device inputs via the controller 26. For example, upon implementing either of the gestures 258 or 260, the user's hand 24 could be maintained in the flat, open hand position and moved upward or downward to correspond to a slow scrolling of the document in the visual content. As another example, while substantially maintaining the flat, open hand position of the user's hand 24, the user can implement another gesture with the user's hand 24 or with the opposing hand 25 (not shown in the gestures 258 and 260), to page-up or page-down in the document again. For example, upon implementing the gesture 258 or 260, the user could flick the index finger, extend and retract the thumb, or perform a gesture with his or her other hand to repeat the page-up or page-down device input of the gesture 258 or the gesture 260. As yet a further example, the user could extend two or three fingers of the opposing hand 25 for a duration of time while substantially maintaining the flat, open hand position of the user's hand 24. As such, the controller 26 can be configured to initiate auto-turning of pages of the document while substantially maintaining the flat, open hand position of the user's hand 24, such that an acceleration and deceleration of the auto-turning can be controlled proportional to the duration of extension of the two or three fingers of the opposing hand 25.
The hand gestures 250 also include a gesture 262 that includes both of the user's hands 24 and 25. In the gesture 262, the user's hands 24 and 25 are each positioned with the index finger extended and pointing to the background surface and/or visual content. The index fingers could be in contact with the background surface and/or visual content or could be in free space in the foreground of the background surface and/or visual content. The ends of the index fingers are demonstrated at a distance “X” apart from each other at the beginning of the gesture 262. The user could then move the hands 24 and 25 laterally relative to each other to change the distance from “X” to “Y”. In the example of
As an example, in the gesture 302, the user's hand 24 can be moved in the X, Y, and/or Z direction while maintaining the closed-fist. In response, the controller 26 can be configured to move the corresponding portion of the visual content in the corresponding X, Y, and/or Z direction. As another example, in the gestures 304 and 306, the user's hand 24 can be rotated based on a rotation at the wrist while maintaining the thumb and finger contact of the gesture 304 or the claw-like grip of the gesture 306. In response, the controller 26 can be configured to rotate the corresponding portion of the visual content in the direction of rotation of the user's hand 24. In addition, in the gestures 302, 304, and 306, the user's hand 24 could be both moved and rotated, such that the controller 26 could likewise both move and rotate the corresponding portion of the visual content.
The hand gestures 300 also include a gesture 308 in which the user's hand 24 is positioned with both an extended index finger and an extended thumb and a gesture 310 in which the user's hand 24 is positioned with an extended index finger, an extended middle finger, and an extended thumb. As an example, the gestures 308 and 310 could be implemented to point the extended index finger in the gesture 308 or both extended fingers in the gesture 310 to a portion of background surface and/or visual content. The user can then retract the thumb to initiate a tilt and/or pan command of the corresponding portion of the visual content to which the extended finger or fingers point. Therefore, the portion of the visual content can be tilted and/or panned based on rotation of the wrist and/or orientation of the index finger as long as the thumb remains retracted. Accordingly, the gestures 308 and 310 can be compound gestures, such as to include a reference sub-gesture (i.e., the pointing of the finger(s) to refer to the portion of the visual content) and an execution sub-gesture (i.e., the retraction of the thumb to initiate the pan and/or tilt).
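The sequencing of a reference sub-gesture followed by an execution sub-gesture can be sketched as a small state machine. In the following Python example, which is only an illustration, each frame is reduced to two hypothetical boolean features (whether the index finger is extended and whether the thumb is extended), and the tilt and/or pan command is treated as active only while the thumb is retracted after a reference has been established.

```python
def tilt_pan_active(frames):
    """Return, per frame, whether the tilt and/or pan command is active.

    frames is a sequence of (index_extended, thumb_extended) booleans, one
    per matched set of images. Pointing with the thumb extended establishes
    the reference; retracting the thumb afterwards executes the command,
    which stays active as long as the thumb remains retracted.
    """
    referenced = False
    active = []
    for index_extended, thumb_extended in frames:
        if not index_extended:
            referenced = False          # pointing lost: reference is dropped
            active.append(False)
        elif thumb_extended:
            referenced = True           # reference sub-gesture
            active.append(False)
        else:
            active.append(referenced)   # execution sub-gesture
    return active


states = tilt_pan_active([
    (True, True),    # pointing, thumb extended: reference established
    (True, False),   # thumb retracted: command active
    (True, False),   # still retracted: command continues
    (True, True),    # thumb re-extended: command ends
    (False, False),  # no pointing
    (True, False),   # thumb retracted without a prior reference
])
```

Requiring the reference before the execution keeps a hand that merely enters the environment with a retracted thumb from starting a command unintentionally.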
The hand gestures 300 also include a gesture 312 in which the user's hand 24 is positioned with an extended index finger, an extended little finger, and an extended thumb. As an example, the extended fingers can refer to a portion of the visual content, such as a menu or document. The user can then retract the thumb to initiate a speed scrolling command of the corresponding portion of the visual content, similar to depressing the scroll wheel of a mouse. The speed scrolling can be initiated in a specific direction (e.g., forward) without additional motion of the user's hand 24, or can be direction specific based on motion of the hand 24 upward or downward. As an example, the speed scrolling can continue until the user extends the thumb again. Therefore, as described above with respect to the gesture 310, the gesture 312 can be a compound gesture.
The hand gestures 350 include a gesture 354 in which the user's hand 24 is positioned such that an extended index finger can point to a portion of the visual content as a reference sub-gesture. The gesture 354 also demonstrates that the thumb is likewise extended, although it could also be retracted. As an example, the portion of the visual content can be an Internet browser window. The gesture 354 also includes an execution sub-gesture in which the index finger of the user's hand 24 is “triggered”, such that it is rapidly bent at a distal knuckle and re-extended. In response to the triggering of the index finger, the controller 26 can be configured to return the Internet browser to an immediately preceding webpage, similar to clicking on a “Back” button of a given Internet browser. The gesture 354 can also include other components, such as a triggering (e.g., rapid retraction and re-extension) of the thumb being equivalent to a “Forward” button, or any of a variety of other gesture options.
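A rapid bend and re-extension of a digit can be detected, as one hedged example, by timing the interval between the frame in which the digit stops being extended and the frame in which it is extended again. The function name `detect_trigger`, the sample format, and the 0.4 second limit in the following Python sketch are hypothetical values chosen only for illustration.

```python
def detect_trigger(samples, max_duration_s=0.4):
    """Return True if the digit was bent and re-extended quickly enough.

    samples is a time-ordered sequence of (timestamp_s, is_extended) pairs
    for one digit, for example one pair per matched set of images.
    """
    bent_at = None
    previous = None
    for timestamp, extended in samples:
        if previous is True and not extended:
            bent_at = timestamp                      # digit starts to bend
        elif previous is False and extended and bent_at is not None:
            if timestamp - bent_at <= max_duration_s:
                return True                          # rapid re-extension
            bent_at = None                           # too slow: not a trigger
        previous = extended
    return False


quick = detect_trigger([(0.0, True), (0.1, False), (0.2, True)])
slow = detect_trigger([(0.0, True), (0.1, False), (1.0, True)])
```

A digit that is already bent when observation begins is not counted, since the start of the bend was never observed and its duration cannot be judged.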
The hand gestures 350 include gestures 356 and 358. The gesture 356 is demonstrated with the user's hand 24 while the gesture 358 is demonstrated as incorporating both of the user's hands 24 and 25. The gestures 356 and 358 each include a reference sub-gesture, demonstrated with the gesture 356 as a pointing of the index finger toward a portion of the visual content and demonstrated with the gesture 358 as the index finger of each of the user's hands 24 and 25 longitudinally joined together and pointing to a portion of the visual content. As an example, the user's hands 24 and 25 can also be held together, such as palm-to-palm, and could include interlocking of the remaining fingers.
The gestures 356 and 358 also each include an execution sub-gesture. The execution sub-gesture of the gesture 356 is demonstrated as a sudden drop of the user's hand 24 in the downward direction, followed by a slight upward motion, all while maintaining the extension of the index finger. The execution sub-gesture of the gesture 358 is demonstrated as a rapid jabbing motion with both of the user's hands 24 and 25 while maintaining the joining and pointing of the index fingers, such that the hands 24 and 25 are rapidly moved forward then backward. As an example, the execution sub-gestures of each of the gestures 356 and 358 can correspond to selection of the portion of the visual content for further interaction. As another example, the execution sub-gesture of the gesture 356 can correspond to movement of an entity in a computer assisted drawing (CAD) application, such as in the direction of the reference sub-gesture, or can correspond to sending a selected video feed to another display to which the index finger points.
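The drop-then-rise execution sub-gesture of the gesture 356 could be checked against a short buffer of fingertip heights. The sketch below is hedged: it assumes heights are measured above the background surface in meters, it assumes the buffer is short enough that any drop inside it counts as "sudden", and the function name and both thresholds are hypothetical.

```python
from typing import Sequence


def detect_drop_select(
    heights: Sequence[float],
    drop_min: float = 0.03,   # assumed minimum drop, in meters
    rise_min: float = 0.005,  # assumed minimum rebound, in meters
) -> bool:
    """Return True if the buffered heights show a drop followed by a small rise.

    heights: fingertip heights above the background surface, oldest first.
    """
    if len(heights) < 3:
        return False
    # Index of the lowest point in the buffer.
    low_idx = min(range(len(heights)), key=heights.__getitem__)
    lowest = heights[low_idx]
    dropped = max(heights[: low_idx + 1]) - lowest >= drop_min
    rebounded = max(heights[low_idx:]) - lowest >= rise_min
    return dropped and rebounded
```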
The hand gestures 200, 250, 300, and 350 are not intended to be limited to the examples of
In view of the foregoing structural and functional features described above, a methodology in accordance with various aspects of the present invention will be better appreciated with reference to
At 408, a plurality of three-dimensional features of the bare hand are determined relative to the background surface based on a parallax separation of the bare hand in the first plurality of silhouette images relative to the second plurality of silhouette images. The first and second pluralities of images can form matched sets of images that demonstrate the bare hand having a relative parallax separation that is determinative of the three-dimensional features. The three-dimensional features can include location, pitch and/or orientation of the hand, fingers, and fingertips. At 410, a provided input gesture is determined based on the plurality of three-dimensional features of the bare hand. The matched sets of images can be buffered, such that sequential images can be determinative of motion of the hand.
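As a worked illustration of how parallax separation can yield a three-dimensional feature, the sketch below uses the standard rectified-stereo relation Z = f·B/d. The description does not state the camera geometry, so the pinhole model, the rectified image pair, and all function and parameter names here are assumptions.

```python
def depth_from_disparity(
    x_left_px: float, x_right_px: float, focal_px: float, baseline_m: float
) -> float:
    """Distance from the cameras to a feature seen in both images.

    Assumes rectified pinhole cameras separated by baseline_m, with the
    focal length expressed in pixels.
    """
    disparity = x_left_px - x_right_px  # parallax separation in pixels
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity


def height_above_surface(
    x_left_px: float,
    x_right_px: float,
    focal_px: float,
    baseline_m: float,
    surface_depth_m: float,
) -> float:
    """Height of a feature (e.g., a fingertip) above the background surface.

    surface_depth_m is the assumed known distance from the cameras to the surface.
    """
    depth = depth_from_disparity(x_left_px, x_right_px, focal_px, baseline_m)
    return surface_depth_m - depth
```

For example, with a focal length of 800 pixels, a baseline of 0.1 m, and a fingertip at columns 420 and 340 in the two images, the disparity is 80 pixels and the fingertip lies about 1.0 m from the cameras. If the surface is 1.2 m away, the fingertip is about 0.2 m above it.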
At 412, the provided input gesture is compared with a plurality of predefined gesture inputs in a gesture library. The gesture library can be a memory having the predefined gestures. The predefined gestures can each correspond to a specific device input. The predefined gestures can correspond to any of the hand gestures 200, 250, 300, and 350 in the example of
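The comparison against a gesture library can be pictured as a lookup from a gesture descriptor to a device input. This is a minimal sketch, not the described implementation: the descriptor (a set of extended fingers plus an execution label), the library contents, and every identifier below are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

# Hypothetical descriptor: (set of extended fingers, execution sub-gesture label).
GestureKey = Tuple[FrozenSet[str], str]

# Hypothetical library loosely modeled on the gestures 308, 312, and 354.
GESTURE_LIBRARY: Dict[GestureKey, str] = {
    (frozenset({"index", "thumb"}), "thumb_retract"): "pan_tilt",
    (frozenset({"index", "little", "thumb"}), "thumb_retract"): "speed_scroll",
    (frozenset({"index", "thumb"}), "index_trigger"): "browser_back",
}


def match_gesture(
    extended_fingers: Iterable[str],
    execution: str,
    library: Dict[GestureKey, str] = GESTURE_LIBRARY,
) -> Optional[str]:
    """Return the device input for a provided gesture, or None if nothing matches."""
    return library.get((frozenset(extended_fingers), execution))
```

An exact-match lookup keeps the example short. A practical matcher would likely need some tolerance for noisy feature estimates, which this sketch does not attempt.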
What have been described above are examples of the present invention. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the present invention, but one of ordinary skill in the art will recognize that many further combinations and permutations of the present invention are possible. Accordingly, the present invention is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims.
This application is a continuation-in-part of U.S. patent application Ser. No. 11/485,788, filed Jul. 13, 2006, and a continuation-in-part of U.S. patent application Ser. No. 12/196,767, filed Aug. 22, 2008. The subject matter of the aforementioned applications is incorporated herein by reference in its entirety.
Number | Date | Country |
---|---|---|
20090103780 A1 | Apr 2009 | US |

Relation | Number | Date | Country |
---|---|---|---|
Parent | 12196767 | Aug 2008 | US |
Child | 12337225 | | US |
Parent | 11485788 | Jul 2006 | US |
Child | 12196767 | | US |