The present invention relates generally to methods and systems for three-dimensional (3D) mapping, and specifically to processing of 3D map data.
A number of different methods and systems are known in the art for creating depth maps. In the present patent application and in the claims, the term “depth map” refers to a representation of a scene as a two-dimensional matrix of pixels, in which each pixel corresponds to a respective location in the scene and has a respective pixel depth value, indicative of the distance from a certain reference location to the respective scene location. In other words, the depth map has the form of an image in which the pixel values indicate topographical information, rather than brightness and/or color of the objects in the scene. Depth maps may be created, for example, by detection and processing of an image of an object onto which a pattern is projected, as described in U.S. Pat. No. 8,456,517, whose disclosure is incorporated herein by reference. The terms “depth map” and “3D map” are used herein interchangeably and have the same meaning.
Depth maps may be processed in order to segment and identify objects in the scene. Identification of humanoid forms (meaning 3D shapes whose structure resembles that of a human being) in a depth map, and changes in these forms from scene to scene, may be used as a means for controlling computer applications. For example, U.S. Pat. No. 8,249,334, whose disclosure is incorporated herein by reference, describes a computer-implemented method in which a depth map is segmented so as to find a contour of a humanoid body. The contour is processed in order to identify a torso and one or more limbs of the body. An input is generated to control an application program running on a computer by analyzing a disposition of at least one of the identified limbs in the depth map.
As another example, U.S. Pat. No. 8,565,479, whose disclosure is incorporated herein by reference, describes a method for processing a temporal sequence of depth maps of a scene containing a humanoid form. A digital processor processes at least one of the depth maps so as to find a location of the head of the humanoid form, and estimates dimensions of the humanoid form based on this location. The processor tracks movements of the humanoid form over the sequence using the estimated dimensions.
U.S. Pat. No. 9,047,507, whose disclosure is incorporated herein by reference, describes a method that includes receiving a depth map of a scene containing at least an upper body of a humanoid form. The depth map is processed so as to identify a head and at least one arm of the humanoid form in the depth map. Based on the identified head and at least one arm, and without reference to a lower body of the humanoid form, an upper-body pose, including at least three-dimensional (3D) coordinates of shoulder joints of the humanoid form, is extracted from the depth map.
Embodiments of the present invention provide methods, devices and software for extracting information from depth maps.
There is therefore provided, in accordance with an embodiment of the invention, a method for processing data, which includes receiving a depth map of a scene containing at least a humanoid head, the depth map comprising a matrix of pixels having respective pixel depth values. Using a digital processor, a curvature map of the scene is extracted from the depth map. The curvature map includes respective curvature values of at least some of the pixels in the matrix. The curvature values are processed in order to identify a face in the scene.
In some embodiments, processing the curvature values includes detecting one or more blobs in the curvature map over which the pixels have respective curvature values that are indicative of a convex surface, and identifying one of the blobs as the face. Typically, the curvature map includes respective curvature orientations of the at least some of the pixels, and identifying the one of the blobs includes calculating a roll angle of the face responsively to the curvature orientations of the pixels in the one of the blobs. In a disclosed embodiment, processing the curvature values includes applying a curvature filter to the curvature map in order to ascertain whether the one of the blobs is the face while correcting for the calculated roll angle.
Additionally or alternatively, processing the curvature values includes calculating a scale of the face responsively to a size of the one of the blobs, and applying a curvature filter to the curvature map in order to ascertain whether the one of the blobs is the face while correcting for the calculated scale.
Further additionally or alternatively, extracting the curvature map includes deriving a first curvature map from the depth map at a first resolution, and detecting the one or more blobs includes finding the one or more blobs in the first curvature map, and processing the curvature values includes deriving a second curvature map containing the one of the blobs at a second resolution, finer than the first resolution, and identifying the face using the second curvature map.
In some embodiments, processing the curvature values includes convolving the curvature map with a curvature filter kernel in order to find a location of the face in the scene. In a disclosed embodiment, convolving the curvature map includes separately applying a face filter kernel and a nose filter kernel in order to compute respective candidate locations of the face, and finding the location based on the candidate locations. Additionally or alternatively, convolving the curvature map includes computing a log likelihood value for each of a plurality of points in the scene, and choosing the location responsively to the log likelihood value.
There is also provided, in accordance with an embodiment of the invention, apparatus for processing data, including an imaging assembly, which is configured to capture a depth map of a scene containing at least a humanoid head, the depth map including a matrix of pixels having respective pixel depth values. A processor is configured to extract from the depth map a curvature map of the scene, the curvature map including respective curvature values of at least some of the pixels in the matrix, and to process the curvature values in order to identify a face in the scene.
There is additionally provided, in accordance with an embodiment of the invention, a computer software product, including a non-transitory computer-readable medium in which program instructions are stored, which instructions, when read by a computer, cause the computer to receive a depth map of a scene containing at least a humanoid head, the depth map including a matrix of pixels having respective pixel depth values, to extract from the depth map a curvature map of the scene, the curvature map including respective curvature values of at least some of the pixels in the matrix, and to process the curvature values in order to identify a face in the scene.
The present invention will be more fully understood from the following detailed description of the embodiments thereof, taken together with the drawings in which:
U.S. patent application Ser. No. 15/272,455, filed Sep. 22, 2016, whose disclosure is incorporated herein by reference, describes methods, systems and software for extracting humanoid forms from depth maps. In the disclosed methods, a digital processor extracts a curvature map from the depth map of a scene containing a humanoid form. The curvature map comprises respective oriented curvatures of at least some of the pixels in the depth map. In other words, at each of these pixels, the curvature map holds a scalar signed value indicating the dominant curvature value and the corresponding curvature orientation, i.e., the direction of the dominant curvature, expressed as a two-dimensional (2D) vector. The processor segments the depth map using both curvature values and orientations in the curvature map, and thus extracts 3D location and orientation coordinates of one or more limbs of the humanoid form.
The processor segments the depth map by identifying blobs in the curvature map over which the pixels have a positive curvature, meaning that the surfaces of these blobs are convex (although this definition of “positive” curvature is arbitrary, and curvature could alternatively be defined so that convex surfaces have negative curvature). The edges of the blobs are identified in the depth map at locations of sign changes in the curvature map. This use of curvature enhances the reliability and robustness of segmentation, since it enables the processor to distinguish between different blobs and between blobs and the background even when there is no marked change in depth at the edges of a given blob, as may occur when one body part occludes another, or when a body part is resting against a background surface or other object.
Embodiments of the present invention that are described herein process curvature maps specifically in order to identify one or more faces in the scene. Typically, in the disclosed methods, one or more blobs are detected in a curvature map as described above. The curvature orientations of the pixels in a blob that is a candidate to correspond to a face are processed in order to estimate the roll angle of the face. A curvature filter can then be applied to the curvature map while correcting for the calculated roll angle, in order to ascertain the likelihood that this blob is indeed a face. Additionally or alternatively, the size of the blob can be used to estimate and correct for the scale of the face.
Various sorts of classifiers can be used to extract faces from the curvature map. In some embodiments, which are described in greater detail hereinbelow, the curvature map is convolved with one or more curvature filter kernels in order to find the location of a face in the scene. In one embodiment, a face filter kernel and a nose filter kernel are applied separately in order to compute respective candidate locations, which are used in finding the actual face location. These filters are matched to the curvature features of a typical face (including the relatively high convex curvature of the nose), and are relatively insensitive to pitch and yaw of the face. The roll angle and scale can be normalized separately, as explained above. The filter can be configured to return a log likelihood value for each candidate point in the scene, whereby points having the highest log likelihood value can be identified as face locations.
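By way of illustration only, the kernel-matching step can be sketched as follows. The sketch assumes that the curvature map and the filter kernel are given as plain nested lists, and it uses a raw correlation score as a stand-in for the log likelihood value; the function names correlate_valid and find_peak are hypothetical and do not appear in the embodiments. The code computes a cross-correlation, which is equivalent to a convolution with a flipped kernel.

```python
def correlate_valid(image, kernel):
    """Cross-correlate a 2D map with a kernel over the fully overlapping region."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    scores = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        scores.append(row)
    return scores


def find_peak(scores, kernel):
    """Return (best score, (row, col)) with the location given at the kernel center."""
    kh, kw = len(kernel), len(kernel[0])
    value, r, c = max(
        (v, r, c) for r, row in enumerate(scores) for c, v in enumerate(row)
    )
    # Shift from the window's top-left corner to its center pixel.
    return value, (r + kh // 2, c + kw // 2)
```

In this sketch the highest-scoring point plays the role of the candidate face location; an actual classifier would calibrate the scores before treating them as likelihoods.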
In the example shown in
Imaging assembly 24 generates a data stream that includes depth maps for output to an image processor, such as a computer 26. Although computer 26 is shown in
The normal map is computed as follows: Taking u-v to be the surface parameterization grid of the depth map, p = p(u,v) represents the surface points of the depth map of
Computer 26 next computes a (low-resolution) curvature map, based on this normal map. The curvature computed for each pixel at this step can be represented in a 2×2 matrix form known in 3D geometry as the shape operator, S, which is defined as follows:
Computer 26 extracts the shape operator eigenvectors, corresponding to the two main curvature orientations, and the shape operator eigenvalues, corresponding to the curvature values along these orientations. The curvature map comprises the dominant curvature per pixel, i.e., the eigenvalue with the larger absolute value and the corresponding curvature orientation. The raw curvature value can be either positive or negative, with positive curvature corresponding to convex surface patches, and negative curvature corresponding to concave surface patches.
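The selection of the dominant curvature can be illustrated by the following minimal sketch. It assumes, for simplicity, that the shape operator at a pixel has already been expressed as a symmetric 2×2 matrix [[a, b], [b, d]]; this symmetry, the closed-form eigen-decomposition, and the function name dominant_curvature are assumptions made for the example rather than details taken from the embodiments.

```python
import math


def dominant_curvature(a, b, d):
    """Dominant eigenvalue and unit eigenvector of the symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    lam_hi, lam_lo = mean + radius, mean - radius
    # Keep the eigenvalue with the larger absolute value (ties favor lam_hi).
    lam = lam_hi if abs(lam_hi) >= abs(lam_lo) else lam_lo
    if b != 0.0:
        # (lam - d, b) satisfies the second row of (S - lam*I) v = 0.
        vx, vy = lam - d, b
    elif abs(lam - a) <= abs(lam - d):
        vx, vy = 1.0, 0.0  # diagonal matrix: eigenvalue a belongs to the first axis
    else:
        vx, vy = 0.0, 1.0  # diagonal matrix: eigenvalue d belongs to the second axis
    norm = math.hypot(vx, vy)
    return lam, (vx / norm, vy / norm)
```

For example, dominant_curvature(2.0, 0.0, -3.0) selects the eigenvalue -3.0 with orientation (0.0, 1.0), which under the sign convention above would mark a concave patch.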
Computer 26 uses the curvature map in extracting blobs having positive curvature from the original depth map. Since body parts, such as the head and hand, are inherently convex, positive curvature within a blob of pixels is a necessary condition for the blob to correspond to such a body part. Furthermore, transitions from positive to negative curvature are good indicators of the edges of a body part, even when the body part is in contact with another object without a sharp depth gradation between the body part and the object.
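One simple way to picture the blob extraction is as connected-component labeling over the pixels whose curvature is positive, so that sign changes in the curvature map delimit the blob edges. The following sketch makes that assumption explicit; the 4-connectivity rule, the breadth-first flood fill, and the name label_convex_blobs are illustrative choices and are not asserted to be the method used by computer 26.

```python
from collections import deque


def label_convex_blobs(curvature):
    """Label 4-connected regions of positive curvature; 0 marks non-blob pixels."""
    rows, cols = len(curvature), len(curvature[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if curvature[r0][c0] <= 0 or labels[r0][c0] != 0:
                continue
            count += 1
            labels[r0][c0] = count
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr][nc] == 0
                            and curvature[nr][nc] > 0):
                        labels[nr][nc] = count
                        queue.append((nr, nc))
    return labels, count
```

Because the fill never crosses a pixel of non-positive curvature, two convex parts that touch in depth are still separated wherever a concave crease lies between them.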
Typically, computer 26 identifies the dominant curvature direction of a given blob as the statistical mode of the curvature directions of all the pixels. In other words, for each blob, the computer constructs a histogram of the curvature directions of the pixels in the blob, and identifies the dominant curvature direction as the mode of the histogram. If the histogram contains multi-modal behavior, each mode is analyzed independently, dividing the blob into multiple sub-blobs. On this basis, in the example shown in
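The histogram-mode computation described above can be sketched as follows. The sketch assumes that each pixel's curvature orientation is a 2D vector (ox, oy) and that orientations are axial, so that a direction and its opposite fall in the same bin; the bin count of 18 and the name dominant_direction are hypothetical. Only the single strongest mode is returned, and the multi-modal case is not handled.

```python
import math


def dominant_direction(orientations, n_bins=18):
    """Return (mode angle in radians within [0, pi), histogram counts)."""
    counts = [0] * n_bins
    for ox, oy in orientations:
        # Fold opposite vectors together: orientation is defined modulo pi.
        angle = math.atan2(oy, ox) % math.pi
        index = min(int(angle / math.pi * n_bins), n_bins - 1)
        counts[index] += 1
    best = max(range(n_bins), key=counts.__getitem__)
    # Report the center of the most populated bin.
    return (best + 0.5) * math.pi / n_bins, counts
```

The returned angle is quantized to the bin width, so a finer estimate would require more bins or interpolation around the peak.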
Having identified the blob or blobs in the depth map that are candidates to be faces, computer 26 now proceeds to process the data from these blobs in the depth map in order to decide which, if any, can be confidently classified as faces. Assuming the first phase of depth map analysis, up to identification of the candidate blobs and their axes, was performed at low resolution, as explained above, computer 26 typically processes the data in the blobs during the second, classification phase at a finer resolution. Thus, for example,
Computer 26 next applies a face classifier to this curvature map. In the present embodiment, computer 26 convolves the curvature values of each blob that is to be classified with one or more filter kernels, which return a score for each pixel indicating the likelihood that it is the center point of a face. As part of this classification step, the roll angle of the face is normalized (to the vertical direction, for example) by rotating the axis derived from the curvature orientations of the pixels in the blob being classified. Additionally or alternatively, computer 26 normalizes the scale of the face based on the size of the blob. Equivalently, the filter kernel or kernels that are used in the classification may be rotated and/or scaled.
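As a rough illustration of the normalization step, the following sketch rotates the pixel coordinates of a blob about their centroid so that the blob axis points in a target direction, and divides by a scale factor. The parameters axis_angle, target_angle and scale, and the function name normalize_roll_and_scale, are assumptions made for the example; angles are measured from the x-axis in a conventional right-handed frame, and how the scale factor is derived from the blob size is left open.

```python
import math


def normalize_roll_and_scale(points, axis_angle, scale, target_angle=math.pi / 2):
    """Rotate (x, y) points about their centroid by target_angle - axis_angle, then rescale."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    rotation = target_angle - axis_angle
    cos_r, sin_r = math.cos(rotation), math.sin(rotation)
    normalized = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        normalized.append((
            (cos_r * dx - sin_r * dy) / scale,
            (sin_r * dx + cos_r * dy) / scale,
        ))
    return normalized
```

Applying the inverse rotation and scaling to the filter kernel instead of the blob would give an equivalent comparison, in line with the alternative mentioned above.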
In addition to the nose region, additional face regions can be taken to generate a set of parts filters. This approach can be used in conjunction with a Deformable Parts Model (DPM), which performs object detection by combining match scores at both whole-object scale and object parts scale. The parts filters compensate for the deformation in the object part arrangement due to perspective changes.
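The way a parts-based model combines scores can be pictured with the following simplified sketch, which follows the general DPM idea of adding to the whole-object score, for each part, the best part score found near the part's expected position after subtracting a deformation penalty. The quadratic penalty, the search radius, and the names dpm_score, part_maps, anchors and deform_cost are assumptions made for the example and are not taken from the embodiments.

```python
def dpm_score(root_score, part_maps, anchors, deform_cost, radius=1):
    """Root score plus, per part, the best (part score - deformation penalty) near its anchor."""
    total = root_score
    for scores, (ar, ac) in zip(part_maps, anchors):
        best = float("-inf")
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                r, c = ar + dr, ac + dc
                if 0 <= r < len(scores) and 0 <= c < len(scores[0]):
                    # Penalize displacement from the anchor quadratically.
                    penalty = deform_cost * (dr * dr + dc * dc)
                    best = max(best, scores[r][c] - penalty)
        total += best
    return total
```

A small deformation cost lets a part, such as the nose, drift from its nominal position when the perspective changes, while a large cost pins it to the anchor.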
Alternatively or additionally, other kernels may be used. For example, the kernels shown in
In the example shown in
In an alternative embodiment, the principles outlined above are implemented in a deep convolutional neural network (DCNN), rather than or in addition to using explicit filter kernels as in
Optionally, the blobs found on the basis of curvature (as in
It will be appreciated that the embodiments described above are cited by way of example, and that the present invention is not limited to what has been particularly shown and described hereinabove. Rather, the scope of the present invention includes both combinations and subcombinations of the various features described hereinabove, as well as variations and modifications thereof which would occur to persons skilled in the art upon reading the foregoing description and which are not disclosed in the prior art.
This application claims the benefit of U.S. Provisional Patent Application 62/396,839, filed Sep. 20, 2016, which is incorporated herein by reference.
Number | Name | Date | Kind |
---|---|---|---|
5081689 | Meyer et al. | Jan 1992 | A |
5673213 | Weigl | Sep 1997 | A |
5684887 | Lee et al. | Nov 1997 | A |
5846134 | Latypov | Dec 1998 | A |
5852672 | Lu | Dec 1998 | A |
5862256 | Zetts et al. | Jan 1999 | A |
5864635 | Zetts et al. | Jan 1999 | A |
5870196 | Lulli et al. | Feb 1999 | A |
6002808 | Freeman | Dec 1999 | A |
6137896 | Chang | Oct 2000 | A |
6176782 | Lyons et al. | Jan 2001 | B1 |
6256033 | Nguyen | Jul 2001 | B1 |
6518966 | Nakagawa et al. | Feb 2003 | B1 |
6608917 | Wei et al. | Aug 2003 | B1 |
6658136 | Brumitt | Dec 2003 | B1 |
6681031 | Cohen et al. | Jan 2004 | B2 |
6771818 | Krumm et al. | Aug 2004 | B1 |
6856314 | Ng | Feb 2005 | B2 |
6857746 | Dyner | Feb 2005 | B2 |
6993157 | Oue et al. | Jan 2006 | B1 |
7003134 | Covell et al. | Feb 2006 | B1 |
7003136 | Harville | Feb 2006 | B1 |
7013046 | Kawamura et al. | Mar 2006 | B2 |
7042440 | Pryor et al. | May 2006 | B2 |
7170492 | Bell | Jan 2007 | B2 |
7215815 | Honda | May 2007 | B2 |
7239726 | Li | Jul 2007 | B2 |
7259747 | Bell | Aug 2007 | B2 |
7302099 | Zhang et al. | Nov 2007 | B2 |
7317830 | Gordon et al. | Jan 2008 | B1 |
7340077 | Gokturk | Mar 2008 | B2 |
7348963 | Bell | Mar 2008 | B2 |
7428542 | Fink et al. | Sep 2008 | B1 |
7536032 | Bell | May 2009 | B2 |
7555158 | Park et al. | Jun 2009 | B2 |
7580572 | Bang et al. | Aug 2009 | B2 |
7583275 | Neumann et al. | Sep 2009 | B2 |
7602965 | Hong et al. | Oct 2009 | B2 |
7634133 | Jerebko et al. | Dec 2009 | B2 |
7706571 | Das et al. | Apr 2010 | B2 |
7925077 | Woodfill et al. | Apr 2011 | B2 |
7974443 | Kipman et al. | Jul 2011 | B2 |
8175374 | Pinault et al. | Aug 2012 | B2 |
8249334 | Berliner et al. | Aug 2012 | B2 |
8270688 | Fan et al. | Sep 2012 | B2 |
8280106 | Ma | Oct 2012 | B2 |
8280165 | Meng et al. | Oct 2012 | B2 |
8320621 | McEldowney | Nov 2012 | B2 |
8358342 | Park | Jan 2013 | B2 |
8379926 | Kanhere et al. | Feb 2013 | B2 |
8405656 | El Dokor et al. | Mar 2013 | B2 |
8411149 | Maison et al. | Apr 2013 | B2 |
8411932 | Liu et al. | Apr 2013 | B2 |
8433104 | Cheng | Apr 2013 | B2 |
8456517 | Spektor et al. | Jun 2013 | B2 |
8503720 | Shotton et al. | Aug 2013 | B2 |
8565479 | Gurman et al. | Oct 2013 | B2 |
8660318 | Komura et al. | Feb 2014 | B2 |
8660362 | Katz et al. | Feb 2014 | B2 |
8675933 | Wehnes et al. | Mar 2014 | B2 |
9002099 | Litvak et al. | Apr 2015 | B2 |
9019267 | Gurman | Apr 2015 | B2 |
9047507 | Gurman et al. | Jun 2015 | B2 |
9076205 | Cho | Jul 2015 | B2 |
9159140 | Hoof et al. | Oct 2015 | B2 |
9301722 | Martinson | Apr 2016 | B1 |
9311560 | Hoof et al. | Apr 2016 | B2 |
9317741 | Guigues et al. | Apr 2016 | B2 |
9390500 | Chang et al. | Jul 2016 | B1 |
9727776 | Dedhia | Aug 2017 | B2 |
9898651 | Gurman | Feb 2018 | B2 |
20020071607 | Kawamura et al. | Jun 2002 | A1 |
20030095698 | Kawano | May 2003 | A1 |
20030113018 | Nefian et al. | Jun 2003 | A1 |
20030147556 | Gargesha | Aug 2003 | A1 |
20030156756 | Gokturk et al. | Aug 2003 | A1 |
20030169906 | Gokturk | Sep 2003 | A1 |
20030235341 | Gokturk et al. | Dec 2003 | A1 |
20040091153 | Nakano et al. | May 2004 | A1 |
20040183775 | Bell | Sep 2004 | A1 |
20040184640 | Bang et al. | Sep 2004 | A1 |
20040184659 | Bang et al. | Sep 2004 | A1 |
20040258306 | Hashimoto | Dec 2004 | A1 |
20050031166 | Fujimura et al. | Feb 2005 | A1 |
20050088407 | Bell et al. | Apr 2005 | A1 |
20050089194 | Bell | Apr 2005 | A1 |
20050265583 | Covell et al. | Dec 2005 | A1 |
20050271279 | Fujimura et al. | Dec 2005 | A1 |
20060092138 | Kim et al. | May 2006 | A1 |
20060115155 | Lui et al. | Jun 2006 | A1 |
20060159344 | Shao et al. | Jul 2006 | A1 |
20060165282 | Berretty et al. | Jul 2006 | A1 |
20070003141 | Rittscher et al. | Jan 2007 | A1 |
20070076016 | Agarwala et al. | Apr 2007 | A1 |
20070154116 | Shieh | Jul 2007 | A1 |
20070188490 | Kanai et al. | Aug 2007 | A1 |
20070230789 | Chang et al. | Oct 2007 | A1 |
20080123940 | Kundu et al. | May 2008 | A1 |
20080226172 | Connell | Sep 2008 | A1 |
20080236902 | Imaizumi | Oct 2008 | A1 |
20080252596 | Bell et al. | Oct 2008 | A1 |
20080260250 | Vardi | Oct 2008 | A1 |
20080267458 | Laganiere et al. | Oct 2008 | A1 |
20080310706 | Asatani et al. | Dec 2008 | A1 |
20090009593 | Cameron et al. | Jan 2009 | A1 |
20090027335 | Ye | Jan 2009 | A1 |
20090035695 | Campestrini et al. | Feb 2009 | A1 |
20090078473 | Overgard et al. | Mar 2009 | A1 |
20090083622 | Chien et al. | Mar 2009 | A1 |
20090096783 | Shpunt et al. | Apr 2009 | A1 |
20090116728 | Agrawal et al. | May 2009 | A1 |
20090183125 | Magal et al. | Jul 2009 | A1 |
20090222388 | Hua et al. | Sep 2009 | A1 |
20090297028 | De Haan | Dec 2009 | A1 |
20100002936 | Khomo | Jan 2010 | A1 |
20100007717 | Spektor et al. | Jan 2010 | A1 |
20100034457 | Berliner et al. | Feb 2010 | A1 |
20100111370 | Black et al. | May 2010 | A1 |
20100235786 | Maizels et al. | Sep 2010 | A1 |
20100302138 | Poot et al. | Dec 2010 | A1 |
20100303289 | Polzin et al. | Dec 2010 | A1 |
20100322516 | Xu et al. | Dec 2010 | A1 |
20100322534 | Bolme | Dec 2010 | A1 |
20110025689 | Perez et al. | Feb 2011 | A1 |
20110052006 | Gurman et al. | Mar 2011 | A1 |
20110164032 | Shadmi et al. | Jul 2011 | A1 |
20110175984 | Tolstaya et al. | Jul 2011 | A1 |
20110182477 | Tamrakar et al. | Jul 2011 | A1 |
20110211754 | Litvak et al. | Sep 2011 | A1 |
20110237324 | Clavin et al. | Sep 2011 | A1 |
20110291926 | Gokturk et al. | Dec 2011 | A1 |
20110292036 | Sali et al. | Dec 2011 | A1 |
20110293137 | Gurman et al. | Dec 2011 | A1 |
20120070070 | Litvak | Mar 2012 | A1 |
20120087572 | Dedeoglu et al. | Apr 2012 | A1 |
20120162065 | Tossell et al. | Jun 2012 | A1 |
20120201431 | Komura | Aug 2012 | A1 |
20120269441 | Marchesotti | Oct 2012 | A1 |
20150227783 | Gurman et al. | Aug 2015 | A1 |
20150363655 | Artan | Dec 2015 | A1 |
20160042223 | Suh | Feb 2016 | A1 |
20160275337 | Shibutani | Sep 2016 | A1 |
20160292490 | Cheng | Oct 2016 | A1 |
Number | Date | Country |
---|---|---|
H03-029806 | Feb 1991 | JP |
H10-235584 | Sep 1998 | JP |
199935633 | Jul 1999 | WO |
2003071410 | Aug 2003 | WO |
2004107272 | Dec 2004 | WO |
2005003948 | Jan 2005 | WO |
2005094958 | Oct 2005 | WO |
2007043036 | Apr 2007 | WO |
2007078639 | Jul 2007 | WO |
2007105205 | Sep 2007 | WO |
2007132451 | Nov 2007 | WO |
2007135376 | Nov 2007 | WO |
2008120217 | Oct 2008 | WO |
2010004542 | Jan 2010 | WO |
Entry |
---|
Hart, D., U.S. Appl. No. 09/616,606 filed Jul. 14, 2000. |
Bleiweiss et al., “Fusing Time-of-Flight Depth and Color for Real-Time Segmentation and Tracking”, Editors R. Koch and A. Kolb: Dyn3D 2009, LNCS 5742, pp. 58-69, Springer-Verlag Berlin Heidelberg 2009. |
Gesturetek Inc., Consumer Electronics Solutions, “Gesture Control Solutions for Consumer Devices”, www.gesturetek.com, Toronto, Ontario, 1 page, Canada, 2009. |
Segen et al., “Human-computer interaction using gesture recognition and 3D hand tracking”, ICIP 98, Proceedings of the IEEE International Conference on Image Processing, vol. 3, pp. 188-192, Chicago, USA, Oct. 4-7, 1998. |
Avidan et al., “Trajectory triangulation: 3D reconstruction of moving points from a monocular image sequence”, PAMI, vol. 22, No. 4, pp. 348-357, Apr. 2000. |
Leclerc et al., “The direct computation of height from shading”, IEEE Conference on Computer Vision and Pattern Recognition, pp. 552-558, Jun. 3-7, 1991. |
Zhang et al., “Shape from intensity gradient”, IEEE Transactions on Systems, Man, and Cybernetics—Part A: Systems and Humans, vol. 29, No. 3, pp. 318-325, May 1999. |
Zhang et al., “Height recovery from intensity gradients”, IEEE Conference on Computer Vision and Pattern Recognition, pp. 508-513, Jun. 20-24, 1994. |
Horn, B., “Height and gradient from shading”, International Journal of Computer Vision, vol. 5, No. 1, pp. 37-75, Aug. 1990. |
Bruckstein, A., “On Shape from Shading”, Computer Vision, Graphics, and Image Processing Journal, vol. 44, Issue 2, pp. 139-154, Nov. 1988. |
Zhang et al., “Rapid Shape Acquisition Using Color Structured Light and Multi-Pass Dynamic Programming”, 1st International Symposium on 3D Data Processing Visualization and Transmission (3DPVT), Padova, Italy, 13 pages, Jun. 19-21, 2002. |
Besl, P., “Active Optical Range Imaging Sensors”, Journal Machine Vision and Applications, vol. 1, issue 2, pp. 127-152, Apr. 1988. |
Horn et al., “Toward optimal structured light patterns”, Proceedings of International Conference on Recent Advances in 3D Digital Imaging and Modeling, pp. 28-37, Ottawa, Canada, May 1997. |
Goodman, J.W., “Statistical Properties of Laser Speckle Patterns”, Laser Speckle and Related Phenomena, pp. 9-75, Springer-Verlag, Berlin Heidelberg, 1975. |
Asada et al., “Determining Surface Orientation by Projecting a Stripe Pattern”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 10, No. 5, pp. 749-754, Sep. 1988. |
Winkelbach et al., “Shape from Single Stripe Pattern Illumination”, Luc Van Gool (Editor), (DAGM 2002) Pattern Recognition, Lecture Notes in Computer Science 2449, pp. 240-247, Springer 2002. |
Koninckx et al., “Efficient, Active 3D Acquisition, based on a Pattern-Specific Snake”, Luc Van Gool (Editor), (DAGM 2002) Pattern Recognition, Lecture Notes in Computer Science 2449, pp. 557-565, Springer 2002. |
Kimmel et al., “Analyzing and synthesizing images by evolving curves with the Osher-Sethian method”, International Journal of Computer Vision, vol. 24, issue 1, pp. 37-55, Aug. 1997. |
Zigelman et al., “Texture mapping using surface flattening via multi-dimensional scaling”, IEEE Transactions on Visualization and Computer Graphics, vol. 8, issue 2, pp. 198-207, Apr.-Jun. 2002. |
Dainty, J.C., “Introduction”, Laser Speckle and Related Phenomena, pp. 1-7, Springer-Verlag, Berlin Heidelberg, 1975. |
Mendlovic, et al., “Composite harmonic filters for scale, projection and shift invariant pattern recognition”, Applied Optics Journal, vol. 34, No. 2, pp. 310-316, Jan. 10, 1995. |
Fua et al., “Human Shape and Motion Recovery Using Animation Models”, 19th Congress, International Society for Photogrammetry and Remote Sensing, Amsterdam, The Netherlands, 16 pages, Jul. 2000. |
Allard et al., “Marker-less Real Time 3D modeling for Virtual Reality”, Immersive Projection Technology, Iowa State University, 8 pages, 2004. |
Howe et al., “Bayesian Reconstruction of 3D Human Motion from Single-Camera Video”, Advances in Neural Information Processing Systems 12, Denver, USA, 7 pages, 1999. |
Ascension Technology Corporation, “Flock of Birds: Real-Time Motion Tracking”, 2 pages, 2008. |
Grammalidis et al., “3-D Human Body Tracking from Depth Images Using Analysis by Synthesis”, Proceedings of the IEEE International Conference on Image Processing (ICIP2001), pp. 185-188, Thessaloniki, Greece, Oct. 1-10, 2001. |
Niesbat, S., “A System for Fast, Full-Text Entry for Small Electronic Devices”, Proceedings of the 5th International Conference on Multimodal Interfaces, ICMI 2003, Vancouver, Canada, 8 pages, Nov. 5-7, 2003. |
Softkinetic S.A., “3D Gesture Recognition Platform for Developers of 3D Applications”, Product Datasheet, IISU™, www.softkinetic-optrima.com, Belgium, 2 pages, 2007-2010. |
Li et al., “Real-Time 3D Motion Tracking with Known Geometric Models”, Real-Time Imaging Journal, vol. 5, pp. 167-187, Academic Press 1999. |
Segen et al., “Shadow gestures: 3D hand pose estimation using a single camera”, Proceedings of IEEE International Conference on Computer Vision and Pattern Recognition, pp. 479-485, Fort Collins, USA, Jun. 23-25, 1999. |
Vogler et al., “ASL recognition based on a coupling between HMMs and 3D motion analysis”, Proceedings of IEEE International Conference on Computer Vision, pp. 363-369, Mumbai, India, Jan. 4-7, 1998. |
Gionis et al., “Similarity Search in High Dimensions via Hashing”, Proceedings of the 25th Very Large Database (VLDB) Conference, Edinburgh, UK, 12 pages, Sep. 7-10, 1999. |
Bleiweiss et al., “Markerless Motion Capture Using a Single Depth Sensor”, SIGGRAPH Asia 2009, Yokohama, Japan, 1 page, Dec. 16-19, 2009. |
Comaniciu et al., “Mean Shift: A Robust Approach Toward Feature Space Analysis”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, No. 4, pp. 603-619, May 2002. |
Datar et al., “Locality-Sensitive Hashing Scheme Based on p-Stable Distributions”, Proceedings of the Symposium on Computational Geometry, pp. 253-262, Brooklyn, USA, Jun. 9-11, 2004. |
Dekker, L., “Building Symbolic Information for 3D Human Body Modeling from Range Data”, Proceedings of the Second International Conference on 3D Digital Imaging and Modeling, IEEE computer Society, pp. 388-397, Ottawa, Canada, Oct. 4-8, 1999. |
Holte et al., “Gesture Recognition using a Range Camera”, Technical Report, Laboratory of Computer Vision and Media Technology, Aalborg University, Denmark, 5 pages, Feb. 2007. |
Cheng et al., “Articulated Human Body Pose Inference from Voxel Data Using a Kinematically Constrained Gaussian Mixture Model”, CVPR EHuM2: 2nd Workshop on Evaluation of Articulated Human Motion and Pose Estimation, 11 pages, Jun. 2007. |
Nam et al., “Recognition of Hand Gestures with 3D, Nonlinear Arm Movements”, Pattern Recognition Letters, vol. 18, No. 1, pp. 105-113, Elsevier Science B.V. 1997. |
U.S. Appl. No. 14/697,661 Office Action dated Jun. 9, 2017. |
Ren et al., “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks”, 14 pages, Jan. 6, 2016. |
Eshet et al., U.S. Appl. No. 15/272,455 filed Sep. 22, 2016. |
U.S. Appl. No. 15/272,455 office action dated Dec. 27, 2017. |
Ding et al., “Range Image Segmentation Using Principal Curvatures and Principal Directions”, 5th International Conference on Information Communications and Signal Processing, pp. 320-323, Dec. 2005. |
Deboeverie, “Curvature-based Human Body Parts Segmentation in Physiotherapy”, 10th International Conference on Computer Vision Theory and Applications (VISAPP), pp. 630-637, Mar. 11-14, 2015. |
Primesense Inc., “Prime Sensor™ NITE 1.1 Framework Programmer's Guide”, Version 1.2, 34 pages, 2009. |
Luxand Inc., “Luxand FaceSDK 3.0 Face Detection and Recognition Library Developer's Guide”, 45 pages, years 2005-2010. |
Intel Corporation, “Open Source Computer Vision Library Reference Manual”, 377 pages, years 1999-2001. |
Arya et al., “An Optimal Algorithm for Approximate Nearest Neighbor Searching in Fixed Dimensions”, Association for Computing Machinery Journal, vol. 45, issue 6, pp. 891-923, New York, USA, Nov. 1998. |
Muja et al., “Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration”, International Conference on Computer Vision Theory and Applications, pp. 331-340, Lisboa, Portugal, Feb. 5-8, 2009. |
Mori et al., “Estimating Human Body Configurations Using Shape Context Matching”, Proceedings of the European Conference on Computer Vision, vol. 3, pp. 666-680, Copenhagen, Denmark, May 27-Jun. 2, 2002. |
Agarwal et al., “Monocular Human Motion Capture with a Mixture of Regressors”, Proceedings of the 2004 IEEE Conference on Computer Vision and Pattern Recognition, San Diego, USA, 8 pages, Jun. 20-26, 2005. |
Lv et al., “Single View Human Action Recognition Using Key Pose Matching and Viterbi Path Searching”, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, USA, 20 pages, Jun. 17-22, 2007. |
Munoz-Salinas et al., “People Detection and Tracking Using Stereo Vision and Color”, Image and Vision Computing, vol. 25, No. 6, pp. 995-1007, Jun. 1, 2007. |
Bradski, G., “Computer Vision Face Tracking for Use in a Perceptual User Interface”, Intel Technology Journal, 15 pages, vol. 2, issue 2 (2nd Quarter 2008). |
Kaewtrakulpong et al., “An Improved Adaptive Background Mixture Model for Real-Time Tracking with Shadow Detection”, Proceedings of the 2nd European Workshop on Advanced Video Based Surveillance Systems (AVBS'01), Kingston, UK, 5 pages, Sep. 2001. |
Kolsch et al., “Fast 2D Hand Tracking with Flocks of Features and Multi-Cue Integration”, IEEE Workshop on Real-Time Vision for Human Computer Interaction (at CVPR'04), Washington, USA, 8 pages, Jun. 27-Jul. 2, 2004. |
Shi et al., “Good Features to Track”, IEEE Conference on Computer Vision and Pattern Recognition, pp. 593-600, Seattle, USA, Jun. 21-23, 1994. |
Vosselman et al., “3D Building Model Reconstruction From Point Clouds and Ground Plans”, International Archives of Photogrammetry and Remote Sensing, vol. XXXIV-3/W4, pp. 37-43, Annapolis, USA, Oct. 22-24, 2001. |
Submuth et al., “Ridge Based Curve and Surface Reconstruction”, Eurographics Symposium on Geometry Processing, Barcelona, Spain, 9 pages, Jul. 4-6, 2007. |
Fergus et al., “Object Class Recognition by Unsupervised Scale-Invariant Learning”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 264-271, Jun. 18-20, 2003. |
Cohen et al., “Inference of Human Postures by Classification of 3D Human Body Shape”, IEEE International Workshop on Analysis and Modeling of Faces and Gestures, ICCV 2003, Nice, France, 8 pages, Oct. 14-17, 2003. |
Agarwal et al., “3D Human Pose from Silhouettes by Relevance Vector Regression”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 882-888, Jun. 27-Jul. 2, 2004. |
Borenstein et al., “Combining Top-down and Bottom-up Segmentation”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 8 pages, Jun. 27-Jul. 2, 2004. |
Karlinsky et al., “Combined Model for Detecting, Localizing, Interpreting and Recognizing Faces”, Faces in Real-Life Images workshop, European Conference on Computer Vision, France, 14 pages, Oct. 12-18, 2008. |
Ullman, S., “Object Recognition and Segmentation by a Fragment-Based Hierarchy”, Trends in Cognitive Sciences, vol. 11, No. 2, pp. 58-64, Feb. 2007. |
Shakhnarovich et al., “Fast Pose Estimation with Parameter Sensitive Hashing”, Proceedings of the 9th IEEE International Conference on Computer Vision (ICCV 2003), pp. 750-759, Nice, France, Oct. 14-17, 2003. |
Ramanan et al., “Training Deformable Models for Localization”, Proceedings of the 2006 IEEE Conference on Computer Vision and Pattern Recognition, pp. 206-213, New York, USA, Jun. 17-22, 2006. |
Ramanan, D., “Learning to Parse Images of Articulated Bodies”, Neural Information Processing Systems Foundation, 8 pages, year 2006. |
Jiang, H., “Human Pose Estimation Using Consistent Max-Covering”, 12th IEEE International Conference on Computer Vision, Kyoto, Japan, 8 pages, Sep. 27-Oct. 4, 2009. |
Shotton et al., “Real-Time Human Pose Recognition in Parts from Single Depth Images”, 24th IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs, USA, 8 pages, Jun. 20-25, 2011. |
Rodgers et al., “Object Pose Detection in Range Scan Data”, IEEE Conference on Computer Vision and Pattern Recognition, pp. 2445-2452, New York, USA, Jun. 17-22, 2006. |
Gordon et al., “Face recognition based on depth maps and surface curvature”, Proceedings of SPIE Geometric methods in Computer Vision, vol. 1570, pp. 234-247, Sep. 1, 1991. |
Kim et al., “Real-time normalization and feature extraction of 3D face data using curvature characteristics”, Proceedings 10th IEEE International Workshop on Robot and Human Interactive Communication, pp. 74-79, Sep. 18-21, 2001. |
Colombo et al., “3D face detection using curvature analysis”, Pattern Recognition, vol. 39, No. 3, pp. 444-455, Mar. 1, 2006. |
Alyuz et al., “Regional Registration for Expression Resistant 3-D Face Recognition”, IEEE Transactions on Information Forensics and Security, vol. 5, No. 3, pp. 425-440, Sep. 1, 2010. |
Lee et al., “Matching range images of human faces”, Proceedings of 3rd International Conference on Computer Vision, vol. 3, pp. 722-726, Dec. 4-7, 1990. |
Alyuz et al., “Robust 3D face recognition in the presence of realistic occlusions”, 5th IAPR International Conference on Biometrics (ICB), pp. 111-118, Mar. 29-Apr. 1, 2012. |
International Application # PCT/US2017/039172 search report dated Sep. 15, 2017. |
Ren et al., “Real-time modeling of 3-D soccer ball trajectories from multiple fixed cameras”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 18, No. 3, pp. 350-362, Mar. 2008. |
Li et al., “Statistical modeling of complex backgrounds for foreground object detection”, IEEE Transactions on Image Processing, vol. 13, No. 11, pp. 1459-1472, Nov. 2004. |
Grzeszczuk et al., “Stereo based gesture recognition invariant for 3D pose and lighting”, Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 826-833, Jun. 13-15, 2000. |
Ess et al., “Improved multi-person tracking with active occlusion handling”, ICRA Workshop on People Detection and Tracking, pp. 1-6, 2009. |
Cucchiara et al., “Track-based and object-based occlusion for people tracking refinement indoor surveillance”, VSSN, pp. 1-7, 2004. |
Krumm et al., “Multi-camera multi person tracking for EasyLiving”, Visual surveillance, 2000, Proceedings, Third International workshop, pp. 1-8, 2000. |
Yous et al., “People detection and tracking with World-Z map from single stereo camera”, Visual surveillance, 2008, Eighth International workshop, pp. 1-8, 2008. |
Damen et al., “Detecting carried objects in short video sequences”, ECCV, School of computing, University of Leeds, pp. 1-14, 2008. |
Ran et al., “Multi moving people detection from binocular sequences”, Center for Automation Research Institute of Advanced Computer Studies, University of Maryland, pp. 1-4, 2003. |
Balcells et al., “An appearance-based approach for consistent labeling of humans and objects in video”, Pattern Analysis and Application, pp. 373-385, 2004. |
Number | Date | Country
---|---|---
20180082109 A1 | Mar 2018 | US

Number | Date | Country
---|---|---
62396839 | Sep 2016 | US