Plural detector time-of-flight depth mapping

Information

  • Patent Grant
  • Patent Number
    8,803,952
  • Date Filed
    Monday, December 20, 2010
  • Date Issued
    Tuesday, August 12, 2014
Abstract
A depth-mapping method comprises exposing first and second detectors oriented along different optical axes to light dispersed from a scene, and furnishing an output responsive to a depth coordinate of a locus of the scene. The output increases with an increasing first amount of light received by the first detector during a first period, and decreases with an increasing second amount of light received by the second detector during a second period different than the first.
Description
BACKGROUND

Time-of-flight (TOF) depth mapping is a known approach for constructing a three-dimensional (3D) virtual model of a scene or subject. Encouraged by ever-improving digital-imaging technology and the availability of low-cost pulsed illumination, this approach is now used in applications ranging from aircraft navigation to robotics to video gaming. Despite such broad applicability, the cost of conventional TOF depth-mapping systems increases sharply with available depth resolution, particularly in the one-to-ten meter depth range. At these distances, the resolution may be affected by subject motion and by parallax error when non-optically aligned detectors are employed.


SUMMARY

One embodiment of this disclosure provides a depth-mapping method. The method comprises exposing first and second detectors oriented along different optical axes to light dispersed from a scene, and furnishing an output responsive to a depth coordinate of a locus of the scene. The output increases with an increasing first amount of light received by the first detector during a first period, and decreases with an increasing second amount of light received by the second detector during a second period different than the first.


The summary above is provided to introduce a selected part of this disclosure in simplified form, not to identify key or essential features. The claimed subject matter, defined by the claims, is limited neither to the content of this summary nor to implementations that address problems or disadvantages noted herein.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 schematically shows an example environment for depth mapping in accordance with an embodiment of this disclosure.



FIG. 2 schematically shows an example vision-system detector in accordance with an embodiment of this disclosure.



FIG. 3 schematically shows an example vision system and a subject in accordance with an embodiment of this disclosure.



FIG. 4 illustrates an example temporal relationship between light pulses emitted and detected by a vision system in accordance with an embodiment of this disclosure.



FIG. 5 illustrates an example depth-mapping method in accordance with an embodiment of this disclosure.



FIG. 6 illustrates an example method for computing a depth map based on first image S and second image M, in accordance with an embodiment of this disclosure.



FIG. 7 illustrates example first and second images in accordance with an embodiment of this disclosure.



FIG. 8 illustrates an example method for enacting an iteration routine to improve depth-mapping accuracy, in accordance with an embodiment of this disclosure.





DETAILED DESCRIPTION

Aspects of this disclosure will now be described by example and with reference to the illustrated embodiments listed above. Components, process steps, and other elements that may be substantially the same in one or more embodiments are identified coordinately and are described with minimal repetition. It will be noted, however, that elements identified coordinately may also differ to some degree. It will be further noted that the drawing figures included herein are schematic and generally not drawn to scale. Rather, the various drawing scales, aspect ratios, and numbers of components shown in the figures may be purposely distorted to make certain features or relationships easier to see.



FIG. 1 shows an example environment in which depth mapping may be used to advantage. The drawing shows subject 10 interacting with vision system 12. In the illustrated embodiment, the vision system is a component of a video-game system, and the subject is a video gamer. The vision system is configured to detect the positions, movements, and/or gestures of the subject and to furnish the same as input to a video-game application. The vision system is further configured to direct video output from the video-game application to display 14.


To provide a richer input, more suggestive of a virtual reality, vision system 12 is configured to detect and furnish the positions, movements, and/or gestures of the subject in three dimensions (3D). Such dimensions may correspond, for instance, to Cartesian coordinates X, Y, and Z. As described herein, 3D detection may be accomplished via depth mapping. Depth mapping associates a depth coordinate Z with a corresponding pixel (X, Y) in a plane image of a scene. This process maps a plurality of loci of the imaged scene in 3D, providing a depth coordinate for each locus of the imaged scene. The scene, as in the present example, may include a stationary or moving subject.


Although FIG. 1 and subsequent drawings illustrate depth mapping as applied to video gaming, other applications are contemplated as well, and are equally embraced by this disclosure. Such applications include control of non-game applications and operating systems, autonomous vehicle guidance, robotics, and range finding, among numerous other examples. In FIG. 1, vision system 12 is oriented opposite subject 10. The vision system and the subject may be separated by any suitable distance. The vision system, for example, may be two to four meters away from the subject.


Vision system 12 includes illumination source 16 and first detector 18. In the illustrated embodiment, both the illumination source and the first detector are coupled at the front face of the vision system, opposite subject 10.


Illumination source 16 is an intensity-modulated source configured to emit a train of narrow pulses of suitably intense light. This light, reflected from subject 10, is imaged by first detector 18. In some embodiments, the illumination source may be pulse-modulated with a pulse width of fifteen to twenty nanoseconds. In some embodiments, the illumination source may be configured to emit infrared (IR) or near-infrared (NIR) light. To this end, the illumination source may comprise a pulsed IR or NIR laser. In these and other embodiments, the illumination source may comprise one or more IR or NIR light-emitting diodes (LEDs).


First detector 18 is configured inter alia to acquire a plane image of the scene that includes subject 10. FIG. 2 shows an embodiment of the first detector in schematic detail. The first detector includes lens 20, which focuses light from the scene through filter 22 and aperture 24, and onto detector array 26. The filter may be any suitable optical filter configured to limit the range of wavelengths and/or polarization states of the imaged light. It may comprise an interference filter, a color filter, and/or a polarizing filter. In this manner, the filter may reduce the degree to which ambient light interferes with vision system 12.


Detector array 26 may comprise any suitable ensemble of photosensitive elements—photodiode or charge-coupled device (CCD) elements, for example. Accordingly, the image formed by the first detector may comprise a rectangular array of pixels. The detector array is coupled to electronic shutter 28, which opens and closes at the command of controller 30. Controller 30 may be any suitable electronic control system of first detector 18 and/or vision system 12. When the electronic shutter is open, photon flux received in one or more of the photosensitive elements may be integrated as electric charge; when the electronic shutter is closed, the integration of the photon flux may be suspended. Accordingly, the electronic shutter may be commanded to open for a suitable period of time and close thereafter to accumulate a plane image of the scene or subject, or a portion thereof.


In some embodiments, controller 30 may be configured to synchronize the opening and closure of electronic shutter 28 to the pulse train from illumination source 16. In this way, it can be ensured that a suitable amount of reflected light from the illumination source reaches first detector 18 while electronic shutter 28 is open. Synchronization of the electronic shutter to the illumination source may enable other functionality as well, as described hereinafter.


Continuing in FIG. 2, controller 30 is configured to receive and process image data from detector array 26. The controller may receive other forms of input as well, and may be further configured to enact any computation, processing, or control function of vision system 12 or of the device in which the vision system is installed.


Depth mapping with vision system 12 will now be described with reference to FIGS. 3 and 4. FIG. 3 shows aspects of subject 10 and vision system 12 from above, while FIG. 4 illustrates a temporal relationship between light pulses emitted and detected by the vision system.


As shown in FIG. 3, some loci of subject 10 may be positioned relatively close to vision system 12, at a small value of depth coordinate Z. Other loci may be positioned relatively far from the vision system, at a large value of the depth coordinate. Solid line 32 in FIG. 4 shows an example profile of a light pulse emitted from illumination source 16. In some embodiments, the full-width at half-maximum (FWHM) of the emitted pulse may be fifteen to twenty nanoseconds (ns). The pulse from the illumination source illuminates substantially all loci of the subject, both near and far, then reflects back to detector 18. However, light reflected from a relatively close, shallow locus will be received and detected more promptly than light reflected from a farther, deeper locus. Accordingly, dashed line 34 in FIG. 4 shows an example response from first detector 18 on receiving light reflected from a shallow locus, two meters from the vision system. Dot-dashed line 36 in FIG. 4 shows an analogous response from the first detector on receiving light reflected from a deeper locus, four meters from the vision system. In general, the period of time between the illumination pulse and the detector pulse is proportional to the round-trip distance from the illumination source to the locus that reflects the light, and back to the detector. Therefore, by timing the arrival of the detector pulse corresponding to a given locus, the distance out to that locus may be computed. This summarizes the so-called time-of-flight (TOF) approach to depth mapping.
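By way of illustration, the time-of-flight relationship described above may be sketched numerically. The following is an illustrative sketch only; the function names are hypothetical and do not appear in the disclosure:

```python
# Illustrative time-of-flight arithmetic (hypothetical helper names).
# A reflection delayed by t seconds implies a depth of c*t/2, because
# the light travels out to the reflecting locus and back.

C = 299_792_458.0  # speed of light in vacuum, m/s

def depth_from_delay(delay_s):
    """Depth of the reflecting locus, given the round-trip delay in seconds."""
    return C * delay_s / 2.0

def delay_from_depth(depth_m):
    """Round-trip delay, in seconds, for a locus at the given depth in meters."""
    return 2.0 * depth_m / C
```

For a locus two meters from the vision system the round-trip delay is about 13.3 nanoseconds; for four meters, about 26.7 nanoseconds, comparable to the fifteen-to-twenty nanosecond pulse widths noted above.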


A convenient, indirect way to time the arrival of reflected light at a detector is to open an electronic shutter of the detector during a finite interval defined relative to the illumination pulse, and to integrate the flux of light received at the detector during that interval. To illustrate this approach, two intervals are marked in FIG. 4—a first interval S and an overlapping, second interval M of longer duration. The shutter may be open during the interval marked S. In this case, the integrated response of the detector will increase with increasing depth of the reflecting locus in the two-to-four meter depth range, and will reach a maximum when the depth is four meters.


This simple approach may be refined to compensate for differences in reflectivity among the various loci of the subject. In particular, the detector may be held open during a second, longer interval, such as the interval marked M in FIG. 4. The ratio of the integrated detector response during the interval S to the integrated response during the interval M may be computed and used as an indication of depth.
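The gated-integration scheme may be sketched with a simple model. Here the reflected pulse is treated as rectangular and the gate timings (in nanoseconds) are hypothetical values chosen only for illustration:

```python
# Illustrative model of gated (shuttered) detection. The reflected pulse
# is modeled as rectangular; the detector integrates it only while a gate
# is open. Taking the ratio of the short-gate integral S to the long-gate
# integral M cancels the unknown reflectivity (amplitude) of the locus.
# All names and timings (nanoseconds) are hypothetical.

def gated_integral(pulse_start, pulse_width, gate_open, gate_close, amplitude):
    """Energy collected while the gate overlaps the reflected pulse."""
    overlap = min(gate_close, pulse_start + pulse_width) - max(gate_open, pulse_start)
    return amplitude * max(0.0, overlap)

def gate_ratio(pulse_start, pulse_width, s_gate, m_gate, amplitude=1.0):
    """Ratio of the S-gate integral to the M-gate integral."""
    s = gated_integral(pulse_start, pulse_width, s_gate[0], s_gate[1], amplitude)
    m = gated_integral(pulse_start, pulse_width, m_gate[0], m_gate[1], amplitude)
    return s / m
```

With a late-opening S gate, the ratio grows as the reflection arrives later, so it increases with the depth of the locus while remaining unchanged if the reflection is merely brighter or dimmer.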


The ratiometric TOF approach outlined above admits of numerous variants, as the reader will appreciate. For example, two adjacent, non-overlapping intervals may be used instead of the overlapping intervals noted above. In general, normalizing a gated detector response via multiple discrete measurements corrects for inhomogeneous or anisotropic reflectivity of the subject. A plurality of measurements can be made sequentially, using a single detector, or concurrently, using multiple detectors. With multiple detectors, the plurality of measurements may be extracted from multiple (e.g., first and second) images of the same scene, formed from light of the same illumination pulse. Accordingly, FIG. 3 shows second detector 18′ coupled at the front face of vision system 12. The second detector, and the images formed therein, may be substantially the same as the first. As shown in the drawing, however, first detector 18 and second detector 18′ are oriented along different (i.e., non-collinear) optical axes due to their separation. In some embodiments, the first and second detectors may be separated by two to twenty centimeters, although virtually any spacing is within the scope of this disclosure.


Both sequential and concurrent detection approaches pose disadvantages that may limit depth resolution. A disadvantage of sequential measurements is that the subject may move or transform non-negligibly between successive measurements; a disadvantage of multiple detectors is loss of depth resolution due to parallax error. Parallax error may result when multiple detectors oriented along different optical axes are used to image the same scene or subject.


One way to avoid parallax error is to couple first and second detectors with suitable beam-splitting optics so that they share a common optical axis. This approach, however, presents additional disadvantages. First, the beam-splitting optics may be expensive and require careful alignment, thereby increasing the production cost of the vision system. Second, any beam-splitting approach will make inefficient use of the available illumination flux and aperture area, for it distributes the same reflection among different detectors instead of allowing each detector to receive a full reflection.


To address these issues while providing still other advantages, this disclosure describes various depth-mapping methods. These methods are enabled by and described with continued reference to the above configurations. It will be understood, however, that the methods here described, and others fully within the scope of this disclosure, may be enabled by other configurations as well. The methods may be executed any time vision system 12 is operating, and may be executed repeatedly. Naturally, each execution of a method may change the entry conditions for a subsequent execution and thereby invoke complex decision-making logic. Such logic is fully contemplated in this disclosure.


Some of the process steps described and/or illustrated herein may, in some embodiments, be omitted without departing from the scope of this disclosure. Likewise, the indicated sequence of the process steps may not always be required to achieve the intended results, but is provided for ease of illustration and description. One or more of the illustrated actions, functions, or operations may be performed repeatedly, depending on the particular strategy being used.


The approaches described herein may be used to map scenes of a wide range of depths, and are not limited to the specific examples provided herein. They may be used, for example, in the one-to-ten meter depth range—viz., where a shallowest locus of the scene is more than one meter from the first detector, and a deepest locus of the scene is less than ten meters from the first detector. FIG. 5 illustrates an example depth-mapping method 38. The method begins by exposing first and second detectors oriented along different optical axes to light dispersed from a scene.


At 40, therefore, an illumination source (e.g., illumination source 16) emits an illumination pulse directed to a scene. The illumination pulse may be a narrow (e.g., fifteen to twenty nanoseconds) pulse from a laser or LED array, as described above. At 42 a first image S is acquired at the first detector. At 44 a second image M is acquired at the second detector. In some embodiments, steps 42 and 44 may be enacted concurrently; in another embodiment, they may be enacted sequentially—e.g., using two closely spaced, consecutive pulses of the illumination source. For efficient use of the available illumination power and aperture size, the first and second detectors may each comprise a complete detector array (e.g., detector array 26 as described above). In other embodiments, however, the first and second detectors may detect light in respective first and second regions of the same detector array. This may correspond, for example, to a case where the detector array is operated in a mode where the first and second regions sight roughly the same part of the scene. In one particular embodiment, the detector may be operated in an interlaced mode, where half of the lines detect S, and the other half detects M. At 46 a depth map is computed based on the first and second images, as further described below. From 46, method 38 returns.



FIG. 6 illustrates an example method 46 for computing a depth map based on first image S and second image M. At 48 the scene to be mapped is divided into N slices of depth Z1, . . . , ZI, . . . , ZN, as shown in FIG. 3. There, the scene is divided into mutually parallel slices normal to the optical axes of first detector 18 and second detector 18′. In other embodiments, the scene may be divided differently—in radial shells equidistant from either detector or any other point on the vision system, for example. The scene may be divided into any number of intervals of any suitable size, including equal size. In some embodiments, however, the scene may be divided into intervals sized equally in reciprocal space—viz.,







ZI = 1/[1/ZN + ((I − 1)/(N − 1))·(1/Z1 − 1/ZN)].
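The reciprocal-space slicing may be sketched as follows; the function name is hypothetical, and the sketch merely evaluates the equation above:

```python
# Sketch of depth slices sized equally in reciprocal space: the quantity
# 1/Z is spaced uniformly between 1/Z_N and 1/Z_1. The function name is
# hypothetical; the index I runs from 1 to N, as in the equation above.

def slice_depth(i, z1, zn, n):
    """Depth of slice I for a scene divided into N reciprocal-space slices."""
    return 1.0 / (1.0 / zn + ((i - 1) / (n - 1)) * (1.0 / z1 - 1.0 / zn))
```

Spacing the slices uniformly in 1/Z allots finer depth resolution where 1/Z changes fastest, i.e., at the near end of the mapped range.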





Returning now to FIG. 6, at 50 a pixel (U, V) of first image S is selected. Each pixel of the first image may be selected consecutively, by looping through the pixels of the first image. At 52 a depth slice ZI is selected. Each of the N depth slices may be selected consecutively, by looping through the series of depth slices defined above. At 54 pixel (U, V) of the first image is projected to coordinates (X, Y, ZI) via a geometric mapping function of the first detector. At 56 coordinates (X, Y, ZI) are collapsed to a pixel (U′, V′) of second image M via a geometric mapping function of the second detector, as illustrated in FIG. 7. With reference to the known distance between the first and second detectors, the geometric mapping functions may apply trigonometric relationships to project 2D coordinates from the first image to 3D coordinates and to collapse the 3D coordinates down to 2D coordinates of the second image. In this manner a series of candidate pixels of the second image is enumerated.
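The projection and collapse steps admit a simple sketch under a pinhole-camera assumption with parallel optical axes; the focal length F and baseline B below are illustrative values, not parameters of the disclosure:

```python
# Sketch of projecting a pixel of the first image to 3D at a candidate
# depth, then collapsing to a pixel of the second image. Both detectors
# are modeled as pinhole cameras with parallel optical axes separated by
# a horizontal baseline B; focal length F is in pixel units. The values
# and function names are illustrative assumptions.

F = 500.0   # focal length in pixels (assumed)
B = 0.1     # detector separation in meters (assumed)

def project(u, v, z):
    """Pixel (u, v) of the first detector -> 3D point (x, y, z) at depth z."""
    return (u * z / F, v * z / F, z)

def collapse(x, y, z):
    """3D point -> pixel (u', v') of the second detector."""
    return ((x - B) * F / z, y * F / z)
```

Under this model the candidate pixel shifts horizontally by the disparity F·B/z, which shrinks as the assumed depth grows, which is why each candidate depth picks out a different candidate pixel of the second image.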


Returning again to FIG. 6, at 58 a depth measurement Z′I is computed via a time-of-flight computation based on pixel (U′I, V′I) of second image M and pixel (U, V) of first image S—viz.,

Z′I = fTOF[S(U, V), M(U′I, V′I)],

where S(U, V) and M(U′I, V′I) represent the integrated intensities of the selected pixels of the first and second images, respectively, and fTOF is a suitable TOF function. In this and other embodiments, the computed Z′I increases with an increasing first amount of light received by the first detector during a first period S, and decreases with an increasing second amount of light received by the second detector during a second period M. Here, the first amount of light is a brightness integrated at a first pixel of the first image, and the second amount of light is a brightness integrated at a second pixel of the second image.


In one example, fTOF may be linear in the ratio of the integrated intensities—i.e.,







Z′I = Z1 + (ZN − Z1)·S(U, V)/M(U′I, V′I).







Thus, the depth output may vary substantially linearly with a ratio of the first amount of light to the second amount of light.
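The linear TOF function may be sketched directly; the name is hypothetical, and the arguments s and m stand for the integrated pixel intensities:

```python
# Sketch of the linear ratiometric TOF function. The arguments s and m
# stand for the integrated intensities S(U, V) and M(U', V'); z1 and zn
# bound the mapped depth range. The function name is hypothetical.

def f_tof_linear(s, m, z1, zn):
    """Depth estimate, linear in the ratio of integrated intensities s/m."""
    return z1 + (zn - z1) * s / m
```

As stated in the summary, the output increases with the first amount of light s and decreases with the second amount m.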


At 60 the level of agreement AI between ZI and Z′I is assessed. The level of agreement may be quantified in any suitable manner. In one example,

AI = −|ZI − Z′I|.


In other examples, the level of agreement may be assessed differently. For example, the level of agreement may be assessed by measuring the distance between the pixel positions corresponding to the same locus in the two different detectors. Once the TOF depth is evaluated for a given slice based on first-detector mapping, one may collapse the projected locus down to a pixel position of the second detector. Here, AI may decrease with increasing distance between (U, V) and (U′, V′).


At 62 it is determined whether each depth slice has been selected. If each depth slice has not been selected, then the method returns to 52, where the next depth slice is selected. Otherwise, the method advances to 64. At 64 a depth slice J is found for which the computed agreement AJ is greatest. At 66 a depth value of Z′J is assigned to pixel (U, V) of first image S. In some embodiments, this depth value may be assigned instead to pixel (U′, V′) of second image M. In yet another embodiment, this same depth value may be assigned to the indicated pixels of both images. Thus, from the enumerated series of candidate pixels of the second image, one pixel is selected such that the computed TOF depth value indicates a depth of a locus most closely mappable to the first and second pixels.
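Steps 52 through 66 amount to a search over candidate slices for the slice of greatest agreement, as the following sketch suggests. The callable supplied for the TOF computation is a placeholder for the projection, collapse, and fTOF steps described above:

```python
# Sketch of selecting the depth of greatest agreement (steps 52-66).
# slice_depths holds the candidate depths Z_1..Z_N; tof_depth_at is a
# caller-supplied placeholder for the project/collapse/TOF computation.

def best_agreement_depth(slice_depths, tof_depth_at):
    """Return the computed TOF depth whose slice shows greatest agreement."""
    best_a = float("-inf")
    best_z = None
    for z_i in slice_depths:
        z_prime = tof_depth_at(z_i)     # steps 54-58: project, collapse, TOF
        a = -abs(z_i - z_prime)         # step 60: agreement A_I = -|Z_I - Z'_I|
        if a > best_a:                  # step 64: slice J of greatest agreement
            best_a, best_z = a, z_prime
    return best_z                       # step 66: value assigned to pixel (U, V)
```

A constant TOF result, for instance, is matched by the slice nearest that result; in general the winning slice is the one whose assumed depth is most consistent with the depth it implies.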


In the illustrated embodiment, an iteration routine is invoked at 68 to improve the accuracy of the depth mapping. An example iteration routine is described below in the context of FIG. 8. In other embodiments, the iteration routine may be omitted.


Continuing in FIG. 6, at 70 it is determined whether each pixel in first image S has been selected. If each pixel of the first image has not been selected, then the method returns to 50. Otherwise, the method advances to 72. At 72 pixels with invalid depth mapping are flagged. In general, a depth output may be flagged as invalid when the locus most closely mappable to the first and second pixels is outside of a predefined range. In some embodiments, the depth mapping of a pixel may be flagged as invalid if the maximum agreement AJ computed at 64 is below a threshold value. In another embodiment, the depth mapping of a pixel may be flagged as invalid if an iteration routine (vide infra) fails to converge after a maximum number of iterations is reached. In still other embodiments, depth mapping invalidity may be assessed globally, by comparing computed depths from adjacent or nearby pixels in the first or second images. In particular, when the depth changes abruptly or discontinuously at a pixel, the depth of that pixel may be flagged as invalid. Thus, the predefined range of valid depth may be defined based in part on an indicated depth of a neighboring locus of the scene. From 72, the method returns.
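The neighbor-based validity check may be sketched as follows for a single row of depth values; the jump threshold is an illustrative assumption:

```python
# Sketch of flagging abrupt depth changes as invalid. A pixel whose depth
# differs from its neighbor's by more than a jump threshold is marked
# invalid; the threshold value and function name are illustrative.

def flag_discontinuities(depth_row, max_jump=0.5):
    """Return a list of booleans: True where the depth is considered valid."""
    valid = [True] * len(depth_row)
    for i in range(1, len(depth_row)):
        if abs(depth_row[i] - depth_row[i - 1]) > max_jump:
            valid[i] = False
    return valid
```

In a full implementation the comparison would run over a 2D neighborhood of each pixel, and the threshold could itself depend on the indicated depth of the neighboring locus.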



FIG. 8 illustrates an example method 68 for enacting an iteration routine to improve the accuracy of the depth-mapping procedure described above. At 74, ZI is replaced by Z′J. At 54 pixel (U, V) of first image S is projected to coordinates (X, Y, ZI) via a geometric mapping function of the first detector. At 56 coordinates (X, Y, ZI) are collapsed to a pixel (U′, V′) of second image M via a geometric mapping function of the second detector. At 58 a depth measurement Z′I is computed via a time-of-flight computation based on pixel (U′I, V′I) of second image M and pixel (U, V) of first image S. In this manner, a running depth value may be recomputed, using, as the second amount of light, a brightness integrated at the refined second pixel of the second image. At 76 it is determined whether ZI and Z′I differ by more than a threshold amount. If so, then method 68 advances to 78, where it is determined whether the maximum number of iterations has been reached. If the maximum number of iterations has not been reached, then the method advances to 80, where ZI is replaced by Z′I, and execution continues at 54. Thus, the actions of projecting, collapsing, and recomputing may be repeated for a finite number of iterations, or until the output has converged.
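The refinement loop of method 68 may be sketched as a bounded fixed-point iteration; the recompute callable stands in for the project, collapse, and TOF steps, and the threshold and iteration budget below are illustrative values:

```python
# Sketch of iterative depth refinement (method 68). The running depth z
# is repeatedly recomputed until successive values agree within a
# threshold; if the iteration budget is exhausted first, the result is
# flagged invalid. Names, threshold, and budget are illustrative.

def refine_depth(z0, recompute, threshold=0.01, max_iters=10):
    """Return (depth, valid). recompute maps a running depth to a new
    TOF depth via the project/collapse steps (supplied by the caller)."""
    z = z0
    for _ in range(max_iters):
        z_new = recompute(z)
        if abs(z_new - z) <= threshold:
            return z_new, True       # converged: assign to pixel (U, V)
        z = z_new
    return z, False                  # did not converge: flag invalid
```

A recompute function that contracts toward a fixed depth converges quickly; one that drifts without settling exhausts the budget and is flagged, mirroring steps 76 through 82.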


However, if the maximum number of iterations has been reached at 78, then the method advances to 82, where the computed depth mapping for pixel (U, V) of the first image is flagged as invalid. Thus, the depth output may be invalidated if the output does not converge in the finite number of iterations. From this point, or from 76 if it was determined that ZI and Z′I do not differ by more than the threshold amount, method 68 advances to 84. At 84, a depth value of Z′I is assigned to pixel (U, V) of first image S, analogous to the assignment made at 66 of method 46. From 84, method 68 returns.


Although the foregoing methods are illustrated without reference to explicit alignment of the first and second images, such alignment may be enacted in various ways. For example, mapping a representative set of loci distributed over the scene would supply data that could be used to construct an appropriate function for mapping the pixels of the second image onto the first, or vice versa.


As noted above, the methods and functions described herein may be enacted via controller 30, shown schematically in FIG. 2. The illustrated controller includes logic subsystem 86 operatively coupled to memory subsystem 88. Memory subsystem 88 may hold instructions that cause logic subsystem 86 to enact the various methods. To this end, the logic subsystem may include one or more physical devices configured to execute instructions. For example, the logic subsystem may be configured to execute instructions that are part of one or more programs, routines, objects, components, data structures, or other logical constructs. Such instructions may be implemented to perform a task, implement a data type, transform the state of one or more devices, or otherwise arrive at a desired result. The logic subsystem may include one or more processors configured to execute software instructions. Additionally or alternatively, the logic subsystem may include one or more hardware or firmware logic machines configured to execute hardware or firmware instructions. The logic subsystem may optionally include components distributed among two or more devices, which may be remotely located in some embodiments.


Memory subsystem 88 may include one or more physical, non-transitory devices configured to hold data and/or instructions executable by logic subsystem 86 to implement the methods and functions described herein. When such methods and functions are implemented, the state of the memory subsystem may be transformed (e.g., to hold different data). The memory subsystem may include removable media and/or built-in devices. The memory subsystem may include optical memory devices, semiconductor memory devices, and/or magnetic memory devices, among others. The memory subsystem may include devices with one or more of the following characteristics: volatile, nonvolatile, dynamic, static, read/write, read-only, random access, sequential access, location addressable, file addressable, and content addressable. In one embodiment, the logic subsystem and the memory subsystem may be integrated into one or more common devices, such as an application-specific integrated circuit (ASIC) or so-called system-on-a-chip. In another embodiment, the memory subsystem may include computer-system readable removable media, which may be used to store and/or transfer data and/or instructions executable to implement the herein-described methods and processes. Examples of such removable media include CDs, DVDs, HD-DVDs, Blu-ray discs, EEPROMs, and/or floppy disks, among others.


In contrast, in some embodiments aspects of the instructions described herein may be propagated in a transitory fashion by a pure signal—e.g., an electromagnetic signal, an optical signal, etc.—that is not held by a physical device for at least a finite duration. Furthermore, data and/or other forms of information pertaining to the present disclosure may be propagated by a pure signal.


The terms ‘module’ and ‘engine’ may be used to describe an aspect of controller 30 that is implemented to perform one or more particular functions. In some cases, such a module or engine may be instantiated via logic subsystem 86 executing instructions held by memory subsystem 88. It will be understood that different modules and/or engines may be instantiated from the same application, code block, object, routine, and/or function. Likewise, the same module and/or engine may be instantiated by different applications, code blocks, objects, routines, and/or functions in some cases.



FIG. 2 also shows controller 30 operatively coupled to the components of a user interface, which includes various input devices and output devices, such as display 14. Display 14 may provide a visual representation of data held by memory subsystem 88. As the herein-described methods and processes change the data held by the memory subsystem, and thus transform the state of the memory subsystem, the state of the display may likewise be transformed to visually represent changes in the underlying data. The display may include one or more display devices utilizing virtually any type of technology. Such display devices may be combined with logic subsystem 86 and/or memory subsystem 88 in a shared enclosure, or such display devices may be peripheral display devices.


Finally, it will be understood that the articles, systems, and methods described hereinabove are embodiments of this disclosure—non-limiting examples for which numerous variations and extensions are contemplated as well. Accordingly, this disclosure includes all novel and non-obvious combinations and sub-combinations of the articles, systems, and methods disclosed herein, as well as any and all equivalents thereof.

Claims
  • 1. A depth-mapping method comprising: directing modulated infrared or near-infrared illumination onto a scene; exposing first and second detectors oriented along different optical axes to light dispersed from the scene; for each of a series of candidate depths into the scene, mapping a pixel of the first detector to 3D coordinates having the candidate depth, reverse mapping the 3D coordinates to a candidate pixel of the second detector, and computing a depth based on an integrated response from the pixel of the first detector versus the candidate pixel of the second detector, the computed depth increasing with an increasing amount of light received at the pixel of the first detector during a first period and decreasing with an increasing amount of light received at the candidate pixel of the second detector during a second period; for each of the series of candidate depths, assessing agreement between the candidate depth and the corresponding computed depth; and furnishing an output associated with the pixel of the first detector, the output based on the computed depth of closest agreement to the corresponding candidate depth.
  • 2. The method of claim 1, wherein the scene includes a plurality of loci, and wherein the output is one of a plurality of outputs corresponding to the plurality of loci, each output indicating a depth of its corresponding locus.
  • 3. The method of claim 1, wherein the first detector forms a first image of the scene and the second detector forms a second image of the scene, wherein the first and second images each comprise a rectangular array of pixels, and wherein the first and second periods overlap and differ in duration.
  • 4. The method of claim 1 further comprising: mapping the pixel of the first detector to 3D coordinates having the computed depth of closest agreement to the corresponding candidate depth; reverse mapping the 3D coordinates to a refined second pixel of the second detector; computing a refined depth based on an integrated response from the pixel of the first detector versus the refined second pixel of the second detector; and furnishing a recomputed output associated with the pixel of the first detector, the recomputed output based on the refined depth.
  • 5. The method of claim 4 further comprising repeating said mapping, reverse mapping, and computing for a finite number of iterations, or until the recomputed output has converged.
  • 6. The method of claim 5 further comprising invalidating the recomputed output if the recomputed output does not converge in the finite number of iterations.
  • 7. The method of claim 1, wherein the output varies substantially linearly with a ratio of the first amount of light to the second amount of light.
  • 8. The method of claim 1 further comprising directing pulsed illumination onto the scene.
  • 9. The method of claim 1 further comprising invalidating the output when the computed depth of closest agreement to the corresponding candidate depth is outside of a predefined range.
  • 10. The method of claim 9, wherein the predefined range is defined based on a depth of a neighboring locus of the scene.
  • 11. The method of claim 3 further comprising aligning the second image to the first image.
  • 12. The method of claim 2, wherein a shallowest locus of the scene is more than one meter from the first detector, and a deepest locus of the scene is less than ten meters from the first detector.
  • 13. A vision system comprising: a modulated illumination source configured to illuminate a scene using modulated infrared or near-infrared illumination; first and second detectors oriented along different optical axes and arranged to detect light dispersed from the scene, such light including the modulated infrared or near-infrared illumination reflected back from the scene, with wavelength-responsiveness of the first and second detectors limited substantially to the modulated infrared or near-infrared illumination; a controller operatively coupled to the first and second detectors and to a source, and configured to: for each of a series of candidate depths into the scene, map a pixel of the first detector to 3D coordinates having the candidate depth, reverse map the 3D coordinates to a candidate pixel of the second detector, and compute a depth based on an integrated response from the pixel of the first detector versus the candidate pixel of the second detector, the computed depth increasing with an increasing amount of light received at the pixel of the first detector during a first period and decreasing with an increasing amount of light received at the candidate pixel of the second detector during a second period; for each of the series of candidate depths, assess agreement between the candidate depth and the corresponding computed depth; and furnish an output associated with the pixel of the first detector, the output based on the computed depth of closest agreement to the corresponding candidate depth.
  • 14. The system of claim 13, wherein the source is pulse-modulated with a pulse-width of fifteen to twenty nanoseconds.
  • 15. The system of claim 13, wherein the source comprises one or more of an infrared or near-infrared light-emitting diode and a laser.
  • 16. The system of claim 13, wherein the first and second detectors are adjacent or separated by two to twenty centimeters.
  • 17. A depth-mapping method comprising: directing pulsed infrared or near-infrared illumination to a scene, the scene including a plurality of loci; exposing first and second detectors oriented along different optical axes to light dispersed from the scene, such light including the pulsed infrared or near-infrared illumination reflected back from the scene, with wavelength-responsiveness of the first and second detectors limited substantially to the modulated infrared or near-infrared illumination; forming a first image of the scene at a first detector and a second image of the scene at a second detector, the first and second images each comprising a rectangular array of pixels; for each of a series of candidate depths into the scene, mapping a pixel of the first detector to 3D coordinates having the candidate depth, reverse mapping the 3D coordinates to a candidate pixel of the second detector, and computing a depth based on an integrated response from the pixel of the first detector versus the candidate pixel of the second detector, the computed depth increasing with an increasing amount of light received at the pixel of the first detector during a first period and decreasing with an increasing amount of light received at the candidate pixel of the second detector during a second period; for each of the series of candidate depths, assessing agreement between the candidate depth and a corresponding computed depth; and furnishing an output associated with the pixel of the first detector, the output based on the computed depth of closest agreement to the corresponding candidate depth.
  • 18. The method of claim 17, wherein the first and second images are formed at least partly from light of the same illumination pulse.
  • 19. The method of claim 1 wherein the light to which the first and second detectors are exposed includes the modulated infrared or near-infrared illumination reflected back from the scene, and wherein wavelength-responsiveness of the first and second detectors is limited substantially to the modulated infrared or near-infrared illumination.
  • 20. The method of claim 1 wherein the candidate pixel of the first detector is mapped to the 3D coordinates using a geometric mapping function of the first detector, and wherein the 3D coordinates are reverse mapped to define a candidate pixel of the second detector using a geometric mapping function of the second detector.
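The claims above describe a per-pixel candidate-depth search: forward-map a first-detector pixel to 3D coordinates at each candidate depth, reverse-map those coordinates to a candidate pixel of the second detector, compute a depth from the two integrated responses, and keep the computed depth in closest agreement with its candidate. The Python sketch below is a non-authoritative illustration of that loop under assumed pinhole-camera geometry; the intrinsics (`f`, `cx`, `cy`), the `baseline` between detectors, and the normalized-ratio depth formula in `depth_from_responses` are hypothetical parameters and modeling choices, not values taken from this patent.

```python
import numpy as np

def map_to_3d(pixel, depth, f=500.0, cx=320.0, cy=240.0):
    """Back-project a first-detector pixel to 3D scene coordinates at a
    candidate depth, using a pinhole model with hypothetical intrinsics."""
    u, v = pixel
    x = (u - cx) * depth / f
    y = (v - cy) * depth / f
    return np.array([x, y, depth])

def reverse_map(xyz, baseline=0.1, f=500.0, cx=320.0, cy=240.0):
    """Project 3D coordinates into the second detector, assumed offset
    from the first by a hypothetical horizontal baseline (in meters)."""
    x, y, z = xyz
    u = f * (x - baseline) / z + cx
    v = f * y / z + cy
    return int(round(u)), int(round(v))

def depth_from_responses(q1, q2, z_min=1.0, z_max=10.0):
    """One plausible gated-TOF estimate: the output increases with the
    first-detector response q1, decreases with the second-detector
    response q2, and is substantially linear in a normalized ratio of
    the two (compare claims 1 and 7)."""
    return z_min + (z_max - z_min) * q1 / (q1 + q2)

def best_depth(pixel, img1, img2, candidates, z_min=1.0, z_max=10.0):
    """For each candidate depth: forward-map, reverse-map, compute a
    depth from the paired responses, and return the computed depth in
    closest agreement with its candidate (claims 1 and 17)."""
    best, best_err = None, np.inf
    q1 = img1[pixel[1], pixel[0]]
    for zc in candidates:
        u2, v2 = reverse_map(map_to_3d(pixel, zc))
        if not (0 <= v2 < img2.shape[0] and 0 <= u2 < img2.shape[1]):
            continue  # candidate projects outside the second image
        q2 = img2[v2, u2]
        z = depth_from_responses(q1, q2, z_min, z_max)
        err = abs(z - zc)
        if err < best_err:
            best, best_err = z, err
    return best
```

The refinement of claims 4 and 5 would repeat this forward/reverse mapping seeded with the returned depth until the output converges or an iteration limit is reached; claims 6 and 9 would then invalidate outputs that fail to converge or fall outside a predefined range.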
US Referenced Citations (188)
Number Name Date Kind
4627620 Yang Dec 1986 A
4630910 Ross et al. Dec 1986 A
4645458 Williams Feb 1987 A
4695953 Blair et al. Sep 1987 A
4702475 Elstein et al. Oct 1987 A
4711543 Blair et al. Dec 1987 A
4751642 Silva et al. Jun 1988 A
4796997 Svetkoff et al. Jan 1989 A
4809065 Harris et al. Feb 1989 A
4817950 Goo Apr 1989 A
4843568 Krueger et al. Jun 1989 A
4893183 Nayar Jan 1990 A
4901362 Terzian Feb 1990 A
4925189 Braeunig May 1990 A
5081530 Medina Jan 1992 A
5101444 Wilson et al. Mar 1992 A
5148154 MacKay et al. Sep 1992 A
5184295 Mann Feb 1993 A
5229754 Aoki et al. Jul 1993 A
5229756 Kosugi et al. Jul 1993 A
5239463 Blair et al. Aug 1993 A
5239464 Blair et al. Aug 1993 A
5288078 Capper et al. Feb 1994 A
5295491 Gevins Mar 1994 A
5320538 Baum Jun 1994 A
5347306 Nitta Sep 1994 A
5385519 Hsu et al. Jan 1995 A
5405152 Katanics et al. Apr 1995 A
5417210 Funda et al. May 1995 A
5423554 Davis Jun 1995 A
5454043 Freeman Sep 1995 A
5469740 French et al. Nov 1995 A
5495576 Ritchey Feb 1996 A
5516105 Eisenbrey et al. May 1996 A
5524637 Erickson et al. Jun 1996 A
5534917 MacDougall Jul 1996 A
5563988 Maes et al. Oct 1996 A
5577981 Jarvik Nov 1996 A
5580249 Jacobsen et al. Dec 1996 A
5594469 Freeman et al. Jan 1997 A
5597309 Riess Jan 1997 A
5616078 Oh Apr 1997 A
5617312 Iura et al. Apr 1997 A
5638300 Johnson Jun 1997 A
5641288 Zaenglein Jun 1997 A
5682196 Freeman Oct 1997 A
5682229 Wangler Oct 1997 A
5690582 Ulrich et al. Nov 1997 A
5703367 Hashimoto et al. Dec 1997 A
5704837 Iwasaki et al. Jan 1998 A
5715834 Bergamasco et al. Feb 1998 A
5875108 Hoffberg et al. Feb 1999 A
5877803 Wee et al. Mar 1999 A
5913727 Ahdoot Jun 1999 A
5933125 Fernie Aug 1999 A
5980256 Carmein Nov 1999 A
5989157 Walton Nov 1999 A
5995649 Marugame Nov 1999 A
6005548 Latypov et al. Dec 1999 A
6009210 Kang Dec 1999 A
6054991 Crane et al. Apr 2000 A
6066075 Poulton May 2000 A
6072494 Nguyen Jun 2000 A
6073489 French et al. Jun 2000 A
6077201 Cheng et al. Jun 2000 A
6098458 French et al. Aug 2000 A
6100896 Strohecker et al. Aug 2000 A
6101289 Kellner Aug 2000 A
6128003 Smith et al. Oct 2000 A
6130677 Kunz Oct 2000 A
6141463 Covell et al. Oct 2000 A
6147678 Kumar et al. Nov 2000 A
6152856 Studor et al. Nov 2000 A
6159100 Smith Dec 2000 A
6173066 Peurach et al. Jan 2001 B1
6181343 Lyons Jan 2001 B1
6188777 Darrell et al. Feb 2001 B1
6215890 Matsuo et al. Apr 2001 B1
6215898 Woodfill et al. Apr 2001 B1
6226396 Marugame May 2001 B1
6229913 Nayar et al. May 2001 B1
6256033 Nguyen Jul 2001 B1
6256400 Takata et al. Jul 2001 B1
6283860 Lyons et al. Sep 2001 B1
6289112 Jain et al. Sep 2001 B1
6299308 Voronka et al. Oct 2001 B1
6308565 French et al. Oct 2001 B1
6316934 Amorai-Moriya et al. Nov 2001 B1
6363160 Bradski et al. Mar 2002 B1
6384819 Hunter May 2002 B1
6411744 Edwards Jun 2002 B1
6430997 French et al. Aug 2002 B1
6476834 Doval et al. Nov 2002 B1
6496598 Harman Dec 2002 B1
6503195 Keller et al. Jan 2003 B1
6539931 Trajkovic et al. Apr 2003 B2
6570555 Prevost et al. May 2003 B1
6633294 Rosenthal et al. Oct 2003 B1
6640202 Dietz et al. Oct 2003 B1
6661918 Gordon et al. Dec 2003 B1
6681031 Cohen et al. Jan 2004 B2
6714665 Hanna et al. Mar 2004 B1
6731799 Sun et al. May 2004 B1
6738066 Nguyen May 2004 B1
6765726 French et al. Jul 2004 B2
6788809 Grzeszczuk et al. Sep 2004 B1
6801637 Voronka et al. Oct 2004 B2
6873723 Aucsmith et al. Mar 2005 B1
6876496 French et al. Apr 2005 B2
6937742 Roberts et al. Aug 2005 B2
6950534 Cohen et al. Sep 2005 B2
7003134 Covell et al. Feb 2006 B1
7036094 Cohen et al. Apr 2006 B1
7038855 French et al. May 2006 B2
7039676 Day et al. May 2006 B1
7042440 Pryor et al. May 2006 B2
7050606 Paul et al. May 2006 B2
7058204 Hildreth et al. Jun 2006 B2
7060957 Lange et al. Jun 2006 B2
7113918 Ahmad et al. Sep 2006 B1
7121946 Paul et al. Oct 2006 B2
7170492 Bell Jan 2007 B2
7184048 Hunter Feb 2007 B2
7202898 Braun et al. Apr 2007 B1
7222078 Abelow May 2007 B2
7227526 Hildreth et al. Jun 2007 B2
7259747 Bell Aug 2007 B2
7308112 Fujimura et al. Dec 2007 B2
7317836 Fujimura et al. Jan 2008 B2
7348963 Bell Mar 2008 B2
7359121 French et al. Apr 2008 B2
7367887 Watabe et al. May 2008 B2
7379563 Shamaie May 2008 B2
7379566 Hildreth May 2008 B2
7389591 Jaiswal et al. Jun 2008 B2
7412077 Li et al. Aug 2008 B2
7421093 Hildreth et al. Sep 2008 B2
7430312 Gu Sep 2008 B2
7436496 Kawahito Oct 2008 B2
7450736 Yang et al. Nov 2008 B2
7452275 Kuraishi Nov 2008 B2
7460690 Cohen et al. Dec 2008 B2
7489812 Fox et al. Feb 2009 B2
7536032 Bell May 2009 B2
7555142 Hildreth et al. Jun 2009 B2
7560701 Oggier et al. Jul 2009 B2
7570805 Gu Aug 2009 B2
7574020 Shamaie Aug 2009 B2
7576727 Bell Aug 2009 B2
7590262 Fujimura et al. Sep 2009 B2
7593552 Higaki et al. Sep 2009 B2
7598942 Underkoffler et al. Oct 2009 B2
7607509 Schmiz et al. Oct 2009 B2
7620202 Fujimura et al. Nov 2009 B2
7668340 Cohen et al. Feb 2010 B2
7680298 Roberts et al. Mar 2010 B2
7683954 Ichikawa et al. Mar 2010 B2
7684592 Paul et al. Mar 2010 B2
7701439 Hillis et al. Apr 2010 B2
7702130 Im et al. Apr 2010 B2
7704135 Harrison, Jr. Apr 2010 B2
7710391 Bell et al. May 2010 B2
7729530 Antonov et al. Jun 2010 B2
7746345 Hunter Jun 2010 B2
7760182 Ahmad et al. Jul 2010 B2
7809167 Bell Oct 2010 B2
7834846 Bell Nov 2010 B1
7852262 Namineni et al. Dec 2010 B2
RE42256 Edwards Mar 2011 E
7898522 Hildreth et al. Mar 2011 B2
8035612 Bell et al. Oct 2011 B2
8035614 Bell et al. Oct 2011 B2
8035624 Bell et al. Oct 2011 B2
8072470 Marks Dec 2011 B2
8090194 Gordon et al. Jan 2012 B2
20040005092 Tomasi Jan 2004 A1
20060114333 Gokturk et al. Jun 2006 A1
20060221250 Rossbach et al. Oct 2006 A1
20070064976 England Mar 2007 A1
20080019470 Nam et al. Jan 2008 A1
20080026838 Dunstan et al. Jan 2008 A1
20080084486 Enge et al. Apr 2008 A1
20080246759 Summers Oct 2008 A1
20080249419 Sekins et al. Oct 2008 A1
20090128833 Yahav May 2009 A1
20100171813 Pelman et al. Jul 2010 A1
20100197400 Geiss Aug 2010 A1
20110007939 Teng et al. Jan 2011 A1
Foreign Referenced Citations (9)
Number Date Country
1575524 Feb 2005 CN
201254344 Jun 2010 CN
101866056 Oct 2010 CN
0583061 Feb 1994 EP
08044490 Feb 1996 JP
9310708 Jun 1993 WO
9717598 May 1997 WO
9944698 Sep 1999 WO
04001332 Dec 2003 WO
Non-Patent Literature Citations (37)
Entry
Whyte et al., “Multiple range imaging camera operation with minimal performance impact”, Retrieved at << http://researchcommons.waikato.ac.nz/bitstream/10289/3826/1/Multiple%20range%20imaging%20camera.pdf >>, 2010, 10 pages.
Gokturk et al., “A Time-Of-Flight Depth Sensor—System Description, Issues and Solutions”, Retrieved at << http://www.canesta.com/assets/pdf/technicalpapers/CVPR—Submission—TOF.pdf >>, Jan. 24, 2005, 9 pages.
Kim et al., “Multi-view Image and ToF Sensor Fusion for Dense 3D Reconstruction”, Retrieved at << http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=05457430 >>, 2009, 8 pages.
Zhu et al., “Fusion of Time-of-Flight Depth and Stereo for High Accuracy Depth Maps”, Retrieved at << http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=04587761 >>, Aug. 5, 2008, 8 pages.
Blais et al., “Range Error Analysis of an Integrated Time-of-Flight, Triangulation, and Photogrammetric 3D Laser Scanning System”, Retrieved at << http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.28.2638&rep=rep1&type=pdf >>, Apr. 2000, 14 pages.
Stoppa et al., “A New Architecture for TOF-based Range-finding Sensor”, Retrieved at << http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1426205 >>, 2004, 4 pages.
Katz, Sagi et al., “Time-Of-Flight Depth Mapping,” U.S. Appl. No. 12/897,145, filed Oct. 4, 2010, 49 pages.
“Integrated Low Power Depth Camera and Projection Device,” U.S. Appl. No. 12/892,589, filed Sep. 28, 2010, 36 pages.
Lucas, Bruce D. et al., “An Iterative Image Registration Technique with an Application to Stereo Vision,” Proceedings of Imaging Understanding Workshop, pp. 121-130 (1981), 10 pages.
Kanade et al., “A Stereo Machine for Video-rate Dense Depth Mapping and Its New Applications”, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1996, pp. 196-202, The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA.
Miyagawa et al., “CCD-Based Range Finding Sensor”, Oct. 1997, pp. 1648-1652, vol. 44 No. 10, IEEE Transactions on Electron Devices.
Rosenhahn et al., “Automatic Human Model Generation”, 2005, pp. 41-48, University of Auckland (CITR), New Zealand.
Aggarwal et al., “Human Motion Analysis: A Review”, IEEE Nonrigid and Articulated Motion Workshop, 1997, University of Texas at Austin, Austin, TX.
Shao et al., “An Open System Architecture for a Multimedia and Multimodal User Interface”, Aug. 24, 1998, Japanese Society for Rehabilitation of Persons with Disabilities (JSRPD), Japan.
Kohler, “Special Topics of Gesture Recognition Applied in Intelligent Home Environments”, In Proceedings of the Gesture Workshop, 1998, pp. 285-296, Germany.
Kohler, “Vision Based Remote Control in Intelligent Home Environments”, University of Erlangen-Nuremberg/Germany, 1996, pp. 147-154, Germany.
Kohler, “Technical Details and Ergonomical Aspects of Gesture Recognition applied in Intelligent Home Environments”, 1997, Germany.
Hasegawa et al., “Human-Scale Haptic Interaction with a Reactive Virtual Human in a Real-Time Physics Simulator”, Jul. 2006, vol. 4, No. 3, Article 6C, ACM Computers in Entertainment, New York, NY.
Qian et al., “A Gesture-Driven Multimodal Interactive Dance System”, Jun. 2004, pp. 1579-1582, IEEE International Conference on Multimedia and Expo (ICME), Taipei, Taiwan.
Zhao, “Dressed Human Modeling, Detection, and Parts Localization”, 2001, The Robotics Institute, Carnegie Mellon University, Pittsburgh, PA.
He, “Generation of Human Body Models”, Apr. 2005, University of Auckland, New Zealand.
Isard et al., “Condensation—Conditional Density Propagation for Visual Tracking”, 1998, pp. 5-28, International Journal of Computer Vision 29(1), Netherlands.
Livingston, “Vision-based Tracking with Dynamic Structured Light for Video See-through Augmented Reality”, 1998, University of North Carolina at Chapel Hill, North Carolina, USA.
Wren et al., “Pfinder: Real-Time Tracking of the Human Body”, MIT Media Laboratory Perceptual Computing Section Technical Report No. 353, Jul. 1997, vol. 19, No. 7, pp. 780-785, IEEE Transactions on Pattern Analysis and Machine Intelligence, Cambridge, MA.
Breen et al., “Interactive Occlusion and Collision of Real and Virtual Objects in Augmented Reality”, Technical Report ECRC-95-02, 1995, European Computer-Industry Research Center GmbH, Munich, Germany.
Freeman et al., “Television Control by Hand Gestures”, Dec. 1994, Mitsubishi Electric Research Laboratories, TR94-24, Cambridge, MA.
Hongo et al., “Focus of Attention for Face and Hand Gesture Recognition Using Multiple Cameras”, Mar. 2000, pp. 156-161, 4th IEEE International Conference on Automatic Face and Gesture Recognition, Grenoble, France.
Pavlovic et al., “Visual Interpretation of Hand Gestures for Human-Computer Interaction: A Review”, Jul. 1997, pp. 677-695, vol. 19, No. 7, IEEE Transactions on Pattern Analysis and Machine Intelligence.
Azarbayejani et al., “Visually Controlled Graphics”, Jun. 1993, vol. 15, No. 6, IEEE Transactions on Pattern Analysis and Machine Intelligence.
Granieri et al., “Simulating Humans in VR”, The British Computer Society, Oct. 1994, Academic Press.
Brogan et al., “Dynamically Simulated Characters in Virtual Environments”, Sep./Oct. 1998, pp. 2-13, vol. 18, Issue 5, IEEE Computer Graphics and Applications.
Fisher et al., “Virtual Environment Display System”, ACM Workshop on Interactive 3D Graphics, Oct. 1986, Chapel Hill, NC.
“Virtual High Anxiety”, Tech Update, Aug. 1995, p. 22.
Sheridan et al., “Virtual Reality Check”, Technology Review, Oct. 1993, pp. 22-28, vol. 96, No. 7.
Stevens, “Flights into Virtual Reality Treating Real World Disorders”, The Washington Post, Mar. 27, 1995, Science Psychology, 2 pages.
“Simulation and Training”, 1994, Division Incorporated.
State Intellectual Property Office of China, Office Action of Chinese Patent Application No. 201110428489.9, Nov. 20, 2013, 12 pages.
Related Publications (1)
Number Date Country
20120154542 A1 Jun 2012 US