The technical field generally relates to imaging technology, and more particularly, to lens position determination in depth imaging.
Traditional imaging techniques involve the projection of three-dimensional (3D) scenes onto two-dimensional (2D) planes, resulting in a loss of information, including a loss of depth information. This loss of information is a result of the nature of square-law detectors, such as charge-coupled devices (CCD) and complementary metal-oxide-semiconductor (CMOS) sensor arrays, which can only directly measure the time-averaged intensity of incident light. A variety of imaging techniques, both active and passive, have been developed that can provide 3D image information, including depth information. Non-limiting examples of 3D imaging techniques include, to name a few, stereoscopic and multiscopic imaging, time of flight, structured light, plenoptic and light field imaging, diffraction-grating-based imaging, and depth from focus or defocus. While each of these imaging techniques has certain advantages, each also has some drawbacks and limitations. Challenges therefore remain in the field of 3D imaging.
The present description generally relates to techniques for lens position determination in depth imaging.
In accordance with an aspect, there is provided a method of lens position determination in an imaging system including an imaging lens, an image sensor including an array of pixels, and an optical encoder having an angular response and interposed between the imaging lens and the image sensor, the method including:
In some embodiments, capturing the image data includes capturing the image data as a first set of pixel responses corresponding to a first set of pixels of the array of pixels of the image sensor and a second set of pixel responses corresponding to a second set of pixels of the array of pixels of the image sensor, the first set of pixel responses and the second set of pixel responses varying differently from each other as a function of angle of incidence in accordance with the angular response of the optical encoder; and generating the uniform-field image includes generating the uniform-field image as a plurality of image points, the generating including: computing a plurality of summed pixel responses based on a sum operation between the first set of pixel responses and the second set of pixel responses; computing a plurality of differential pixel responses based on a difference operation between the first set of pixel responses and the second set of pixel responses; and determining an intensity value of each image point of the uniform-field image as a ratio of a respective one of the plurality of differential pixel responses to a respective one of the plurality of summed pixel responses, the plurality of intensity values of the plurality of image points defining the intensity profile of the uniform-field image. In some embodiments, the pixel responses of the first set have magnitudes that increase as the angle of incidence increases, and the pixel responses of the second set have magnitudes that decrease as the angle of incidence increases.
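The difference-over-sum computation described above may be sketched as follows. This is an illustrative sketch only: the function name, the variable names, and the small regularization constant are assumptions, and the two input arrays stand in for the first and second sets of pixel responses after they have been separated from the raw sensor readout.

```python
import numpy as np

def uniform_field_image(i_plus, i_minus, eps=1e-12):
    """Combine two sets of angle-sensitive pixel responses into a
    uniform-field image. i_plus: responses whose magnitudes increase
    with angle of incidence; i_minus: responses whose magnitudes
    decrease with it (hypothetical names). Returns, per image point,
    the ratio of the differential response to the summed response."""
    i_sum = i_plus + i_minus     # summed pixel responses
    i_diff = i_plus - i_minus    # differential pixel responses
    # eps guards against division by zero in dark image regions
    return i_diff / (i_sum + eps)
```

The resulting array of intensity values defines the intensity profile of the uniform-field image; normalizing by the summed responses makes the profile largely insensitive to the overall brightness of the scene.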
In some embodiments, the angle-dependent information encoded in the image data by the optical encoder includes a chief ray angle (CRA) function of the imaging lens over the array of pixels; a CRA shifting function of the optical encoder with respect to the array of pixels; and a range of angles of incidence within which the light incident from the scene reaches each pixel.
In some embodiments, determining the current lens position information about the imaging lens includes providing reference data relating an intensity profile of a reference uniform-field image to reference lens position information about the imaging lens; and determining the current lens position information from the intensity profile of the generated uniform-field image based on the reference data. In some embodiments, determining the current lens position information from the intensity profile of the uniform-field image based on the reference data includes: determining an intensity profile difference between the intensity profile of the generated uniform-field image and the intensity profile of the reference uniform-field image; determining lens position variation information from the intensity profile difference; and determining the current lens position information from the reference lens position information and the lens position variation information. In some embodiments, determining the lens position variation information from the intensity profile difference includes relating the intensity profile difference to a variation in a CRA function of the imaging lens; and determining the lens position variation information from the variation in the CRA function of the imaging lens using a model relating lens CRA function variations to changes in lens position. In some embodiments, the model relating lens CRA function variations to changes in lens position is established based on a nominal CRA function of the imaging lens defined at a nominal position of the imaging lens.
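The reference-based determination described above may be sketched as follows, under the simplifying assumption of a linear model relating intensity-profile variations to changes in lens position. The function name, the scalar `sensitivity` parameter (a precalibrated coefficient from the model relating CRA function variations to changes in lens position), and the use of a simple mean over the profile difference are all illustrative assumptions, not the actual model.

```python
import numpy as np

def lens_position_from_profile(profile, ref_profile, ref_position, sensitivity):
    """Estimate current lens position from the intensity profile of a
    generated uniform-field image, using reference data that relates a
    reference profile to a reference lens position. `sensitivity` is a
    hypothetical precalibrated scalar (profile change per unit of lens
    displacement) standing in for the CRA-variation model."""
    delta_profile = profile - ref_profile        # intensity profile difference
    # map the profile difference to a lens position variation
    delta_position = np.mean(delta_profile) / sensitivity
    # current position = reference position + position variation
    return ref_position + delta_position
```

In practice the mapping from profile difference to position variation would be multidimensional (axial, lateral, and tilt components), but the structure — difference, model inversion, then addition to the reference position — follows the steps listed above.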
In some embodiments, determining the current lens position information includes determining an axial position of the imaging lens along an optical axis of the imaging lens. In some embodiments, determining the current lens position information includes determining a first lateral position of the imaging lens along a first lateral direction perpendicular to the optical axis of the imaging lens; and determining a second lateral position of the imaging lens along a second lateral direction perpendicular to both the optical axis of the imaging lens and the first lateral direction. In some embodiments, determining the current lens position information includes determining a first tilt angle of the imaging lens relative to the first lateral direction; and determining a second tilt angle of the imaging lens relative to the second lateral direction.
In some embodiments, the scene is representative of a uniform field; the captured image data includes at least one image of the scene; and the uniform-field image is generated from the at least one image of the scene without performing a prior step of removing depth cues from the at least one image of the scene. In some embodiments, the at least one image of the scene is a single image of the scene.
In some embodiments, the scene is not representative of a uniform field; the captured image data includes one or more images of the scene; and the method includes removing depth cues from the one or more images of the scene, combining the one or more images of the scene with removed depth cues into a fused image of the scene, and generating the uniform-field image from the fused image of the scene. In some embodiments, a number of the one or more images of the scene ranges between 3 and 300.
In some embodiments, the optical encoder includes a transmissive diffraction mask (TDM), the TDM being configured to diffract the light incident from the scene having passed through the imaging lens to generate diffracted light, the diffracted light having the angle-dependent information encoded in its intensity distribution for detection by the image sensor as the captured image data. In some embodiments, the TDM includes a binary phase grating including a series of alternating ridges and grooves extending along a grating axis at a grating period. In some embodiments, the image sensor has a pixel pitch along the grating axis, the pixel pitch being half of the grating period.
In some embodiments, the TDM includes a first set of diffraction gratings having a first grating axis orientation and a second set of diffraction gratings having a second grating axis orientation, the first grating axis orientation being perpendicular to the second grating axis orientation; generating the uniform-field image includes generating a first portion of the uniform-field image from a first portion of the captured image data having angle-dependent information encoded therein by the first set of diffraction gratings; and generating a second portion of the uniform-field image from a second portion of the captured image data having angle-dependent information encoded therein by the second set of diffraction gratings; and determining the current lens position information includes: determining first lens position information from a first intensity profile of the first portion of the uniform-field image; determining second lens position information from a second intensity profile of the second portion of the uniform-field image; and determining the current lens position information from the first lens position information and the second lens position information.
In some embodiments, the optical encoder includes an array of microlenses, each microlens covering at least two pixels of the image sensor. In some embodiments, each microlens covers two pixels of the image sensor. In some embodiments, each microlens covers four pixels of the image sensor, the four pixels being arranged in a 2×2 cell.
In some embodiments, the image sensor includes a color filter array interposed between the optical encoder and the array of pixels.
In accordance with another aspect, there is provided a method of focus distance adjustment in an imaging system including an imaging lens, an image sensor including an array of pixels, and an optical encoder having an angular response and interposed between the imaging lens and the image sensor, the method including:
In some embodiments, determining the target lens-to-sensor distance corresponding to the target focus distance includes computing the target lens-to-sensor distance from the target focus distance and a focal length of the imaging lens according to zs,target = [(1/f) − (1/zf,target)]^−1, where zs,target is the target lens-to-sensor distance, f is the focal length of the imaging lens, and zf,target is the target focus distance.
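The relation above is the thin-lens equation solved for the image-side distance. A minimal sketch (function name and units are illustrative; distances are in millimeters):

```python
def target_lens_to_sensor_distance(f, z_f_target):
    """Compute z_s,target = [(1/f) - (1/z_f,target)]^-1, the
    lens-to-sensor distance that focuses the system at the target
    focus distance z_f,target, for a lens of focal length f."""
    return 1.0 / (1.0 / f - 1.0 / z_f_target)
```

For example, a lens with f = 4.0 mm focused at a target distance of 1000 mm gives a target lens-to-sensor distance of about 4.016 mm, slightly more than the focal length, as expected for a finite object distance.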
In accordance with another aspect, there is provided a non-transitory computer readable storage medium having stored thereon computer readable instructions that, when executed by a processor, cause the processor to perform a method of lens position determination in an imaging system including an imaging lens, an image sensor including an array of pixels, and an optical encoder having an angular response and interposed between the imaging lens and the image sensor, the method including:
In some embodiments, the captured image data includes a first set of pixel responses corresponding to a first set of pixels of the array of pixels of the image sensor and a second set of pixel responses corresponding to a second set of pixels of the array of pixels of the image sensor, the first set of pixel responses and the second set of pixel responses varying differently from each other as a function of angle of incidence in accordance with the angular response of the optical encoder; and generating the uniform-field image includes generating the uniform-field image as a plurality of image points, the generating including: computing a plurality of summed pixel responses based on a sum operation between the first set of pixel responses and the second set of pixel responses; computing a plurality of differential pixel responses based on a difference operation between the first set of pixel responses and the second set of pixel responses; and determining an intensity value of each image point of the uniform-field image as a ratio of a respective one of the plurality of differential pixel responses to a respective one of the plurality of summed pixel responses, the plurality of intensity values of the plurality of image points defining the intensity profile of the uniform-field image.
In some embodiments, determining the current lens position information about the imaging lens includes providing reference data relating an intensity profile of a reference uniform-field image to reference lens position information about the imaging lens; and determining the current lens position information from the intensity profile of the generated uniform-field image based on the reference data.
In some embodiments, determining the current lens position information from the intensity profile of the uniform-field image based on the reference data includes: determining an intensity profile difference between the intensity profile of the generated uniform-field image and the intensity profile of the reference uniform-field image; determining lens position variation information from the intensity profile difference; and determining the current lens position information from the reference lens position information and the lens position variation information.
In some embodiments, determining the lens position variation information from the intensity profile difference includes relating the intensity profile difference to a variation in a CRA function of the imaging lens; and determining the lens position variation information from the variation in the CRA function of the imaging lens using a model relating lens CRA function variations to changes in lens position.
In some embodiments, determining the current lens position information includes determining an axial position of the imaging lens along an optical axis of the imaging lens. In some embodiments, determining the current lens position information includes determining a first lateral position of the imaging lens along a first lateral direction perpendicular to the optical axis of the imaging lens; and determining a second lateral position of the imaging lens along a second lateral direction perpendicular to both the optical axis of the imaging lens and the first lateral direction. In some embodiments, determining the current lens position information includes determining a first tilt angle of the imaging lens relative to the first lateral direction; and determining a second tilt angle of the imaging lens relative to the second lateral direction.
In some embodiments, the scene is representative of a uniform field; the captured image data includes at least one image of the scene; and the uniform-field image is generated from the at least one image of the scene without performing a prior step of removing depth cues from the at least one image of the scene.
In some embodiments, the scene is not representative of a uniform field; the captured image data includes one or more images of the scene; and the method includes removing depth cues from the one or more images of the scene, combining the one or more images of the scene with removed depth cues into a fused image of the scene, and generating the uniform-field image from the fused image of the scene.
In some embodiments, the optical encoder includes a transmissive diffraction mask (TDM), the TDM being configured to diffract the light incident from the scene having passed through the imaging lens to generate diffracted light, the diffracted light having the angle-dependent information encoded therein for detection by the image sensor as the captured image data. In some embodiments, the TDM includes a binary phase grating including a series of alternating ridges and grooves extending along a grating axis at a grating period. In some embodiments, the image sensor has a pixel pitch along the grating axis, the pixel pitch being half of the grating period.
In some embodiments, the TDM includes a first set of diffraction gratings having a first grating axis orientation and a second set of diffraction gratings having a second grating axis orientation, the first grating axis orientation being perpendicular to the second grating axis orientation; generating the uniform-field image includes generating a first portion of the uniform-field image from a first portion of the captured image data having angle-dependent information encoded therein by the first set of diffraction gratings; and generating a second portion of the uniform-field image from a second portion of the captured image data having angle-dependent information encoded therein by the second set of diffraction gratings; and determining the current lens position information includes: determining first lens position information from a first intensity profile of the first portion of the uniform-field image; determining second lens position information from a second intensity profile of the second portion of the uniform-field image; and determining the current lens position information from the first lens position information and the second lens position information.
In some embodiments, the optical encoder includes an array of microlenses, each microlens covering at least two pixels of the image sensor. In some embodiments, each microlens covers two pixels of the image sensor. In some embodiments, each microlens covers four pixels of the image sensor, the four pixels being arranged in a 2×2 cell.
In accordance with another aspect, there is provided a computer device including a processor and a non-transitory computer readable storage medium such as described herein, the non-transitory computer readable storage medium being operatively coupled to the processor.
In accordance with another aspect, there is provided an imaging system having lens position determination capabilities, the imaging system including:
In some embodiments, the image sensor is configured to capture the image data as a first set of pixel responses corresponding to a first set of pixels of the array of pixels and a second set of pixel responses corresponding to a second set of pixels of the array of pixels of the image sensor, the first set of pixel responses and the second set of pixel responses varying differently from each other as a function of angle of incidence in accordance with the angular response of the optical encoder; and generating the uniform-field image includes generating the uniform-field image as a plurality of image points, the generating including: computing a plurality of summed pixel responses based on a sum operation between the first set of pixel responses and the second set of pixel responses; computing a plurality of differential pixel responses based on a difference operation between the first set of pixel responses and the second set of pixel responses; and determining an intensity value of each image point of the uniform-field image as a ratio of a respective one of the plurality of differential pixel responses to a respective one of the plurality of summed pixel responses, the plurality of intensity values of the plurality of image points defining the intensity profile of the uniform-field image.
In some embodiments, determining the current lens position information about the imaging lens includes providing reference data relating an intensity profile of a reference uniform-field image to reference lens position information about the imaging lens; and determining the current lens position information from the intensity profile of the generated uniform-field image based on the reference data. In some embodiments, determining the current lens position information from the intensity profile of the uniform-field image based on the reference data includes: determining an intensity profile difference between the intensity profile of the generated uniform-field image and the intensity profile of the reference uniform-field image; determining lens position variation information from the intensity profile difference; and determining the current lens position information from the reference lens position information and the lens position variation information. In some embodiments, determining the lens position variation information from the intensity profile difference includes relating the intensity profile difference to a variation in a CRA function of the imaging lens; and determining the lens position variation information from the variation in the CRA function of the imaging lens using a model relating lens CRA function variations to changes in lens position.
In some embodiments, determining the current lens position information includes determining an axial position of the imaging lens along an optical axis of the imaging lens, a first lateral position of the imaging lens along a first lateral direction perpendicular to the optical axis of the imaging lens, a second lateral position of the imaging lens along a second lateral direction perpendicular to both the optical axis of the imaging lens and the first lateral direction, a first tilt angle of the imaging lens relative to the first lateral direction, a second tilt angle of the imaging lens relative to the second lateral direction, or any combination thereof.
In some embodiments, the optical encoder includes a transmissive diffraction mask (TDM), the TDM being configured to diffract the light incident from the scene having passed through the imaging lens to generate diffracted light, the diffracted light having the angle-dependent information encoded therein for detection by the image sensor as the captured image data. In some embodiments, the TDM includes a binary phase grating including a series of alternating ridges and grooves extending along a grating axis at a grating period. In some embodiments, the image sensor has a pixel pitch along the grating axis, the pixel pitch being half of the grating period.
In some embodiments, the TDM includes a first set of diffraction gratings having a first grating axis orientation and a second set of diffraction gratings having a second grating axis orientation, the first grating axis orientation being perpendicular to the second grating axis orientation; generating the uniform-field image includes generating a first portion of the uniform-field image from a first portion of the captured image data having angle-dependent information encoded therein by the first set of diffraction gratings; and generating a second portion of the uniform-field image from a second portion of the captured image data having angle-dependent information encoded therein by the second set of diffraction gratings; and determining the current lens position information includes: determining first lens position information from a first intensity profile of the first portion of the uniform-field image; determining second lens position information from a second intensity profile of the second portion of the uniform-field image; and determining the current lens position information from the first lens position information and the second lens position information. In some embodiments, the diffraction gratings of the first set and the diffraction gratings of the second set each include a binary phase grating including a series of alternating ridges and grooves extending along a grating axis at a grating period.
In some embodiments, the optical encoder includes an array of microlenses, each microlens covering at least two pixels of the image sensor. In some embodiments, each microlens covers two pixels of the image sensor. In some embodiments, each microlens covers four pixels of the image sensor, the four pixels being arranged in a 2×2 cell.
In some embodiments, the image sensor includes a color filter array interposed between the optical encoder and the array of pixels.
In some embodiments, the computer device is operatively coupled to the imaging lens, and wherein the operations performed by the processor further include: providing a target focus distance at which to set the imaging system; determining a target lens-to-sensor distance between the imaging lens and the image sensor corresponding to the target focus distance; and performing a lens position adjustment operation including one or more iterative cycles, each iterative cycle including: controlling the imaging lens to move with respect to the image sensor based on the target lens-to-sensor distance; controlling the image sensor to capture image data from the scene; performing the operations of receiving the captured image data, generating a uniform-field image from the captured image data, and determining current lens position information, the current lens position information including a current lens-to-sensor distance between the imaging lens and the image sensor; determining whether there is a match between the current lens-to-sensor distance and the target lens-to-sensor distance; if there is a match between the current lens-to-sensor distance and the target lens-to-sensor distance, terminating the lens position adjustment operation and determining that the imaging system has been set at the target focus distance; and if there is not a match between the current lens-to-sensor distance and the target lens-to-sensor distance, performing another iterative cycle.
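The iterative cycle described above may be sketched as a simple closed loop. The `lens`, `sensor`, and `estimate_lens_position` interfaces below are hypothetical stand-ins for the actuator, the image sensor, and the lens position determination method described herein; the tolerance and cycle limit are illustrative values.

```python
def adjust_focus(lens, sensor, estimate_lens_position, z_s_target,
                 tolerance=0.5e-3, max_cycles=10):
    """Iteratively drive the imaging lens toward the target
    lens-to-sensor distance z_s_target. Returns True if a match is
    found within `tolerance` (i.e., the system is set at the target
    focus distance), False if max_cycles elapse without a match."""
    for _ in range(max_cycles):
        lens.move_to(z_s_target)                  # command the lens actuator
        frame = sensor.capture()                  # image data from the scene
        z_s_current = estimate_lens_position(frame)
        if abs(z_s_current - z_s_target) <= tolerance:
            return True   # match: terminate the adjustment operation
        # no match (e.g., actuator error or drift): run another cycle
    return False
```

The loop embodies the match test above: each cycle moves the lens, captures image data, estimates the current lens-to-sensor distance from the uniform-field image, and terminates only when the current and target distances match within tolerance.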
In accordance with another aspect, there is provided a method of lens position determination in a depth imaging system including an imaging lens, an image sensor, and a transmissive diffractive mask (TDM) interposed between the imaging lens and the image sensor, the method including:
In accordance with another aspect, there is provided a non-transitory computer readable storage medium having stored thereon computer readable instructions that, when executed by a processor, cause the processor to perform a method of lens position determination in a depth imaging system including an imaging lens, an image sensor, and a transmissive diffractive mask (TDM) interposed between the imaging lens and the image sensor, the method including:
In accordance with another aspect, there is provided a depth imaging system including: an imaging lens;
In some embodiments, the lens position information may be used to obtain absolute depth information from relative depth information measured by the depth imaging system.
In some embodiments, the present techniques may be used with an optical encoder different from a TDM to encode angle-dependent information in the light transmitted through the imaging lens prior to detection by the photosensitive pixels. For example, in some embodiments, the optical encoder may be a microlens array interposed between the imaging lens and the image sensor, wherein each microlens of the microlens array may cover two or more photosensitive pixels of the image sensor, and wherein the photosensitive pixels may be configured to operate as phase detection pixels.
Other method and process steps may be performed prior to, during, or after the steps described herein. The order of one or more steps may also differ, and some of the steps may be omitted, repeated, and/or combined, as the case may be. It is also to be noted that some steps may be performed using various analysis and processing techniques, which may be implemented in hardware, software, firmware, or any combination thereof.
Other objects, features, and advantages of the present description will become more apparent upon reading of the following non-restrictive description of specific embodiments thereof, given by way of example only with reference to the appended drawings. Although specific features described in the above summary and in the detailed description below may be described with respect to specific embodiments or aspects, it should be noted that these specific features may be combined with one another unless stated otherwise.
In the present description, similar features in the drawings have been given similar reference numerals. To avoid cluttering certain figures, some elements may not be indicated if they were already identified in a preceding figure. It should also be understood that the elements of the drawings are not necessarily depicted to scale, since emphasis is placed on clearly illustrating the elements and structures of the present embodiments. Furthermore, positional descriptors indicating the location and/or orientation of one element with respect to another element are used herein for ease and clarity of description. Unless otherwise indicated, these positional descriptors should be taken in the context of the figures and should not be considered limiting. Such spatially relative terms are intended to encompass different orientations in the use or operation of the present embodiments, in addition to the orientations exemplified in the figures. Furthermore, when a first element is referred to as being “on”, “above”, “below”, “over”, or “under” a second element, the first element can be either directly or indirectly on, above, below, over, or under the second element, respectively, such that one or multiple intervening elements may be disposed between the first element and the second element.
The terms “a”, “an”, and “one” are defined herein to mean “at least one”, that is, these terms do not exclude a plural number of elements, unless stated otherwise.
The term “or” is defined herein to mean “and/or”, unless stated otherwise.
Terms such as “substantially”, “generally”, and “about”, which modify a value, condition, or characteristic of a feature of an exemplary embodiment, should be understood to mean that the value, condition, or characteristic is defined within tolerances that are acceptable for the proper operation of this exemplary embodiment for its intended application or that fall within an acceptable range of experimental error. In particular, the term “about” generally refers to a range of numbers that one skilled in the art would consider equivalent to the stated value (e.g., having the same or an equivalent function or result). In some instances, the term “about” means a variation of ±10% of the stated value. It is noted that all numeric values used herein are assumed to be modified by the term “about”, unless stated otherwise. The term “between” as used herein to refer to a range of numbers or values defined by endpoints is intended to include both endpoints, unless stated otherwise.
The term “based on” as used herein is intended to mean “based at least in part on”, whether directly or indirectly, and to encompass both “based solely on” and “based partly on”. In particular, the term “based on” may also be understood as meaning “depending on”, “representative of”, “indicative of”, “associated with”, “relating to”, and the like.
The terms “match”, “matching”, and “matched” refer herein to a condition in which two elements are either the same or within some predetermined tolerance of each other. That is, these terms are meant to encompass not only “exactly” or “identically” matching the two elements, but also “substantially”, “approximately”, or “subjectively” matching the two elements, as well as providing a higher or best match among a plurality of matching possibilities.
The terms “connected” and “coupled”, and derivatives and variants thereof, refer herein to any connection or coupling, either direct or indirect, between two or more elements, unless stated otherwise. For example, the connection or coupling between elements may be mechanical, optical, electrical, magnetic, thermal, chemical, logical, fluidic, operational, or any combination thereof.
The term “concurrently” refers herein to two or more processes that occur during coincident or overlapping time periods. The term “concurrently” does not necessarily imply complete synchronicity and encompasses various scenarios including time-coincident or simultaneous occurrence of two processes; occurrence of a first process that both begins and ends during the duration of a second process; and occurrence of a first process that begins during the duration of a second process, but ends after the completion of the second process.
The terms “light” and “optical”, and variants and derivatives thereof, refer herein to radiation in any appropriate region of the electromagnetic spectrum. These terms are not limited to visible light, but may also include invisible regions of the electromagnetic spectrum including, without limitation, the terahertz (THz), infrared (IR), and ultraviolet (UV) regions. In some embodiments, the present techniques may be used with electromagnetic radiation having a center wavelength ranging from 175 nanometers (nm) in the deep ultraviolet to about 300 micrometers (μm) in the terahertz range, for example, from about 400 nm at the blue end of the visible spectrum to about 1550 nm at telecommunication wavelengths, or between about 400 nm and about 650 nm to match the spectral range of typical red-green-blue (RGB) color filters. However, these wavelength ranges are provided for illustrative purposes only, and the present techniques may operate beyond these ranges.
The present description generally relates to techniques for lens position determination in depth imaging.
The present techniques may be used in various applications. Non-limiting examples of possible fields of application include, to name a few, consumer electronics (e.g., mobile phones, tablets, laptops, webcams, and notebooks, gaming, virtual and augmented reality, photography), automotive applications (e.g., advanced driver assistance systems, in-cabin monitoring), industrial applications (e.g., inspection, robot guidance, object identification and tracking), and security and surveillance (e.g., motion tracking; traffic monitoring; drones; agricultural inspection with aerial and ground-based drones).
Various aspects and implementations of the present techniques are described below with reference to the figures.
Referring to
The depth imaging system 100 illustrated in
The provision of an angle-sensitive optical encoder such as a TDM 108 between the imaging lens 106 and the image sensor 112 can impart the depth imaging system 100 with 3D imaging capabilities, including depth sensing capabilities. This is because the TDM 108 is configured to diffract the light 102 received thereon into diffracted light 110, whose intensity pattern is spatially modulated in accordance with the angle-of-incidence distribution of the received light 102. It is appreciated that the angle-of-incidence distribution of the received light 102 is affected by the passage of the received light 102 through the imaging lens 106. The underlying image sensor 112 is configured to sample, on a per-pixel basis, the intensity pattern of the diffracted light 110 in the near-field to provide image data conveying information indicative of the angle of incidence of the received light 102. The image data may be used or processed in a variety of ways to provide multiple functions including, but not limited to, 3D depth map extraction, 3D surface reconstruction, image refocusing, and the like. Depending on the application, the image data may be acquired as one or more still images or as a video stream.
The structure, configuration, and operation of imaging devices using transmissive diffraction grating structures in front of 2D image sensors to provide 3D imaging capabilities are described in co-assigned international patent applications PCT/CA2017/050686 (published as WO 2017/210781), PCT/CA2018/051554 (published as WO 2019/109182), and PCT/CA2020/050760 (published as WO 2020/243828), as well as in the following master's thesis: Kunnath, Neeth, Depth from Defocus Using Angle Sensitive Pixels Based on a Transmissive Diffraction Mask (Master's thesis, McGill University Libraries, 2018). The contents of these four documents are incorporated herein by reference in their entirety. It is appreciated that the theory and applications of such diffraction-based 3D imaging devices are generally known in the art, and need not be described in detail herein other than to facilitate an understanding of the present techniques.
In the embodiment illustrated in
The term “diffraction grating”, or simply “grating”, refers herein to a structure or material having a spatially modulated optical property and which is configured to spatially modulate the amplitude and/or the phase of an optical wavefront incident thereon. The spatially modulated optical property, for example, a refractive index modulation pattern, defines the grating profile. In some embodiments, a diffraction grating may include a periodic arrangement of diffracting elements, such as alternating ridges and grooves, whose spatial period, the grating period, is substantially equal to or longer than the center wavelength of the optical wavefront incident thereon. Diffraction gratings may also be classified as “amplitude gratings” or “phase gratings”, depending on the nature of the diffracting elements. In amplitude gratings, the perturbations to the incident wavefront caused by the grating are the result of a direct amplitude modulation, while in phase gratings, these perturbations are the result of a modulation of the relative group velocity of light caused by a spatial variation of the refractive index of the grating structure or material. In several embodiments disclosed herein, the diffraction gratings are phase gratings, which generally absorb less light than amplitude gratings, although amplitude gratings may be used in other embodiments. In general, a diffraction grating is spectrally dispersive, if only slightly, so that different wavelengths of an incident optical wavefront may be diffracted differently. However, diffraction gratings exhibiting a substantially achromatic response over a certain operating spectral range exist and can be used in some embodiments.
The diffraction grating 116 in
The imaging lens 106 is disposed between the scene 104 and the TDM 108. The imaging lens 106 is configured to receive the light 102 from the scene 104 and focus or otherwise direct the received light 102 onto the TDM 108. The imaging lens 106 can define an optical axis 128 of the imaging system 100. Depending on the application, the imaging lens 106 may include a single lens element or a plurality of lens elements. In some embodiments, the imaging lens 106 may be a focus-tunable lens assembly. In such a case, the imaging lens 106 may be operated to provide autofocus, zoom, and/or other optical functions.
The image sensor 112 includes an array of photosensitive pixels 130. The pixels 130 are configured to detect electromagnetic radiation incident thereon and convert the detected radiation into electrical signals that can be processed to generate image data conveying information about the scene 104. In the illustrated embodiment, each pixel 130 is configured to detect a corresponding portion of the diffracted light 110 produced by the TDM 108 and generate therefrom a respective pixel response. The pixels 130 may each include a light-sensitive region and associated pixel circuitry for processing signals at the pixel level and communicating with other electronics, such as a readout unit. In general, each pixel 130 may be individually addressed and read out. In the illustrated embodiment, the pixels 130 are arranged in an array of rows and columns defined by two orthogonal pixel axes, although other arrangements may be used in other embodiments. In some embodiments, the image sensor 112 may include hundreds of thousands or millions of pixels 130, for example, from about 1080×1920 to about 6000×8000 pixels. However, many other sensor configurations with different pixel arrangements, aspect ratios, and fewer or more pixels are contemplated. Depending on the application, the pixels 130 of the image sensor 112 may or may not be all identical. In some embodiments, the image sensor 112 is a CMOS or a CCD array imager, although other types of photodetector arrays (e.g., charge injection devices or photodiode arrays) may also be used. The image sensor 112 may operate according to a rolling or a global shutter readout scheme, and may be part of a stacked, backside, or frontside illumination sensor architecture. Furthermore, the image sensor 112 may be implemented using various image sensor architectures and pixel array configurations, and may include various additional components. 
Non-limiting examples of such additional components include, to name a few, microlenses, color filters, color filter isolation structures, light guides, pixel circuitry, and the like. The structure, configuration, and operation of such possible additional components are generally known in the art and need not be described in detail herein.
In some embodiments, the imaging system 100 may be implemented by adding or coupling the TDM 108 on top of an already existing image sensor 112. For example, the existing image sensor 112 may be a conventional CMOS or CCD imager. In other embodiments, the imaging system 100 may be implemented and integrally packaged as a separate, dedicated, and/or custom-designed device incorporating therein all or most of its hardware components, including the imaging lens 106, the TDM 108, and the image sensor 112. In the embodiment depicted in
The array of pixels 130 may be characterized by a pixel pitch 132. The term “pixel pitch” refers herein to the center-to-center distance between nearest-neighbor pixels. In some embodiments, the pixel pitch 132 may range between about 0.7 μm and about 10 μm, although other pixel pitch values may be used in other embodiments. The pixel pitch 132 is defined along the grating axis 118. Depending on the application, the pixel pitch 132 may be less than, equal to, or greater than the grating period 120. For example, in the illustrated embodiment, the grating period 120 is twice as large as the pixel pitch 132. However, other grating-period-to-pixel-pitch ratios, R, may be used in other embodiments. Non-limiting examples of possible ratio values include, to name a few, R≥2; R=(n+1), where n is a positive integer; R=2n, where n is a positive integer; R=1; R=2/(2n+1), where n is a positive integer, for example, n=1 or 2; and R=n/m, where n and m are positive integers larger than two and m>n, for example, n=3 and m=4.
In the embodiment illustrated in
Referring still to
The processor 134 can implement operating systems, and may be able to execute computer programs, also known as commands, instructions, functions, processes, software codes, executables, applications, and the like. While the processor 134 is depicted in
The memory 136, which may also be referred to as a “computer readable storage medium”, is configured to store computer programs and other data to be retrieved by the processor 134. The terms “computer readable storage medium” and “computer readable memory” refer herein to a non-transitory and tangible computer product that can store and communicate executable instructions for the implementation of various steps of the techniques disclosed herein. The memory 136 may be any computer data storage device or assembly of such devices, including a random-access memory (RAM); a dynamic RAM; a read-only memory (ROM); a magnetic storage device, such as a hard disk drive, a solid state drive, a floppy disk, and a magnetic tape; an optical storage device, such as a compact disc (CD or CDROM), a digital video disc (DVD), and a Blu-Ray™ disc; a flash drive memory; and/or any other non-transitory memory technologies. The memory 136 may be associated with, coupled to, or included in the processor 134, and the processor 134 may be configured to execute instructions contained in a computer program stored in the memory 136 and relating to various functions and operations associated with the processor 134.
Referring to
In operation of the imaging system 100, the diffraction grating 116 receives the light 102 from the scene 104 on its input side, and diffracts the received light 102 to generate, on its output side, diffracted light 110 that travels toward the image sensor 112 for detection by the pixels 130₁-130₆. The diffracted light 110 has an intensity pattern that is spatially modulated based, inter alia, on the geometrical and optical properties of the diffraction grating 116, the angle of incidence θ of the received light 102, and the position of the observation plane (e.g., the image sensor 112, or an intermediate optical component, such as a microlens array, configured to relay the diffracted light 110 onto the pixels 130₁-130₆). In the example illustrated in
The Talbot effect is a near-field diffraction effect in which plane waves incident on a periodic structure, such as a diffraction grating, produce self-images of the periodic structure at regular distances behind the periodic structure. The self-images can be referred to as Talbot images. The main distance at which self-images of the periodic structure are observed due to interference is called the Talbot length zT. In the case of a diffraction grating having a grating period g, the Talbot length zT may be expressed as follows: zT = λ/[1 − (1 − λ²/g²)^(1/2)], where λ is the wavelength of the light incident on the grating. This expression simplifies to zT = 2g²/λ when g is sufficiently large compared to λ. Other self-images are observed at integer multiples of the half Talbot length, that is, at nzT/2. These additional self-images are either in phase (if n is even) or out of phase by half of the grating period (if n is odd) with respect to the self-image observed at zT. Further sub-images with smaller periods can also be observed at smaller fractional values of the Talbot length. These self-images are observed in the case of amplitude gratings.
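As a numerical illustration of these expressions, the exact and simplified Talbot lengths can be compared directly; the grating period and wavelength values below are arbitrary choices for illustration, not taken from the present description:

```python
import math

def talbot_length(g, wavelength):
    """Exact Talbot length z_T = λ / [1 - (1 - λ²/g²)^(1/2)] for grating period g."""
    return wavelength / (1.0 - math.sqrt(1.0 - (wavelength / g) ** 2))

def talbot_length_paraxial(g, wavelength):
    """Simplified form z_T ≈ 2g²/λ, valid when g is large compared to λ."""
    return 2.0 * g ** 2 / wavelength

# Arbitrary example: a 2 um grating period illuminated at 532 nm.
g, lam = 2.0e-6, 532e-9
zt_exact = talbot_length(g, lam)
zt_approx = talbot_length_paraxial(g, lam)
# The two forms agree to within about 2% here, and converge as g/λ grows.
```
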
In the case of phase gratings, such as the one depicted in
In the example illustrated in
Another property of Lohmann self-images is that they shift laterally along the grating axis 118 upon varying the angle of incidence θ of the received light 102, while substantially retaining their period and shape. This can be seen from a comparison between the intensity pattern of the diffracted light 110 illustrated in
Referring to
It is appreciated that since the intensities I+ and I− vary in a complementary way as a function of θ, their sum Isum remains, in principle, independent of θ. In practice, Isum can be controlled to remain largely independent of θ, or at least symmetrical with respect to θ (i.e., so that Isum(θ) = Isum(−θ)). The summed pixel response, Isum, is similar to the signal that would be obtained by the pixels 130₁-130₆ in the absence of the diffraction grating 116, and thus can provide 2D intensity image information, with no or little angle-dependent information encoded therein. The differential pixel response, Idiff, varies asymmetrically as a function of θ and represents a measurement of the angle-of-incidence information encoded into the diffracted light 110 by the diffraction grating 116. The pixel responses I+, I−, Isum, and Idiff may be expressed mathematically as follows:
where I0 is the intensity of the incident light, m is a modulation depth parameter, and β is an angular sensitivity parameter. For example, in
Equations (2) and (3) imply that each summed pixel response Isum is obtained by summing one odd pixel response I+ and one even pixel response I−, and that each differential pixel response Idiff is obtained by subtracting one even pixel response I− from one odd pixel response I+. Such an approach may be viewed as a 2×1 binning mode. However, other approaches can be used to determine summed and differential pixel responses Isum and Idiff, for example, a 2×2 binning mode (e.g., Isum = I1+ + I1− + I2+ + I2− and Idiff = I1+ − I1− + I2+ − I2−, where I1± is a first pair of odd and even pixel responses and I2± is an adjacent second pair of odd and even pixel responses), or a convolution mode (e.g., using a kernel such that Isum and Idiff have the same pixel resolution as I+ and I−). In this regard, the term “differential” is used herein to denote not only a simple subtraction between two pixel responses, but also a more complex differential operation from which a difference between two or more pixel responses is obtained. Furthermore, although the example of
The summed and differential pixel responses, Isum and Idiff, may be processed to provide depth information about the scene 104. In some embodiments, the summed and differential pixel responses Isum and Idiff from all the odd-even pixel pairs or groups may be used to provide a TDM disparity map. The TDM disparity map is made of a set of TDM disparities, dTDM, one for each odd-even pixel pair or group (or TDM pixel pair or group). The TDM disparity map is representative of the difference between the viewpoint of the scene 104 provided by the odd pixels 130₁, 130₃, 130₅ and the viewpoint of the scene 104 provided by the even pixels 130₂, 130₄, 130₆. Stated otherwise, the odd pixel responses I+ and the even pixel responses I− can provide two slightly different views of the scene 104, separated by an effective TDM baseline distance corresponding to the distance between the odd and even pixels of each pair. The TDM disparity map can be processed to generate a depth map of the scene 104.
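The 2×1 binning of complementary odd/even responses described above can be sketched as follows. The sinusoidal response model I±(θ) = (I0/2)[1 ± m·sin(βθ)] used below is an illustrative assumption consistent with the parameters I0, m, and β introduced with the equations, not the actual form of Equations (1) to (3):

```python
import math

def pixel_responses(i0, m, beta, theta):
    """Assumed complementary odd/even responses: I± = (I0/2)(1 ± m·sin(βθ))."""
    i_plus = 0.5 * i0 * (1.0 + m * math.sin(beta * theta))
    i_minus = 0.5 * i0 * (1.0 - m * math.sin(beta * theta))
    return i_plus, i_minus

def bin_2x1(i_plus, i_minus):
    """2x1 binning: one summed and one differential response per odd-even pair."""
    return i_plus + i_minus, i_plus - i_minus

i0, m, beta = 1.0, 0.5, 0.1
i_sum_pos, i_diff_pos = bin_2x1(*pixel_responses(i0, m, beta, math.radians(10.0)))
i_sum_neg, i_diff_neg = bin_2x1(*pixel_responses(i0, m, beta, math.radians(-10.0)))
# I_sum equals I0 at both angles; I_diff changes sign with the angle of incidence.
```

This reproduces the behavior noted above: the summed response carries the intensity information while the differential response carries the angle-of-incidence information.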
Returning to
The TDM disparity dTDM conveys relative depth information about the scene 104 but generally does not directly provide absolute depth information. Referring to
In some embodiments, the absolute depth, zd, of an object 138 in a scene 104 can be related to the TDM disparity dTDM as follows:
where STDM is a depth sensitivity parameter associated with the TDM 108, and zf is the focus distance of the imaging system 100. It is appreciated that Equation (4) relates relative depth information contained in dTDM to absolute depth information contained in zd. The depth sensitivity parameter STDM can depend on various factors including, but not limited to, different parameters of the imaging lens 106 (e.g., focal length, f-number, optical aberrations), the shape and amplitude of the angular response of the TDM 108, the size of the pixels 130, and the wavelength and polarization of the incoming light 102. The depth sensitivity parameter STDM may be determined by calibration. The focus distance zf is the distance along the optical axis 128 computed from the center of the imaging lens 106 to the focus plane, which is the object plane that is imaged in-focus at the sensor plane of the image sensor 112. The sensor plane is at a distance zs from the center of the imaging lens 106. The focus distance zf and the lens-to-sensor distance zs may be related by the thin-lens equation as follows:
where f is the focal length of the imaging lens 106. In some embodiments, the focal length f may range from about 1 mm to about 50 mm, the lens-to-sensor distance zs may range from about 1 mm to about 50 mm, and the focus distance zf may range from about 1 cm to infinity. In some embodiments, the lens-to-sensor distance zs may be slightly longer than the focal length f, and the focus distance zf may be significantly longer than both the focal length f and the lens-to-sensor distance zs.
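For illustration, the thin-lens relation of Equation (5), 1/f = 1/zf + 1/zs, can be rearranged to recover the focus distance from a known focal length and lens-to-sensor distance; the numeric values below are illustrative only:

```python
def focus_distance(f, z_s):
    """Solve the thin-lens equation 1/f = 1/z_f + 1/z_s for the focus distance z_f."""
    if z_s <= f:
        raise ValueError("z_s must exceed f for a real, positive focus distance")
    return 1.0 / (1.0 / f - 1.0 / z_s)

# A 4 mm focal length with the sensor 0.02 mm beyond the focal plane
# (illustrative values) gives a focus distance of about 0.8 m.
f, z_s = 4.0e-3, 4.02e-3
z_f = focus_distance(f, z_s)
```

Consistent with the text, zs here is only slightly longer than f, while zf is several orders of magnitude longer than both.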
It is appreciated that an accurate determination of the object distance zd from the TDM disparity dTDM assumes that the focus distance zf, and thus the lens-to-sensor distance zs, is or can be known in a precise manner. If this is not the case, the accuracy of depth determination may be degraded or otherwise adversely impacted, the impact being generally more pronounced at larger values of zd. This means that a calibration curve of dTDM versus 1/zd can provide reliable depth determination only if the position of the imaging lens 106 has remained unchanged since calibration or, if the position of the imaging lens 106 has changed since calibration, that its current value can be known accurately to update the calibration curve. Such changes in the position of the imaging lens 106 may be intentional (e.g., changes during autofocus operations) or unintentional (e.g., changes due to wear or other factors causing drifts over time in lens position). It has been found that both intentional and unintentional lens position changes in TDM-based imaging systems may be difficult to ascertain in a reliable way, which may adversely affect the accuracy of absolute depth estimates obtained from relative depth information (e.g., the TDM disparity dTDM) derived from optical measurements (e.g., the pixel responses I±). The capability of knowing or adjusting the lens-to-sensor distance zs in a precise manner is also relevant in applications that involve adjusting the focus distance zf of the imaging system 100 to a target or desired value. This is because adjusting the focus distance zf generally involves moving the imaging lens 106 with respect to the image sensor 112 to a lens-to-sensor distance zs that corresponds to the target value of the focus distance zf, for example, using Equation (5).
It is appreciated that Equations (1) to (3) introduced above are defined for one particular angle of incidence. However, in practice, each pixel receives light from a cone of angles of incidence. Referring to
where θCRA-lens is the CRA function of the imaging lens 106, θCRA-shift is a CRA shifting or correction function, if any, applied by laterally shifting the TDM 108 relative to the pixels 130 along the grating axis 118 as compared to their relative alignment at pixel positions where θCRA-lens = 0, and ΔCRA = θCRA-lens − θCRA-shift is the CRA mismatch. In Equation (6), the variable θ and, thus, the integral bounds −θmin and θmax are defined with respect to θCRA-lens, where θmax and θmin are positive numbers. The terms θCRA-lens, θCRA-shift, θmin, and θmax all depend on pixel position (Hx, Hy), but this dependence is made implicit in Equation (6) to simplify the notation. Typically, θCRA-lens is zero at the center of the array and increases, linearly or nonlinearly, toward the edge of the array. This is illustrated in
The CRA shifting function θCRA-shift may be selected so as to substantially match the lens CRA function θCRA-lens across the image sensor 112 in order to reduce the CRA mismatch ΔCRA and adverse effects associated therewith. For example, the CRA mismatch ΔCRA is equal to zero at both the centered and off-centered pixel positions depicted in
It is appreciated that the values of θCRA-lens, θmin, and θmax in Equation (6) depend not only on pixel position (Hx, Hy), but also on the position of the imaging lens 106 relative to the TDM 108 and the image sensor 112. Reference is made in this regard to
Referring to
In the present description, the image of a uniform scene can be referred to as a “uniform-field image”, a “uniform-scene image”, or a “flat-field image”. In some embodiments, the uniform-field image may be acquired by imaging a plain, non-textured background (e.g., a white or uniformly colored wall, screen, or surface) under uniform illumination, or by placing an optical diffuser in front of the imaging system during image capture. In such embodiments, the uniform-field image may be acquired in a laboratory or manufacturing setting or another controlled environment. In other embodiments, the uniform-field image may be obtained from one or several images acquired during normal operation of the imaging system by a user. In such embodiments, the uniform-field image may be acquired “on the fly” or automatically, without user intervention or knowledge. Depending on the application, the uniform-field image can be a full-frame image (i.e., obtained using all the pixels of the image sensor) or a partial-frame image (i.e., obtained using a reduced number of pixels of the image sensor).
It is appreciated that the image of a uniform scene is expected not to contain depth cues, such as edges and textures. This means that in the ideal case, the image of a uniform scene acquired by a TDM-based imaging system 100 should not contain any angle-dependent information about the scene 104 itself. Rather, any angle-dependent information encoded in the uniform-field image by the TDM 108 may be attributed to the characteristics of components of the imaging system 100 involved in the image capture process, including the imaging lens 106, the TDM 108, and the image sensor 112. From Equations (2), (3), and (6), a uniform-field image iUF(Hx, Hy) captured by a TDM-based imaging system 100 can be defined as follows:
In Equation (7), the image data captured by the image sensor 112 for generating the uniform-field image iUF(Hx, Hy) includes a first set of pixel responses i+(Hx, Hy) corresponding to a first set of pixels 130O of the image sensor 112 and a second set of pixel responses i−(Hx, Hy) corresponding to a second set of pixels 130E of the image sensor 112, where i+(Hx, Hy) and i−(Hx, Hy) vary differently from each other as a function of angle of incidence due to the angle-dependent encoding provided by the TDM 108. In particular, as noted above, the pixel responses i+(Hx, Hy) of the first set have magnitudes that increase as the angle of incidence increases, while the pixel responses of the second set i−(Hx, Hy) have magnitudes that decrease as the angle of incidence increases. In such embodiments, the step 204 of generating the uniform-field image iUF(Hx, Hy) includes substeps of computing a plurality of summed pixel responses isum(Hx, Hy) based on a sum operation between the first set of pixel responses i+(Hx, Hy) and the second set of pixel responses i−(Hx, Hy); computing a plurality of differential pixel responses idiff(Hx, Hy) based on a difference operation between the first set of pixel responses i+(Hx, Hy) and the second set of pixel responses i−(Hx, Hy); and determining an intensity value of each image point (Hx, Hy) of the uniform-field image iUF(Hx, Hy) as a ratio of a respective one of the plurality of differential pixel responses idiff(Hx, Hy) to a respective one of the plurality of summed pixel responses isum(Hx, Hy), where the plurality of intensity values of the plurality of image points defines the intensity profile of the uniform-field image.
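A minimal sketch of this per-point computation follows, with plain Python lists of placeholder values standing in for the two interleaved pixel sets:

```python
def uniform_field_image(i_plus, i_minus):
    """Build the uniform-field image point by point as i_diff / i_sum."""
    image = []
    for p, q in zip(i_plus, i_minus):
        s, d = p + q, p - q
        image.append(d / s if s != 0.0 else 0.0)
    return image

# Placeholder odd/even responses; doubling both sets (i.e., doubling the
# illumination intensity I0) leaves the ratio image essentially unchanged.
odd = [0.52, 0.55, 0.60]
even = [0.48, 0.45, 0.40]
i_uf = uniform_field_image(odd, even)
i_uf_bright = uniform_field_image([2 * v for v in odd], [2 * v for v in even])
```

The ratio form is what makes iUF substantially independent of the received intensity I0, as noted in the following paragraph of the description.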
It is appreciated that defining iUF(Hx, Hy) in terms of the ratio between idiff(Hx, Hy) and isum(Hx, Hy) can make iUF(Hx, Hy) independent, or substantially independent, of the intensity I0 of the light 102 received from the uniform scene 104. It is also appreciated that when Isum and Idiff are given by Equations (2) and (3), respectively, Equation (7) can be written as:
which can be simplified as follows when β(θmax+ΔCRA) and β(−θmin+ΔCRA) are much smaller than 1:
It is appreciated that iUF(Hx, Hy) in Equation (8) is equal to zero at any pixel position where the interval [−θmin, θmax] is symmetric about θ=−ΔCRA, where, as noted above, θ is defined with respect to θCRA-lens. This is generally the case at the center of the array, where ΔCRA is expected to be zero (since both θCRA-lens and θCRA-shift are expected to be zero) and θmin and θmax are expected to be equal. However, iUF(Hx, Hy) is generally not equal to zero at an arbitrary value of (Hx, Hy). This is illustrated in
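The vanishing of iUF for a cone of incidence symmetric about θ = −ΔCRA can be checked numerically. The sketch below assumes a sinusoidal pixel-response model I±(θ) = (1/2)[1 ± m·sin(βθ)] and a rectangular cone of angles; both the model form and the parameter values are illustrative assumptions, not taken from the description:

```python
import math

def integrated_responses(delta_cra, theta_min, theta_max, m=0.5, beta=0.1, n=1000):
    """Midpoint-rule integration of assumed I±(θ + Δ_CRA) over [-θ_min, θ_max]."""
    i_plus = i_minus = 0.0
    step = (theta_max + theta_min) / n
    for k in range(n):
        theta = -theta_min + (k + 0.5) * step
        s = m * math.sin(beta * (theta + delta_cra))
        i_plus += 0.5 * (1.0 + s) * step
        i_minus += 0.5 * (1.0 - s) * step
    return i_plus, i_minus

def uf_value(delta_cra, theta_min=0.2, theta_max=0.2):
    """Normalized differential response i_diff / i_sum at one pixel position."""
    ip, im = integrated_responses(delta_cra, theta_min, theta_max)
    return (ip - im) / (ip + im)

uf_centered = uf_value(0.0)      # symmetric cone, zero CRA mismatch: vanishes
uf_mismatched = uf_value(0.05)   # nonzero CRA mismatch: nonzero i_UF
```

With a symmetric cone and zero mismatch the odd integrand cancels exactly, matching the behavior expected at the center of the array; a nonzero ΔCRA leaves a residual signal, matching the behavior at off-center pixel positions.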
It is also appreciated that the intensity profile of iUF(Hx, Hy) as a function of pixel position (Hx, Hy) may convey information about the position of the imaging lens 106 with respect to the TDM 108 and the image sensor 112. This is illustrated in
In some embodiments, information about the position plens of the imaging lens 106 may be retrieved from measurements of iUF(Hx, Hy) using an absolute determination method, that is, without reference to prior lens position information. In such an approach, lens position information may be determined from measurements of iUF(Hx, Hy) using a model relating iUF(Hx, Hy) to the position plens of the imaging lens 106. Depending on the application, such a model may be obtained from experimental data, analytical calculations, numerical calculations, or any combination thereof. In some embodiments, the position of the imaging lens 106 may be represented as a five-parameter function plens(z, x, y, φx, φy), where z is an axial position along the optical axis 128 of the imaging lens 106, x is a first lateral position along a first lateral direction (x-axis) perpendicular to the optical axis 128, y is a second lateral position along a second lateral direction (y-axis) perpendicular to both the optical axis 128 and the x-axis, φx is a first tilt angle relative to the x-axis, and φy is a second tilt angle of the imaging lens 106 relative to the y-axis. In some embodiments, the axial position of the imaging lens 106 may be defined with respect to the image sensor 112, in which case z = zs, the lens-to-sensor distance introduced above. The model relating iUF(Hx, Hy) to plens(z, x, y, φx, φy) may include a model of how θCRA-lens, θmin, and θmax vary as a function of plens(z, x, y, φx, φy). It is noted that if the imaging lens 106 is not rotationally symmetric about the optical axis 128, the position of the imaging lens 106 may be represented as a six-parameter function plens(z, x, y, φx, φy, φz), where φz is a rotation angle of the imaging lens 106 about the optical axis 128.
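A much-simplified sketch of such an absolute determination is given below: a single axial position zs is fitted by least squares against a synthetic uniform-field profile, assuming a pinhole-style CRA model θCRA-lens = arctan(Hx/zs), a fixed design position for the CRA shift, and a linear scaling factor. All function forms and numeric values are illustrative assumptions, not the calibrated model contemplated by the description:

```python
import math

def i_uf_model(hx, z_s, z_design=4.0e-3, a=0.05):
    """Assumed forward model: i_UF proportional to the CRA mismatch between the
    actual axial lens position z_s and the design position z_design that the
    TDM CRA shift was matched to (pinhole-style CRA model)."""
    return a * (math.atan(hx / z_s) - math.atan(hx / z_design))

def fit_axial_position(hx_samples, measured, z_grid):
    """One-parameter absolute determination: pick the z_s whose modeled
    uniform-field profile best matches the measurement in least squares."""
    best_z, best_err = None, float("inf")
    for z in z_grid:
        err = sum((i_uf_model(h, z) - v) ** 2 for h, v in zip(hx_samples, measured))
        if err < best_err:
            best_z, best_err = z, err
    return best_z

hx = [i * 1.0e-4 for i in range(-10, 11)]       # sample pixel positions (meters)
truth = 4.05e-3                                 # "true" lens-to-sensor distance
measured = [i_uf_model(h, truth) for h in hx]   # synthetic noise-free measurement
grid = [4.0e-3 + k * 1.0e-5 for k in range(11)] # candidate axial positions
z_hat = fit_axial_position(hx, measured, grid)  # recovers the true position
```

A practical implementation would fit all five (or six) lens position parameters against a calibrated model rather than a one-parameter grid, but the structure of the inversion is the same.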
In other embodiments, information about the position plens of the imaging lens 106 may be retrieved from measurements of iUF(Hx, Hy) using a relative determination method, that is, with reference to prior lens position information. For example, in some embodiments, determining the current lens position information plens,1(zs,1, x1, y1, φx,1, φy,1) about the imaging lens 106 from measurements of iUF,1(Hx, Hy) can include a step of providing reference data relating an intensity profile of a reference uniform-field image iUF,0(Hx, Hy) to reference lens position information plens,0(zs,0, x0, y0, φx,0, φy,0), and a step of determining the current lens position information plens,1 from the intensity profile of the generated uniform-field image iUF,1(Hx, Hy) based on the reference data. The reference data relating iUF,0(Hx, Hy) to plens,0 may be obtained at an earlier time (e.g., at the time of manufacture or later) by performing measurements of iUF,0(Hx, Hy) under conditions where the lens position plens,0 is known or assumed to be known. In such embodiments, the current lens position plens,1 may be determined by performing the following operations: obtaining measurements of iUF,1(Hx, Hy) at the current lens position plens,1; determining, from the measurements of iUF,1(Hx, Hy) at the current lens position plens,1 and the reference data relating iUF,0(Hx, Hy) to plens,0, an intensity profile difference δiUF(Hx, Hy, δplens) = iUF,1(Hx, Hy) − iUF,0(Hx, Hy) between the intensity profile of iUF,1(Hx, Hy) and the intensity profile of iUF,0(Hx, Hy); determining lens position variation information δplens = plens,1 − plens,0 from the intensity profile difference δiUF(Hx, Hy, δplens); and determining the current lens position information plens,1 from the reference lens position information plens,0 and the lens position variation information δplens, using plens,1 = plens,0 + δplens. In some embodiments, plens,0 may be unknown or not known precisely enough.
In such embodiments, the lens position variation information δplens determined from the intensity profile difference δiUF(Hx, Hy, δplens) may be used to provide relevant current lens position information, for example, for adjusting calibration.
In some embodiments, the function δiUF(Hx, Hy, δplens) representing the difference between iUF,1(Hx, Hy) at the current lens position plens,1 and iUF,0(Hx, Hy) at the reference lens position plens,0 can be reasonably assumed to be proportional to the difference between the lens CRA function θCRA-lens,1(Hx, Hy) at plens,1 and the lens CRA function θCRA-lens,0(Hx, Hy) at plens,0. In such a case, the determination of the lens position variation information δplens from the intensity profile difference δiUF(Hx, Hy, δplens) can include a step of relating the intensity profile difference δiUF(Hx, Hy, δplens) to a variation δθCRA-lens(Hx, Hy, δplens) in the CRA function of the imaging lens 106, as follows:
where A is a scaling or proportionality factor, which can depend, inter alia, on the f-number of the imaging lens 106 and parameters of the TDM 108. Equation (10) assumes that (i) I±(θ) and θCRA-shift(Hx, Hy) do not change over time, which assumes that the position of the TDM 108 relative to the image sensor 112 remains the same during operation of the imaging system 100, and that (ii) the impact of δplens on θmin(Hx, Hy) and θmax(Hx, Hy) can be neglected compared to the impact of δplens on ΔCRA(Hx, Hy), which has been found to be typically the case in practice. The scaling factor A relating δiUF(Hx, Hy, δplens) to δθCRA-lens(Hx, Hy, δplens) may be obtained by calibration, for example, by acquiring a set of uniform-field images iUF(Hx, Hy) at a corresponding set of known lens positions plens, or using a model. It is noted that when Equation (3) applies, the scaling factor A can be approximated as βm.
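Inverting the proportionality of Equation (10) per pixel is then straightforward; the profile-difference values and TDM parameters below are placeholders, and A ≈ βm is the small-signal approximation noted above:

```python
def cra_variation_from_profile(delta_i_uf, beta, m):
    """Invert Equation (10), δi_UF = A · δθ_CRA-lens, per pixel with A ≈ βm."""
    a = beta * m
    return [d / a for d in delta_i_uf]

# Placeholder profile-difference samples and assumed TDM parameters.
delta_i_uf = [0.002, 0.004, 0.006]
beta, m = 0.1, 0.5
delta_theta = cra_variation_from_profile(delta_i_uf, beta, m)
# Each entry is the local change in lens CRA, in radians.
```
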
From Equation (10), the determination of the lens position variation information δplens from the intensity profile difference δiUF(Hx, Hy, δplens) can include a step of determining the lens position variation information δplens from the variation δθCRA-lens(Hx, Hy, δplens) in the CRA function of the imaging lens 106 using a model relating lens CRA function variations to changes in lens position. In some embodiments, the model can be established based on a nominal CRA function θCRA-lens*(Hx, Hy) of the imaging lens 106, defined at a nominal position (e.g., at zs = f and x = y = φx = φy = 0). For example, in some embodiments, θCRA-lens,1(Hx, Hy, plens,1) and θCRA-lens,0(Hx, Hy, plens,0) may be expressed in terms of the nominal CRA function θCRA-lens*(Hx, Hy) as follows:
θCRA-lens,1(Hx, Hy, plens,1)=θCRA-lens*(ftan(γx,1−φx,1)−x1, ftan (γy,1−φy,1)−y1)+φx,1, (11)
θCRA-lens,0(Hx, Hy, plens,0)=θCRA-lens*(ftan(γx,0−φx,0)−x0, ftan(γy,0−φy,0)−y0)+φx,0, (12)
where γx,1=arctan(Hx/zs,1), γy,1=arctan(Hy/zs,1), γx,0=arctan(Hx/zs,0), and γy,0=arctan(Hy/zs,0), the x-axis is parallel to the grating axis 118 of the TDM 108, and the nominal CRA function θCRA-lens*(Hx, Hy) is defined at zs=f and x=y=φx=φy=0. The relationship for γx,1, γy,1, γx,0, and γy,0 can be found from
Equation (13) provides a convenient analytical expression that relates variations in lens CRA, δθCRA-lens(Hx, Hy), to variations in lens position, δplens(δzs, δx, δy, δφx, δφy), and that depends weakly on absolute lens position information (i.e., the terms zs,0, γx,0, φx,0, γy,0, φy,0). Results calculated from the model provided by Equation (13) have been found to agree well with experiment over a range of lens positions.
In some embodiments, uniform field data δiUF(Hx, Hy) may be obtained from measurements. The measured uniform field data δiUF(Hx, Hy) may be related to lens CRA variation information δθCRA-lens(Hx, Hy) using Equation (10). In turn, the δθCRA-lens(Hx, Hy) information may be processed using Equation (13) to provide relative lens position information δplens(δzs, δx, δy, δφx, δφy), from which the current lens position plens,1=plens,0+δplens may be obtained. The current lens position plens,1 may be used to provide a current value zs,1 for the lens-to-sensor distance zs, which may be used in Equation (5) to obtain a current value zf,1 for the focus distance zf. In turn, the current focus distance value zf,1 may be used in Equation (4) to derive absolute depth information (e.g., object distance zd) from relative depth information [e.g., TDM disparity dTDM obtained from Isum and Idiff given by Equations (2) and (3)]. It is appreciated that in such embodiments, the current lens position plens,1 and/or the change in lens position δplens can be used to correct, adjust, or otherwise update the calibration curve relating dTDM (relative depth information) to zd (absolute depth information), for example, by replacing the focus distance value zf,0 stored in memory with the current focus distance value zf,1 or by adjusting the calibration curve to compensate for a lens tilt δφx or δφy.
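The update chain described above, from a measured change in lens position to a refreshed focus distance via Equation (5), can be sketched as follows. The function name is illustrative, and for simplicity only the axial component δzs of δplens is propagated; the thin-lens relation 1/f = 1/zs + 1/zf is assumed, consistent with Equation (5).

```python
def update_depth_calibration(z_s_0, delta_z_s, focal_length):
    """Propagate a measured axial lens position change delta_z_s to a
    current lens-to-sensor distance z_s,1 and, via the thin-lens
    relation of Equation (5), to a current focus distance z_f,1 used to
    refresh the disparity-to-depth calibration curve.

    All quantities are in the same length unit (e.g., millimeters).
    """
    z_s_1 = z_s_0 + delta_z_s
    # Thin-lens relation: 1/f = 1/z_s + 1/z_f  =>  z_f = 1/(1/f - 1/z_s)
    z_f_1 = 1.0 / (1.0 / focal_length - 1.0 / z_s_1)
    return z_s_1, z_f_1
```

For example, with f = 4 mm and zs drifting from 4.0 mm to 4.2 mm, the focus distance moves to 84 mm, and the stored zf,0 would be replaced accordingly.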
Depending on the application, the imaging lens 106 may be a fixed-focus lens or a variable-focus lens. In fixed-focus applications, the lens position determination method disclosed herein may be used to correct or compensate for inadvertent variations over time in the position of the imaging lens 106 with respect to the image sensor 112 (e.g., due to thermal expansion, mechanical shocks, positional drifts in lens or sensor components, and the like). In variable-focus lens applications, for example, applications with autofocus capabilities, the lens position determination method may be used at suitable intervals, for example, every time focus is changed, to reassess the position of the imaging lens 106.
In some embodiments, the focus distance zf can be changed to provide more accurate or robust absolute depth estimation. For example, relative depth information is often more accurate when zd is close to zf, so that bringing the focus to the object to be measured can increase the accuracy of the depth measurement. In such embodiments, the lens position determination method disclosed herein can be used to provide a current value of zs, from which a current value of zf can be obtained and used to adjust the calibration curve relating dTDM to zd.
In some embodiments, the present techniques may be used in a laboratory or manufacturing environment, for example, to obtain initial calibration information relating TDM disparity dTDM to object distance zd at various focus distances zf, or to provide feedback or monitoring on lens alignment during production. In such embodiments, the scene 104 that is imaged to provide the image data used to generate a uniform-field image for lens position determination can be a plain, non-textured background representative of a uniform field. In this case, the captured image data can include at least one image of the scene (e.g., a single or a few images), and the uniform-field image can be generated from the at least one image without performing a prior step of removing depth cues from the at least one image. In some embodiments, the present techniques may be used in stereoscopic and multiscopic imaging systems to verify and ensure that the lens positions of the two or more cameras are the same or within some predetermined tolerance of one another.
In some embodiments, the present techniques may be used during normal operation of an imaging system by a user. In such embodiments, obtaining an image of a blank, texture-less scene may not be practical or even possible. In some embodiments, in order to address this issue, uniform-field image data iUF(Hx, Hy) may be acquired not by directly capturing an image of a uniform background scene, but by capturing one or several “normal” images, where each normal image need not be an image of a uniform scene, and by processing the one or more normal images to obtain uniform-field image data iUF(Hx, Hy) that is representative of a uniform scene. In some embodiments, the processing of the one or more normal images can include steps of removing depth cues from the one or more normal images, combining the one or more normal images of the scene with removed depth cues into a fused image, and generating the uniform-field image iUF(Hx, Hy) from the fused image of the scene. As noted above, in such embodiments, the uniform-field image data iUF(Hx, Hy) may be acquired on the fly or automatically, without user intervention or knowledge.
In some embodiments, the acquisition of uniform-field image data iUF(Hx, Hy) may include a step of capturing a stream of images during normal use of the imaging system by a user, and a step of combining and processing the normal images together to remove therefrom edges and other depth information, thereby obtaining the uniform-field image data iUF(Hx, Hy). The processing may involve applying a gradient operator for edge detection (or another suitable operator or combination of operators) on the normal images and removing therefrom detected edges in accordance with a specified threshold condition. For example, in some embodiments, the number of normal images may range from about 3 to about 300, for example from about 3 to about 50, although fewer or more normal images may be used in other embodiments. In some embodiments, it may not be efficient to keep a buffer of ten or more normal images in memory. In such embodiments, a moving or rolling average approach may be implemented in which the averaged or fused uniform-field image data iUF(Hx, Hy) is obtained progressively, one or a few normal images at a time, so that only one or a few normal images are kept in memory at any given time. This approach can improve buffer management efficiency, which can be advantageous in applications where memory size and/or access bandwidth are limited. Once the uniform-field image data iUF(Hx, Hy) has been reconstructed on the fly, it can be used such as described above, for example, for determining lens position and focus distance information, adjusting calibration data relating relative and absolute depth information, and/or determining absolute depth information from relative depth information obtained from TDM disparity measurements.
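The rolling-average approach above can be sketched as follows: each incoming normal image is edge-masked with a gradient threshold, and only the flat, texture-free pixels contribute to a running per-pixel average, so a single accumulator replaces a buffer of images. The class name, the gradient operator, and the threshold value are illustrative assumptions, not prescribed by the description.

```python
import numpy as np

class RollingUniformField:
    """Accumulate a uniform-field estimate from a stream of normal images,
    keeping only one accumulator (not an image buffer) in memory."""

    def __init__(self, shape, grad_threshold=0.02):
        self.sum = np.zeros(shape)
        self.count = np.zeros(shape)
        self.grad_threshold = grad_threshold

    def add(self, image):
        """Fold one normal image into the running average, skipping
        pixels whose local gradient magnitude indicates an edge."""
        image = np.asarray(image, dtype=float)
        gy, gx = np.gradient(image)
        flat = np.hypot(gx, gy) < self.grad_threshold  # edge-free pixels
        self.sum[flat] += image[flat]
        self.count[flat] += 1

    def uniform_field(self):
        """Per-pixel mean over all edge-free contributions seen so far;
        pixels with no contribution yet are returned as NaN."""
        out = np.full(self.sum.shape, np.nan)
        seen = self.count > 0
        out[seen] = self.sum[seen] / self.count[seen]
        return out
```

A production variant might also weight contributions by exposure or discard saturated pixels before accumulation.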
Referring to
For simplicity, several embodiments described above include TDMs provided with a single diffraction grating and, thus, a single grating orientation. However, it is appreciated that, in practice, TDMs will generally include a large number of diffraction gratings and may include multiple grating orientations. Referring
In some embodiments, the first set of diffraction gratings 116a and the second set of diffraction gratings 116b may be interleaved in rows and columns to define a checkerboard pattern. It is appreciated, however, that any other suitable regular or irregular arrangements of orthogonally or non-orthogonally oriented sets of diffraction gratings may be used in other embodiments. For example, in some variants, the orthogonally oriented sets of diffraction gratings may be arranged to alternate only in rows or only in columns, or be arranged randomly. Other variants may include more than two sets of diffraction gratings. Providing TDMs with multiple grating orientations can improve lens position determination by providing multiple sources of lens position information.
In addition, although several embodiments described above include TDMs provided with one-dimensional, binary phase gratings formed of alternating sets of parallel ridges and grooves defining a square-wave grating profile, other embodiments may use TDMs with other types of diffraction gratings. For example, other embodiments may use diffraction gratings where any, some, or all of the grating period, the duty cycle, and the step height are variable; diffraction gratings with non-straight features perpendicular to the grating axis; diffraction gratings having more elaborate grating profiles; 2D diffraction gratings; photonic crystal diffraction gratings; and the like. The properties of the diffracted light may be tailored by proper selection of the grating parameters. Furthermore, in embodiments where TDMs include multiple sets of diffraction gratings, the diffraction gratings in different sets need not be identical. In general, a TDM may be provided as a grating tile made up of many grating types, each grating type being characterized by a particular set of grating parameters. Non-limiting examples of such grating parameters include the grating orientation, the grating period, the duty cycle, the step height, the number of grating periods, the lateral offset with respect to the underlying pixels and/or color filters, the grating-to-sensor distance, and the like.
It is appreciated that the lens position determination techniques described herein can be used in focus adjustment processes, for example, to adjust the focal distance of an imaging system to a specified value. In some applications, the present techniques can be used to complement, enhance, or replace conventional focus adjustment methods, such as phase detection autofocus methods or other autofocus methods based on contrast or texture detection. Such conventional methods can involve measuring a disparity or another metric of an object in a scene, and using the measured disparity or metric as a feedback mechanism to adjust the lens position and bring the object into focus. Such a process can take a few frames to converge to the desired focus distance. Furthermore, if the scene is very blurry or has low contrast, conventional autofocus methods tend to be slower and less accurate, or may even become inoperable.
Referring to
The focus adjustment method 300 of
The focus adjustment method 300 also includes a step 304 of determining a target lens-to-sensor distance zs,target between the imaging lens and the image sensor, the target lens-to-sensor distance zs,target corresponding to the target focus distance zf,target. In some embodiments, the step 304 of determining the target lens-to-sensor distance zs,target corresponding to the target focus distance zf,target can include computing the target lens-to-sensor distance zs,target from the target focus distance zf,target and a focal length f of the imaging lens according to the thin-lens equation as follows: zs,target=[(1/f)−(1/zf,target)]^−1 [see also Equation (5)].
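The computation of step 304 can be sketched directly from the thin-lens relation; the function name is illustrative. A real image requires the target focus distance to exceed the focal length, which the sketch checks explicitly.

```python
def target_lens_to_sensor_distance(z_f_target, focal_length):
    """Thin-lens computation of step 304:
        z_s,target = [(1/f) - (1/z_f,target)]^(-1).

    Quantities share one length unit (e.g., millimeters)."""
    if z_f_target <= focal_length:
        raise ValueError("target focus distance must exceed the focal length")
    return 1.0 / (1.0 / focal_length - 1.0 / z_f_target)
```

For a 4 mm lens focused at 84 mm, this gives a target lens-to-sensor distance of 4.2 mm, the inverse of the example used with Equation (5) above.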
The focus adjustment method 300 further includes performing a lens position adjustment operation 306 including one or more iterative cycles. In
As noted above, precisely adjusting the position of a lens to a specific lens position can be challenging or impractical to achieve. In order to verify the accuracy of the lens moving step 308, the lens position adjustment operation 306 can include a step 310 of determining, using a method of determining lens position information such as described herein, current lens position information about the imaging lens. For example, returning to
Once the current lens-to-sensor distance zs,current has been determined, the method 300 of
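The iterative lens position adjustment operation 306, with the lens position determination method serving as the measurement in the feedback loop, can be sketched as follows. The callbacks `move_lens` and `measure_z_s` are hypothetical stand-ins for the lens actuator command and the uniform-field-based lens position measurement, respectively; the tolerance and iteration limit are illustrative.

```python
def adjust_focus(z_s_target, move_lens, measure_z_s,
                 tolerance=1e-3, max_iterations=10):
    """Sketch of lens position adjustment operation 306: measure the
    current lens-to-sensor distance, command a corrective displacement
    toward the target, and repeat until within tolerance."""
    for _ in range(max_iterations):
        z_s_current = measure_z_s()
        error = z_s_target - z_s_current
        if abs(error) <= tolerance:
            return z_s_current
        move_lens(error)  # imperfect actuators motivate re-measurement
    return measure_z_s()
```

Because the actuator need not realize the commanded displacement exactly, the loop converges geometrically even with a substantial actuator gain error, which is the situation the re-measurement step 310 is designed to handle.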
In some embodiments, the focus adjustment method disclosed herein can be used to kick-start a conventional autofocus system when the imaging lens is very far from its optimal position and the image is very blurry. Knowing the usual working range of the imaging system, the lens position determination techniques disclosed herein can be applied to bring the focus closer to the working range, at which point the conventional autofocus system can take over.
In some embodiments, a user may be prompted to select where in the scene the focus should be placed. If the selected region happens to be texture-less, conventional autofocus methods may fail, in which case using the lens position determination techniques disclosed herein in a focus adjustment method can provide a viable alternative.
In some applications, it may be desired or required that autofocus be faster or nearly instantaneous, which can be an issue with conventional autofocus methods that require a few frames for the autofocus operation to converge. In some embodiments, the present techniques can be used to directly set the focus distance to correspond to a predicted position of an object in an upcoming frame. Such embodiments can forego waiting for a picture of the object to adjust the autofocus feedback loop, and instead can use a lens positioning feedback loop such as described above (see lens position adjustment operation 306 of the focus adjustment method 300 depicted in
Although several embodiments described above use TDMs as optical encoders of angle-of-incidence information, other embodiments may use other types of optical encoders with angle encoding capabilities. Referring to
The provision of the microlens array 144 interposed between the image sensor 112 and the scene 104, where each microlens 146 covers two or more pixels 130 of the image sensor 112, can impart the imaging system 100 with 3D imaging capabilities, including depth sensing capabilities. This is because the different pixels 130 in each pixel pair or group under a given microlens 146 have different angular responses, that is, they produce different pixel responses in response to varying the angle of incidence of the received light 102, similar to the odd and even pixel responses i±(Hx, Hy) introduced above with respect to TDM-based implementations. These different pixel responses i±(Hx, Hy) may be processed to provide uniform-field image data iUF(Hx, Hy) which can be used such as described above to provide position information about the imaging lens 106. In such implementations, the pixels 130 of the image sensor 112 may be referred to as phase detection pixels. It is appreciated that although the embodiment of
For example, in some embodiments, each microlens 146 may cover a group of 2×2 pixels 130, as depicted in
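For the 2×2 (quad-pixel) arrangement, the formation of angle-dependent responses analogous to the i± responses of the TDM-based implementations can be sketched as follows. The 2×2 tiling convention (left subpixels versus right subpixels under each microlens, averaged over the two rows) and the function name are assumptions for illustration.

```python
import numpy as np

def quad_pixel_responses(raw):
    """From a readout where each microlens covers a 2x2 pixel group,
    form left/right subpixel responses analogous to the i+/i- pixel
    responses, then their sum and difference.

    `raw` has shape (2M, 2N); returns (i_sum, i_diff), each (M, N)."""
    left = raw[:, 0::2]    # left subpixel columns under each microlens
    right = raw[:, 1::2]   # right subpixel columns
    i_plus = 0.5 * (left[0::2] + left[1::2])    # average the two rows
    i_minus = 0.5 * (right[0::2] + right[1::2])
    return i_plus + i_minus, i_plus - i_minus
```

As with the TDM case, the ratio of the difference to the sum would then yield uniform-field image data iUF(Hx, Hy) carrying the angle-of-incidence information.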
It is appreciated that the structure, configuration, and operation of imaging devices using phase detection pixels, quad-pixel technology, dual-pixel technology, half-masked pixel technologies, and other approaches using microlens arrays over pixel arrays to provide 3D imaging capabilities are generally known in the art, and need not be described in detail herein other than to facilitate an understanding of the techniques disclosed herein.
In accordance with another aspect of the present description, there is provided a non-transitory computer readable storage medium having stored thereon computer readable instructions that, when executed by a processor, cause the processor to perform a method of lens position determination in an imaging system including an imaging lens, an image sensor having an array of pixels, and an angle-sensitive optical encoder interposed between the imaging lens and the image sensor. For example, the method can include receiving image data from a scene captured by the image sensor, the image sensor being configured to detect, with the array of pixels, light incident from the scene having passed through the imaging lens and the optical encoder, the optical encoder being configured to encode angle-dependent information about the incident light having passed therethrough in the captured image data in accordance with the angular response; generating a uniform-field image from the captured image data, the uniform-field image having an intensity profile that varies with image position in accordance with the angle-dependent information encoded in the captured image data; and determining current lens position information about the imaging lens from the intensity profile of the generated uniform-field image.
In accordance with another aspect of the present description, there is provided a computer device including a processor and a non-transitory computer readable storage medium such as described herein and being operatively coupled to the processor.
Numerous modifications could be made to the embodiments described above without departing from the scope of the appended claims.
The present application claims priority to U.S. Provisional Patent Application No. 63/137,791 filed on Jan. 15, 2021, the disclosure of which is incorporated herein by reference in its entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CA2022/050018 | 1/7/2022 | WO |
Number | Date | Country | |
---|---|---|---|
63137791 | Jan 2021 | US |