Optics can be thought of as performing mathematical operations that transform light intensities from different incident angles to locations on a two-dimensional image sensor. In the case of focusing optics, this transformation is the identity function: each angle is mapped to a distinct corresponding point on the image sensor. When focusing optics are impractical due to size, cost, or material constraints, the right diffractive optic can perform an operation other than the identity function that is nonetheless useful for producing a final image. In such cases the sensed data may bear little or no resemblance to the captured scene, but may nevertheless provide useful visual acuity to detect elements of interest in a monitored scene. A digital image can be computed from the sensed data if an application calls for image data that is sensible to human observers.
Each subgrating gi,j produces a similar interference pattern for capture by the subset of nine underlying pixels p0,0 through p2,2. As a result, the overall pixel array collectively samples nine similar nine-pixel patterns, each a relatively low-resolution representation of the same scene. A processor, introduced below, can accumulate these similar patterns into a relatively low-noise digest of the scene.
Imaging device 100 has a large effective aperture, as every point in an imaged scene illuminates the entire light-receiving surface of grating 105. Three-by-three arrays of subgratings, subarrays, and the digest are shown for ease of illustration. Practical embodiments may have many more pixels and subgratings, different ratios of pixels to subgratings, and different ratios of subgratings to subarrays. Some examples are detailed below.
Returning to FIG. 1, there are only nine subgratings gi,j and eighty-one pixels px,y in this simple illustration, but a practical embodiment can have e.g. hundreds or thousands of subgratings overlaying dense collections of pixels. An embodiment with 16×16 (256) subgratings over 1,024×1,024 (1M) pixels might produce a 4K (1M/256) pixel digest with much lower noise than is apparent in the raw 1M-pixel data, and one that places a proportionally lower data burden on image processing and communication. In other embodiments the digest can correspond to more or fewer tiles, and the number of pixels per subgrating and digest can be different.
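By way of illustration, the pooling that produces such a digest can be sketched in a few lines. The 16×16 tiling and the numpy-based formulation below are assumptions for illustration, not a prescribed implementation.

```python
import numpy as np

TILES = 16            # subgratings per side (assumed)
TILE = 1024 // TILES  # 64 pixels per subgrating side

def digest(frame: np.ndarray) -> np.ndarray:
    """Sum homologous pixels across all 256 subarrays.

    frame: (1024, 1024) array of raw intensity samples.
    Returns a (64, 64) digest of 4,096 (4K) entries; pooling 256
    samples per entry cuts uncorrelated noise by sqrt(256) = 16x.
    """
    t = frame.reshape(TILES, TILE, TILES, TILE)
    return t.sum(axis=(0, 2))  # accumulate over tile rows and columns
```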
A low-power microcontroller or digital signal processor with a reasonable clock and very modest RAM (2 kB or so) can compute a digest alongside the pixel array, relaying the digest at a modest transfer rate over a lightweight protocol such as Serial Peripheral Interface (SPI) or Inter-Integrated Circuit (I2C). An exemplary embodiment, not shown, includes a 54×30 array of subgratings over a full-HD sensor (1920×1080 pixels) with a 2-micron pixel pitch. A digest pooled from all 1,620 (54×30) subarrays averages 1,620 samples per entry, reducing uncorrelated noise by a factor of about forty and improving low-light sensitivity. If a higher frame rate is needed, a sensor with a rolling shutter can scan across the scene vertically or horizontally, trading temporal for spatial oversampling: 54× spatial oversampling can be obtained at 30× temporal oversampling. Any intermediate scheme is also available, as are schemes with short-pulsed LEDs for portions of the rolling exposure, where multiple single-frame differential measurements are possible.
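Such a streaming accumulator needs memory for only the digest itself, not a full frame; each pixel sample can be folded in as it arrives. The sketch below illustrates the idea, with tile dimensions that are illustrative assumptions.

```python
# Illustrative streaming digest: fold each arriving sample into the
# digest entry for its homologous position, so RAM holds only one
# tile-sized digest rather than a full frame. Tile size is assumed.
TILE_H, TILE_W = 36, 35  # assumed subarray dimensions in pixels

digest = [[0] * TILE_W for _ in range(TILE_H)]

def accumulate(row: int, col: int, sample: int) -> None:
    # Homologous pixels share the same coordinates modulo the tile pitch.
    digest[row % TILE_H][col % TILE_W] += sample
```

Once a frame (or any chosen exposure window) completes, the finished digest can be relayed over SPI or I2C and the accumulators cleared.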
Subgratings gi,j are periodic and identical in the preceding examples. Performance may be enhanced by warping the subgratings such that the point-spread functions (PSFs) of the different subgratings lack translation symmetry. Deliberately detuning the grating thickness can likewise introduce asymmetry in the point-source strengths, further breaking the symmetry. In these cases, the warpings can themselves have a longer-scale periodicity, and the digest can reflect the diversity of signals over the largest optically relevant periodicity.
Phase gratings of the type used for subgratings gi,j are detailed in U.S. Pat. No. 9,110,240 to Gill and Stork, which is incorporated herein by this reference. Briefly, and in connection with subgrating g2,2, subgratings gi,j are of a material that is transparent to IR light. The surface of subgratings gi,j includes transparent features 110 (black) and 115 (white) that define between them boundaries of odd symmetry. Features 110 are raised in the Z dimension (normal to the view) relative to features 115, and are shown in black to elucidate this topography. As detailed below, these boundaries of odd symmetry produce curtains of destructive interference on the underlying pixel array.
Adjacent features 110 and 115 form six illustrative odd-symmetry boundaries 304, each indicated using a vertical, dashed line. The lower features 115 induce phase retardations of half a wavelength (π radians) relative to the upper features 110, and features 305 and 310 on either side of each boundary exhibit odd symmetry. The different phase delays create curtains of destructive interference separated by relatively bright foci, which together form an interference pattern on pixel array 303. Features 305 and 310 are of uniform width in this simple illustration, but can vary across each subgrating gi,j and from subgrating to subgrating.
Imaging device 300 includes an integrated circuit (IC) device 315 that supports image acquisition and processing. IC device 315 includes a processor 320, random-access memory (RAM) 325, and read-only memory (ROM) 330. ROM 330 can store a digital representation of the point-spread function (PSF) of subgratings gi,j, possibly in combination with array 303, from which a noise-dependent deconvolution kernel may be computed. ROM 330 can also store the deconvolution kernel along with other parameters or lookup tables in support of image processing.
Processor 320 captures digital image data from the pixel array, accumulates the intensity values from homologous pixels into a digest (not shown), and uses the digest with the stored PSF or deconvolution kernel to e.g. compute images and extract other image data. In other embodiments the digest can be generated locally and conveyed to an external resource for processing. Processor 320 uses RAM 325 to read and write data, including e.g. digest 150 of FIG. 1.
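One common way to derive such a noise-dependent kernel from a stored PSF is Wiener deconvolution. The sketch below is offered only as an illustration of the idea; the FFT-based formulation and the noise-to-signal parameter are assumptions, not the patented method.

```python
import numpy as np

def wiener_kernel(psf: np.ndarray, nsr: float) -> np.ndarray:
    """Noise-dependent Fourier-domain kernel: H* / (|H|^2 + NSR),
    where H is the spectrum of the stored PSF and nsr is an assumed
    noise-to-signal ratio. Could be precomputed and held in ROM."""
    H = np.fft.fft2(psf)
    return np.conj(H) / (np.abs(H) ** 2 + nsr)

def deconvolve(digest: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Apply the precomputed kernel to a digest of matching shape."""
    return np.real(np.fft.ifft2(np.fft.fft2(digest) * kernel))
```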
A point source of light (not shown) far from imaging device 300 will produce nearly the same response on each subarray, with each response shifted about eight degrees horizontally or vertically for each successive subgrating. Array 303 captures raw intensity data, which is passed on a per-pixel basis to processor 320. Processor 320 computes a running sum of intensity values from each pixel in each homologous set of pixels. Computing a 35×35 pixel digest (70 um subarray pitch divided by the 2 um pixel pitch) of intensity values yields an extremely low-noise rendition of the light intensity for each pixel beneath a typical instance of a subgrating. Processor 320, possibly in combination with computational resources external to imaging device 300, can perform machine learning on the digest for e.g. pattern classification and gesture recognition.
Imaging device 300 may have defective pixels, either known a priori or identified by values incompatible with expectations. Processor 320 can be programmed to ignore defective pixels through simple logical tests, and at the application level one or two “spare” tiles can be provided, their data used only when a bad pixel is encountered while streaming the data. Thus the same number of pixels may be used to generate each entry in a digest even if a few bad pixels are rejected.
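One way to keep sample counts equal is to substitute, for each defective pixel, the homologous pixel from a spare tile. The sketch below assumes a stacked-tile layout; the names and shapes are illustrative.

```python
import numpy as np

def digest_with_spares(tiles: np.ndarray, defects: np.ndarray,
                       spare: np.ndarray) -> np.ndarray:
    """tiles:   (N, H, W) stack of subarray samples.
    defects: (N, H, W) boolean mask, True where a pixel is bad.
    spare:   (H, W) samples from a known-good spare tile.
    Every digest entry sums exactly N samples, bad pixels or not."""
    patched = np.where(defects, spare, tiles)  # swap in spare samples
    return patched.sum(axis=0)
```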
Computational focusing (potentially at multiple depth planes simultaneously) can be achieved by keeping a digest of pixel data with a slightly larger array pitch than the optical tile. For example, a 36×36 digest of the scene generated by a 70×70 um subgrating would be sensitive to objects a little closer than infinity (22 mm in the case of the device of FIG. 3).
If an object at infinity would produce a signal with 35-pixel horizontal periodicity, accumulating with a (say) 36-pixel repetition over a block of the sensor 1,260 pixels wide (1,260 = 35 × 36) should produce exactly no signal in expectation, since each of the 36 elements of the digest gets precisely the same complement of contributions from the 35-pixel-wide true optical repetition. Any signal generated by this averaging comes from an object measurably closer than infinity, and statistically significant deviations from a uniform distribution indicate a nearby object. This type of sensing may be useful in range finding for e.g. drone soft landing.
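A minimal sketch of this test follows; the period, block width, and the particular uniformity statistic are assumptions for illustration.

```python
import numpy as np

def near_object_cue(row: np.ndarray, period: int = 36,
                    optical_period: int = 35) -> float:
    """Accumulate one sensor row with a 36-pixel period over blocks
    1,260 pixels wide (35 * 36). A scene at infinity repeats every
    35 pixels, so every bin receives the same mix of contributions
    and the digest is flat in expectation; significant structure in
    the bins indicates an object measurably closer than infinity."""
    block_len = (len(row) // (period * optical_period)) * period * optical_period
    bins = row[:block_len].reshape(-1, period).sum(axis=0)
    return float(bins.std() / max(float(bins.mean()), 1e-9))
```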
The foregoing examples exhibit integer-valued pixel pitches. However, non-integer effective spatial pitches are also realizable by e.g. skipping a pixel column every second tile (for half-integer expected repetitions), or once or twice per block of pixels three tiles wide (for integer +⅓ and integer +⅔ expected periods), etc. Another approach is to use spatial interpolation to accommodate non-integer expected pixel shifts.
Imaging device 600 can be used to image proximate scenes, such as to track eye movement from the vantage point of a glasses frame. The large effective aperture is advantageous for this application because active illumination power is best minimized for power consumption and user safety. Excessive depth of field can pose a problem for eye tracking in such close proximity because eyelashes can obscure the view of the eye. The spatial pitch of imaging device 600, set by the separation of gratings 610, allows device 600 to exhibit depth sensitivity that can blur lashes relative to the eye. For example, given an eye relief distance of 22 mm, the pitch of repeated structures would be 135 pixels × 22.329 mm/22 mm ≈ 137 pixels, not the 135 pixels of objects at infinity. Eyelashes, on average 7 mm closer than the eye features, have a pixel repetition pitch of 135 pixels × 15.329/15 ≈ 138 pixels, so the one-pixel pitch mismatch per tile accumulates across 14 horizontal tiles to blur the effect of an eyelash horizontally by 14 pixels. This 14-pixel blur spreads eyelashes by about 4.9 degrees, or 1.27 mm at a 15 mm standoff, which is about 4× more blur than an eyelash is thick. The optical effective distance of the glints, or first Purkinje reflections, of the light sources can be greater than the optical effective distance to the pupil features, so Purkinje images may be best focused under the assumption of a 135.5-pixel repetition pitch. If it is desirable to form in-focus images of both glints and pupil features, special processing can compute separate subarray signals from a single data stream, one assuming a 137-pixel pitch and the other assuming a 135.5-pixel repetition pitch.
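The pitch arithmetic above can be captured in a short helper. The 0.329 mm grating-to-sensor separation is inferred from the 22.329 mm figure in the text and should be read as an assumption.

```python
SEPARATION_MM = 0.329    # assumed grating-to-sensor separation
PITCH_AT_INFINITY = 135  # pixel repetition pitch for distant objects

def repetition_pitch(standoff_mm: float) -> float:
    """Pixel repetition pitch for an object at the given standoff."""
    return PITCH_AT_INFINITY * (standoff_mm + SEPARATION_MM) / standoff_mm

print(round(repetition_pitch(22.0)))  # eye features: 137 pixels
print(round(repetition_pitch(15.0)))  # eyelashes:    138 pixels
```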
In some embodiments with multiple illumination conditions, the mean intensity of any one illumination condition is not per se useful; only the difference between illumination conditions is required by the application. In this case, to reduce the quantity of memory required, processor 705 can increment a digest under a first illumination condition and decrement it under a subsequent condition. More complicated schedules of incrementing and decrementing digest 150 can also be desirable. For example, to detect only the polarization-dependent reflectivity of a scene in which some background light may also be polarized, a fixed-polarization illumination source such as a polarized LED could be used in conjunction with a liquid crystal over the sensor. Here, four conditions are relevant: LED on or off, in conjunction with aligned or crossed polarization of the liquid crystal. One relevant signal could be the component of the reflected LED light that is polarization-dependent, calculated as the sum of the parallel polarization with the LED on and the crossed polarization with the LED off, minus the sum of the crossed polarization with the LED on and the parallel polarization with the LED off. This digest can be accumulated differentially as described above, requiring only one quarter of the memory that would be needed if each of the four digests were stored independently.
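The four-condition schedule can be expressed as a signed accumulation into a single digest; the condition labels and tile size below are illustrative.

```python
import numpy as np

# Signs per (LED state, liquid-crystal polarization), per the text:
# +(on, parallel) +(off, crossed) -(on, crossed) -(off, parallel).
SIGN = {("on", "parallel"): +1, ("off", "crossed"): +1,
        ("on", "crossed"): -1, ("off", "parallel"): -1}

def accumulate(digest: np.ndarray, frame_digest: np.ndarray,
               led: str, pol: str) -> None:
    """Fold one condition's digest in with the scheduled sign, so a
    single signed digest replaces four independently stored ones."""
    digest += SIGN[(led, pol)] * frame_digest.astype(digest.dtype)

digest = np.zeros((36, 35), dtype=np.int32)  # assumed tile size
```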
In some embodiments in which image sensor 735 includes gratings or tiles, processor 705 can pulse LED 711 such that some pixel rows are exposed for a full pulse, some for no pulse, and others for an intermediate pulse exposure. Discarding the intermediate rows, any one-tile-high band of pixels with the desired exposure contains some permutation of all the data needed, even if the “top” of the logical canonical tile falls somewhere in the middle of the rows for a given illumination state. Shifting the addresses of the accumulated pixels recovers the data in the canonical arrangement, wasting no rows. Processor 705, aware of the timing of frame capture, can ensure that the various active illumination states occur at known locations within one frame.
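A sketch of the address shift: rows in a uniformly exposed band are folded into the digest modulo the tile height, so the band need not start at a tile boundary. Tile dimensions and names are illustrative.

```python
TILE_H, TILE_W = 6, 5  # assumed tile size in pixels

def fold_band(band, band_top_row, digest):
    """band: pixel rows sharing the desired illumination state.
    band_top_row: absolute sensor row index of the band's first row.
    The modulo shift recovers the canonical tile arrangement even
    when the band starts mid-tile, wasting no rows."""
    for i, row in enumerate(band):
        d_row = (band_top_row + i) % TILE_H  # canonical tile row
        for col, sample in enumerate(row):
            digest[d_row][col % TILE_W] += sample
```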
Pixel array 910 uses an exposure process of a type commonly referred to as “rolling shutter” in which rows of pixels 920 are sequentially scanned. To capture a single frame, the pixels of the top row become photosensitive first and remain so over an exposure time. Each successive row becomes photosensitive a row time after the prior row, and likewise remains photosensitive over the exposure time. The time required to scan all rows, and thus acquire data from all pixels 920 in array 910, is referred to as a “frame time.” The speed with which frames can be delivered is referred to as the “frame rate.”
Sensor 900 exploits the rolling shutter to provide successive digests 925 at a digest rate greater than the frame rate. In this example, sensor 900 accumulates and issues a digest 925 for each two rows of subgratings 905, or twelve rows of pixels 920. The digest rate is thus five times the frame rate of pixel array 910 alone. Arrows 930 show how two rows of pixels 920 are accumulated into one five-element row of each digest 925. Row exposure times are normally longer than row times in rolling-shutter devices, and arrows 930 are not intended to limit the order in which pixels are read or their sample values accumulated. In other embodiments a digest can accumulate sample data bridging multiple full or partial frames. The size and aspect ratio of digest 925 may be different, and are adjustable in some embodiments.
Sensors in accordance with other embodiments can employ exposure processes other than rolling shutter. For example, sensors that scan an entire image simultaneously are referred to as “global shutter.” Some embodiments accumulate multiple digests from a global-shutter array to measure spatial disparity for relatively nearby objects. For example, a 50×60 pixel global-shutter array can be divided into four 25×30 pixel quadrants, and each quadrant in turn divided into a 5×5 array of 5×6 pixel subarrays of homologous pixels under similar subgratings. Sample values from the twenty-five (5×5) subarrays in each quadrant can then be accumulated into a single 5×6-entry digest to provide four laterally displaced images of the same scene. Objects close to the grating will appear offset from one another in the four digests, and these offsets can be used to calculate e.g. the position of the object relative to the sensor. As in the rolling-shutter embodiment, the number, size, and shape of the digests can be different, and may be adjustable.
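A sketch of this quadrant pooling follows, using the stated dimensions; the correlation step for estimating offsets is left out, and all names are illustrative.

```python
import numpy as np

def quadrant_digests(frame: np.ndarray) -> list:
    """frame: (50, 60) global-shutter capture. Pools each 25x30
    quadrant's twenty-five 5x6 subarrays into one 5x6 digest,
    yielding four laterally displaced views of the same scene."""
    digests = []
    for qy in range(2):
        for qx in range(2):
            quad = frame[qy * 25:(qy + 1) * 25, qx * 30:(qx + 1) * 30]
            tiles = quad.reshape(5, 5, 5, 6)  # (tile_y, y, tile_x, x)
            digests.append(tiles.sum(axis=(0, 2)))
    return digests
```

Cross-correlating the four digests then yields the lateral offsets from which the position of a nearby object can be estimated.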
Pixel arrays can include superfluous pixel structures that are e.g. defective or redundant and not used for image capture. Such superfluous structures are not “pixels” as that term is used herein, as that term refers to elements that provide a measurement of illumination that is used for image acquisition. Redundant pixels can be used to take multiple measurements of pixels in equivalent positions, reducing noise.
While the subject matter has been described in connection with specific embodiments, other embodiments are also envisioned. For example, imaging devices that do not employ apertures can be used in applications that selectively defocus aspects of a scene, and the wavelength band of interest can be broader or narrower than those of the foregoing examples, and may be discontinuous. A linear array of pixels can be used alone or in combination with other linear arrays to sense one-dimensional aspects of a scene from one or more orientations. Moreover, if a given subgrating exhibits some Fourier nulls, then two or more general regions that potentially have different aspect ratios, grating designs or orientations, or any combination of the above, could provide independent measurements of the scene. Other variations will be evident to those of skill in the art. Therefore, the spirit and scope of the appended claims should not be limited to the foregoing description. Only those claims specifically reciting “means for” or “step for” should be construed in the manner required under the sixth paragraph of 35 U.S.C. §112.
Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/US18/13150 | 1/10/2018 | WO | 00
Number | Date | Country
---|---|---
62448513 | Jan 2017 | US
62539714 | Aug 2017 | US