The present invention relates generally to imaging, and more particularly to the imaging of a scene using disjoint light sensors.
In recent years, digital imaging has grown tremendously in both complexity and capability, and has achieved relatively high resolution to produce desirable images for a variety of applications. For example, cameras, video cameras, robotics, biometrics, microscopes, telescopes, security and surveillance applications use digital imaging extensively.
For many applications, pixel scaling in image sensors has aimed at increasing spatial resolution for a given optical format. However, as pixel size approaches the practical limits of optics, the improvement in resolution diminishes. In addition, other undesirable conditions, such as increased cross-talk and decreased fill factor, require further costly process modifications to remedy even where the optics can resolve the scaled dimensions. Consequently, scaling pixels into the sub-micron range has not been readily desirable.
Another aspect of digital imaging that has been challenging relates to the determination of the depth of field as applicable, for example, to three-dimensional (3D) imaging. In recent years, several 3D imaging systems implementing a variety of techniques such as stereo vision, motion parallax, depth-from-focus, and light detection and ranging (LIDAR) have been reported. In particular, multi-camera stereo vision systems infer depth using parallax from multiple perspectives, while time-of-flight sensors compute depth by measuring the delay between an emitted light pulse (e.g., from a defocused laser) and its incoming reflection. However, these systems are relatively expensive, consume high power, and require complex camera calibration. Moreover, imaging approaches that use active illumination, although accurate, generally employ large pixels and thus exhibit relatively low spatial resolution for a given format.
The above characteristics have continued to present challenges to digital imaging applications.
The present invention is directed to overcoming the above-mentioned challenges and others related to the types of applications discussed above and in other applications. These and other aspects of the present invention are exemplified in a number of illustrated implementations and applications, some of which are described below, shown in the figures and characterized in the claims section that follows.
According to an example embodiment of the present invention, a scene is imaged using disjoint sensors beyond a designated focal plane to obtain multiple views of common points in the focal plane. For the common points, the multiple views are processed to compute a depth of field, and the computed depth of field is used to generate an image.
According to another example embodiment of the present invention, a scene is imaged using a monolithic sensor arrangement having an array of optically disjoint sensors with sensor-specific integrated optics to re-image a focal plane formed from the scene.
In another example embodiment of the present invention, an integrated image sensor circuit includes a plurality of disjoint sensors in a sensor plane, each sensor including local integrated optics and pixels to re-image a focal plane formed from a scene. For certain applications, the circuit arrangement further includes an image data processing circuit that processes data from each sensor to generate an image and/or compute the depth of field of one or more objects in the scene.
The above summary is not intended to describe each illustrated embodiment or every implementation of the present invention.
The invention may be more completely understood in consideration of the detailed description of various embodiments of the invention that follows in connection with the accompanying drawings, in which:
While the invention is amenable to various modifications and alternative forms, examples thereof have been shown in the drawings and will be described in detail. It should be understood, however, that the intention is not to limit the invention to the particular embodiments shown and/or described. On the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.
The present invention is believed to be applicable to a variety of different types of imaging applications. While the present invention is not necessarily so limited, various aspects of the invention may be appreciated through a discussion of examples using this context.
In connection with various example embodiments, a multi-aperture image sensor images a scene using multiple views of the same points in a primary focal plane. The magnification of the local optics and the pixel size of each sensor set the spatial resolution, which is greater than the aperture count. In various embodiments, small pixels are used to facilitate high depth resolution while keeping spatial resolution relatively consistent (or otherwise limited). Many embodiments also involve extracting a depth map of the scene by solving a correspondence problem between the multiple views of the same points in the primary focal plane.
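By way of illustration only, the following Python sketch shows one conventional way such a correspondence problem can be approached for two aperture sub-images: sum-of-absolute-differences block matching. The function name, window size, and search range are illustrative assumptions rather than the specific processing used in any embodiment; the resulting per-point disparity (shift in pixels times pixel pitch) feeds the geometric relations developed further below to recover object distance.

import numpy as np

def disparity_map(view_a, view_b, window=3, max_shift=8):
    """Horizontal shift (in pixels) of features in view_b relative to view_a."""
    h, w = view_a.shape
    disp = np.zeros((h, w), dtype=np.int32)
    r = window // 2
    for y in range(r, h - r):
        for x in range(r, w - r):
            ref = view_a[y - r:y + r + 1, x - r:x + r + 1].astype(np.float64)
            best_cost, best_shift = np.inf, 0
            # try candidate shifts that keep the comparison window inside view_b
            for s in range(0, min(max_shift, w - 1 - r - x) + 1):
                cand = view_b[y - r:y + r + 1, x - r + s:x + r + 1 + s]
                cost = np.abs(ref - cand).sum()   # sum of absolute differences
                if cost < best_cost:
                    best_cost, best_shift = cost, s
            disp[y, x] = best_shift
    return disp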
In some embodiments, a multi-aperture image sensor architecture is used in color imaging. A per-aperture color filter array (CFA) is used to mitigate or largely eliminate color aliasing and crosstalk problems similar to those that can result from the large dielectric stack heights relative to pixel size in image sensors such as sub-micron CMOS image sensors.
In connection with another example embodiment, a single-chip multi-aperture image sensor simultaneously captures a two-dimensional (2D) image and three-dimensional (3D) depth map of scenes in high resolution. Depth is inferred using multiple, localized images of a focal plane, and without necessarily implementing an active illumination source and/or requiring complex camera calibration. Certain applications involve the use of a lens or lens arrangement to focus a scene to a particular focal plane, and other applications do not involve lenses. The sensor is readily formed using one or more manufacturing approaches such as lithographic definition with semiconductor processing, and is thus amenable to the manufacture of low cost, miniaturized imaging and/or vision systems.
Turning now to the figures,
Two points 140 and 142 of a scene are shown with exemplary light rays traced through the objective lens, focused upon the focal plane 130 to points 141 and 143 (respectively corresponding to points 140 and 142). The rays diverge beyond the focal plane and are sensed or otherwise detected at separate image sensors in the multi-aperture sensor arrangement 110.
Different sensors in the multi-aperture sensor arrangement, each having one or more pixels that receive light corresponding to a particular aperture, are responsive to light in the scene by generating light data. This data is processed, relative to the positions of the sensors and the optics (objective lens 120), to compute an image using different views of the common points (141, 143) in the focal plane 130.
The objective lens 120 effectively has no aperture from the perspective of the multi-aperture sensor arrangement 110. This facilitates a relatively complete description of the wavefront in the focal plane. The amount of depth information that can be extracted from image data relates to the total area of the objective lens that is scanned by the multi-aperture sensor arrangement 110 and can be accordingly set to suit particular applications.
An aperture device 230 is labeled by way of example and is used here for illustration, with the following discussion applicable to more (or all) of the aperture devices in the integrated sensor 200. The aperture device 230 includes a k×k array 232 of pixels and readout circuitry 234. Where appropriate, local optics are implemented for each aperture device, such as in the dielectric stack of an integrated circuit including the aperture devices and using refractive microlenses or diffractive gratings patterned in metal layers in the integrated circuit. Each aperture device is separated from adjacent aperture devices, which facilitates the implementation of the readout circuitry and local optics immediately adjacent to the pixel array.
In these contexts, the “k×k” pixel array refers to an array of pixels that may be set to a number of pixels to suit a particular application, such as a 2×2 array. Other embodiments use pixel arrays having more or fewer rows than columns (e.g., a 1×2 array), or pixels that are not in an array and/or sporadically arranged. The independent apertures with localized pixels facilitate aggressive pixel scaling, which is useful for achieving high depth resolution.
The aperture device 230 is coupled to the sequencer 210 via connectors 250 and 252, and is further connected to an analog-to-digital converter (ADC) 240 via column connector 242. The integrated sensor 200 includes a multitude of such aperture devices arranged in rows and columns, with dashed lines representing expansion or reduction of the array. Hence, the “n” rows and “m” columns may respectively include more or fewer rows and columns, relative to that shown in
In connection with certain embodiments, unlike a conventional imaging system where the lens focuses the image directly onto the image sensor, the image is focused above the sensor plane (e.g., at focal plane 130 in
In some embodiments, each aperture device is optically disjoint, or separated, from all other aperture devices. That is, each aperture device operates independently of the other aperture devices, and each aperture provides an independent signal representing light detected by the aperture. In some applications, each aperture device is separated from its neighbors by a distance that mitigates and/or prevents crosstalk between the sensors for light detected thereby. In some applications, each aperture device is separated by a physical structure such as a wall of stacked via and metal layers. In addition and in connection with various embodiments, the array of aperture devices is monolithic (e.g., formed on a common silicon chip). In other embodiments, the entire integrated sensor is monolithic.
The aperture device 205 includes a light-sensitive CCD array 260 of k×k pixels and a light-shielded CCD array of k×k storage cells. The pixels in the entire image sensor are set to integrate simultaneously via global control. Such global shuttering is useful to achieve highly accurate correspondence between apertures in extracting depth. After integration, the charge from each pixel array is shifted into its local frame buffer 270 and then read out through a floating diffusion node via a follower amplifier at 280. A correlated double sampling scheme is used for low temporal and fixed-pattern noise. Global readout is performed using hierarchical column lines that may be similar, for example, to the hierarchical bit/word lines used in low-power SRAM. Column-level ADCs digitize sensor data for fast readout and/or on-chip parallel processing.
In connection with various example embodiments, the depth of objects in a scene is obtained using the disparity between apertures (with the term apertures referring generally to an image sensor arrangement, such as those discussed in connection with
Beginning with
Marginal ray traces for the same point as seen from the two different apertures are shown in
Considering the parameters A, B, C, D and L shown in
When local optics for each sensor are implemented in a dielectric stack (e.g., as discussed above), the distance D0 is approximately equal to the dielectric stack height of the fabrication process, or the nominal distance to the secondary focal plane from the local optics. Thus, given the stack height D0, the focal length g is set during fabrication to meet the desired N0 value. For instance, one such application is implemented as follows. The focal length, f, is set to 10 mm, A0=1 m, D0=10 μm, and g=8 μm. These parameters yield a nominal magnification factor of N0=¼. This value is set to achieve a desired amount of overlap between aperture views.
The distance D is determined as a function of A by fixing the parameters to meet the nominal magnification factor N0. To characterize the depth of field, the deviation in D is found from the nominal position D0 where it is in best focus. Since the local optics collect light across the entire aperture, the focus is degraded with deviation in D. By the lens law,
1/f = 1/A + 1/B, and 1/g = 1/C + 1/D.
Using the magnification factors M and N, B and D are solved to obtain
B = (M + 1)f, and D = (N + 1)g, or D = (1/g − 1/C)⁻¹.
This establishes a relationship between A and D in terms of the magnification M. Consistent with the above expression for D, as the object moves to infinity, the total movement in the primary focal plane is M0f. The total movement in the secondary focal plane is further reduced from this value, which results in a relatively wide range of focus. For instance, the movement in the primary focal plane is 100 μm for an object distance of 1 m to infinity. This translates into a mere 1.5 μm deviation in D. The magnification factor N varies from ¼ to 1/16.
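The quoted figures can be checked numerically from the thin-lens relations above (a Python sketch only; it assumes the local optics sit a fixed distance B0 + C0 from the objective, consistent with the substitution B = B0 + C0 − C used below):

# Example parameters quoted above: f = 10 mm, A0 = 1 m, D0 = 10 um, g = 8 um.
f, A0 = 10e-3, 1.0            # objective focal length, nominal object distance (m)
D0, g = 10e-6, 8e-6           # nominal local image distance and local focal length (m)

B0 = 1.0 / (1.0 / f - 1.0 / A0)     # primary image distance at A0 (lens law)
M0 = B0 / f - 1.0                   # nominal primary magnification factor
N0 = D0 / g - 1.0                   # nominal local magnification factor
C0 = 1.0 / (1.0 / g - 1.0 / D0)     # nominal distance from local optics to primary focal plane

# As the object recedes to infinity, B falls from B0 to f, so C grows by M0*f
# and the secondary image distance D refocuses slightly closer to the local optics.
C_inf = C0 + M0 * f
D_inf = 1.0 / (1.0 / g - 1.0 / C_inf)
N_inf = D_inf / g - 1.0

print(round(N0, 3))                      # 0.25, i.e. N0 = 1/4
print(round(M0 * f * 1e6))               # ~100 um of travel in the primary focal plane
print(round((D0 - D_inf) * 1e6, 2))      # ~1.5 um deviation in D
print(round(N_inf, 3))                   # ~0.06, i.e. N falls to roughly 1/16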
Referring to
C/L = D0/Δ.
Using the lens law for A as a function of B and making the substitution B=E−C=B0+C0−C, we obtain
Solving for A in terms of Δ gives the depth equation
A characteristic of this sensor is that the amount of depth information available is a strong function of the object distance (the closer the object, the higher the depth resolution). This is quantified by solving for Δ in terms of M, which gives
As M decreases (i.e., as the object distance increases), Δ rapidly approaches its limit of D0L/(M0f+D0/N0).
The rate of change in Δ with A, i.e., δΔ/δA, can be computed as a function of δB/δA and δΔ/δC. Setting δC=−δB at the focal plane, it can be shown that
For example, with a 0.5 μm pixel pitch, the displacement between apertures can be estimated to within about 0.5 μm resolution. Further, assuming L/D=2, the incremental depth resolution δA is approximately 4 cm at A0=1 m and 4 mm at A0=10 cm. Decreasing pixel size allows for more accuracy in δΔ, leading to higher depth resolution.
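The ~4 cm figure at A0=1 m can be reproduced approximately from the similar-triangle relation above, again as an illustrative Python sketch: it assumes an aperture baseline of L = 2·D0 (one reading of L/D = 2) and the same example optics; the exact form of the omitted depth and sensitivity equations may differ.

# Disparity versus object distance, using Delta = D0*L/C with C = C0 + (B0 - B)
# from the substitution B = B0 + C0 - C. The baseline L = 2*D0 is an assumption.
f, A0 = 10e-3, 1.0
D0, g = 10e-6, 8e-6
L = 2.0 * D0
B0 = 1.0 / (1.0 / f - 1.0 / A0)
C0 = 1.0 / (1.0 / g - 1.0 / D0)

def disparity(A):
    B = 1.0 / (1.0 / f - 1.0 / A)      # primary image distance for an object at A
    C = C0 + (B0 - B)                  # focal-plane point rises as the object recedes
    return D0 * L / C                  # similar triangles: C/L = D0/Delta

change = disparity(A0) - disparity(A0 + 0.04)      # move the object 4 cm farther away
print(round(change * 1e6, 2))                      # ~0.45 um, about one 0.5 um pixel of disparity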
Spatial resolution, along with the related pixel size and sensor placement, is set to suit the particular application. In some applications, overlapping fields of view are established by setting the magnification factor of the local optics to N<1. With each pixel projected up to the focal plane by a factor of 1/N, spatial resolution is reduced by a factor of 1/N². Thus, the total available resolution is about mnk²N². Using a 16×16 array of 0.5 μm pixels with a magnification factor of N0=¼, the maximum resolution is 16 times greater than the aperture count itself, but 16 times lower than the total number of pixels.
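As a quick check of this bookkeeping (illustrative only; the aperture counts m and n cancel from both ratios):

# 16 x 16 pixels per aperture with local magnification N0 = 1/4, as above.
k, N0 = 16, 0.25
points_per_aperture = k * k * N0 * N0     # resolvable scene points contributed per aperture
print(points_per_aperture)                # 16.0 -> total resolution is 16x the aperture count m*n
print((k * k) / points_per_aperture)      # 16.0 -> and 16x below the raw pixel count m*n*k*k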
The actual spatial resolution is limited by optical aberrations and ultimately by diffraction. The minimum spot size W for a diffraction limited system is about λ/NA, where NA=ni sin θ is the numerical aperture of the local optics, ni is the index of refraction of the dielectric and θ is the angle between the chief and the marginal rays. Using the Rayleigh criterion, the minimum useful pixel pitch is commonly assumed to be half the spot size. Assuming ni≈1.5 in the dielectric stack, NA can be about 0.5, which gives a spot size of about 1 μm. Thus, scaling the pixel beyond 0.5 μm does not increase spatial resolution. Although no further increase in spatial resolution is feasible beyond the diffraction limit, depth resolution continues to improve as long as there are features with sufficiently low spatial frequencies. The disparity between apertures can be measured at smaller dimensions than set by the diffraction limit.
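Plugging in representative numbers (a sketch; it assumes green light at roughly 0.5 μm wavelength, which the discussion above does not state explicitly):

# Diffraction-limited spot size W ~ lambda/NA and the Rayleigh-criterion pixel pitch.
wavelength, NA = 0.5e-6, 0.5          # assumed wavelength; NA ~ 0.5 as quoted above
spot = wavelength / NA                # ~1.0e-6 m, matching the ~1 um spot size
print(spot * 1e6, spot * 1e6 / 2)     # 1.0 um spot, ~0.5 um minimum useful pixel pitch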
To individually address a row of FFT-CCDs, an RS signal 540 is applied in conjunction with a decoded ROW signal 542. MUX blocks (with block 550 labeled by way of example) contain column control, bias circuits, and support for external testing of the analog signal chain and single column analog readout. All ADCs share an output bus, which is controlled by the signal COL and buffered at the IO.
For certain applications, in order to achieve a high well capacity, the image is captured in 2 fields in the vertical direction. This allows for large barriers between pixels where charge is confined on every other electrode. A transfer from the vertical register to the horizontal register is performed with one charge packet at a time using ripple charge transfer. An STI region is used to create isolation between arrays and also serves as the area for contacts to the non-silicided electrodes.
At the end of the integration time, a FRAME TX event 730 moves (transfers) all charge from the active region into the buffered region controlled by electrodes V<35:18>. The buffered region is sampled 740 while a new image is integrated at 750 in the active region after the flush 760. Sampling begins with resetting the floating diffusion (FD) by applying RT globally. Each column of the image sensor is then sampled by the ADCs simultaneously before moving on to the next row. After all rows are sampled, a TX signal is applied and the process repeats.
The values for each pixel are used for correlated sampling. This sequence of events may, for example, eliminate the need to implement a row decoder for each of the electrodes in the frame buffer and horizontal CCD regions. The digital values latched in the ADC are read out (e.g., 770) one column at a time by scanning through COL values during the integration cycle.
In some embodiments, a vertical-to-horizontal transfer process such as described above is carried out using one or more of the following approaches. In one application, individual charge packets are transferred into an H-CCD (horizontal CCD) one at a time. This involves moving all other charge forwards and then backwards in a V-CCD (vertical CCD). In another application, all even column charge packets are moved into the H-CCD while the odd packets move backwards. The odd column charge packets are then moved into the H-CCD.
During the FLUSH phase, CCD pixel arrays are depleted of charge through VO by sequencing V. During integration, pixel array electrodes are held at an intermediate voltage of 1V, and at the end of integration, the accumulated charge packets in the CCD pixel arrays are transferred one row at a time to the frame buffers using ripple charge transfer. A 2V potential difference between electrodes is used to achieve complete transfer between stages.
Frame buffer readout is performed while an INTEGRATION cycle takes place after a FLUSH cycle. The readout sequence begins with a global reset of all FD nodes through an RT pulse. The reset voltages are then digitized by per-column ADCs one aperture row at a time, and are stored off-chip. Next, one charge packet from each frame buffer is shifted to its H-CCD, which is performed by initially shifting one row of charge to the V35 electrode.
One of the horizontal electrodes (H15 by way of example) is then set to a high voltage, which causes a partial charge transfer. Next, a vertical electrode (V34 by way of example) is brought to an intermediate voltage while another vertical electrode (V35 by way of example) is slowly brought to a lower voltage. The charge is transferred to H15 because the fringing field induced by H15 is larger than that induced by V34. This completes the transfer for the desired charge packet while all other charge is moved back under V34. The charge in the H-CCD is then ripple-shifted to H0 and onto the FD node while pulsing TX high. The pixel values on the FD nodes are digitized one row at a time by ADCs and stored off-chip where digital CDS is performed. This sequence is repeated until all stored pixel values for one field are read out. In some implementations, this readout approach is used to eliminate a need to implement a row decoder for each of the frame buffer and H-CCD electrodes.
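A minimal illustration of the off-chip digital CDS step follows (in Python; the array contents are made-up ADC codes standing in for the per-column ADC outputs):

import numpy as np

# Each pixel's floating diffusion is digitized twice: once at reset and once
# after charge transfer. Collected electrons lower the FD voltage, so
# reset minus signal is proportional to the collected charge, and the
# reset-level (fixed-pattern) offsets cancel.
reset_codes = np.array([[512, 515], [509, 511]], dtype=np.int32)
signal_codes = np.array([[388, 247], [501, 130]], dtype=np.int32)

cds = reset_codes - signal_codes
print(cds)      # [[124 268]
                #  [  8 381]]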
Various other example embodiments are applicable to implementation in connection with those described in Appendices A-E of the above-referenced provisional application, which form part of the provisional application, and which are fully incorporated herein by reference. Other example embodiments are applicable to implementation in connection with those described in Keith Fife, Abbas El Gamal, H.-S. Philip Wong, “A 0.5 μm Pixel Frame-Transfer CCD Image Sensor in 110 nm CMOS” (IEEE 2007); and in Keith Fife, Abbas El Gamal, H.-S. Philip Wong, “A 3MPixel Multi-Aperture Image Sensor with 0.7 μm Pixels in 0.11 μm CMOS” (IEEE International Electron Devices Meeting, pp. 1003-1006, December 2007); and further in ISSCC 2008, Session 2, Image Sensors & Technology, 2.3 (2008 IEEE International Solid-State Circuits Conference), all of which are fully incorporated herein by reference.
While the present invention has been described above and in the claims that follow, those skilled in the art will recognize that many changes may be made thereto without departing from the spirit and scope of the present invention. Such changes may include, for example, different arrangements of sensors, different spacing to facilitate selected sensor image overlap, different processing circuits and different optics. Other changes involve one or more aspects as described in the incorporated provisional patent application and the appendices that form part of the application. These and other approaches as described in the contemplated claims below characterize aspects of the present invention.
This patent document claims the benefit, under 35 U.S.C. § 119(e), of U.S. Provisional Patent Application Ser. No. 60/972,654, entitled “Light Sensor Arrangement” and filed on Sep. 14, 2007, which is fully incorporated herein by reference.