The present disclosure relates to light field imaging with transparent photodetectors.
The optical sensors in the vast majority of today's imaging devices are flat (two-dimensional) devices that record the intensity of the impinging light for three particular colors (red, green, and blue) at each pixel on the sensor. Because light is detected in only a single plane, all information about the direction of the light rays is lost. As a result, the recorded images are 2D projections of the actual object in real space, with a finite depth of field (i.e., only a limited region of the object space is actually in precise focus). The ultimate imaging system would produce a complete representation of the 3D scene, with an infinite depth of field. For any given wavelength, the light rays emanating from 3D objects in a scene contain five dimensions (5D) of information, namely the intensity at each location in space and the angular direction (θ, φ) of propagation. An imaging system will collect these light rays and propagate them to an optical sensor array. At any given plane in the system, the light distribution at a given wavelength (color) may be described via a 4D function, corresponding to the intensity of the light at each transverse position (x,y) and the direction of propagation described by angles (u,v). Such a 4D representation of the propagation through the imaging system is known as the light field; knowledge of the complete light field enables computational reconstruction of objects in the image space, for example digital refocusing to different focal planes, novel view rendering, depth estimation, and synthetic aperture photography. Indeed, the co-development of novel optical systems and computational photography is opening up exciting new frontiers in imaging science, well beyond the traditional camera and its biological inspiration, the eye.
Various schemes for light field imaging have been proposed and demonstrated. For example, one may employ an array of microlenses at the focal plane of the imaging lens, in conjunction with a 2D detector array, to obtain the angular information necessary to reconstruct the light field. The first prototype utilizing this approach was implemented in 2005, and imaging devices of this type are referred to as plenoptic cameras. This approach, however, has an inherent tradeoff of spatial resolution for angular resolution. Schemes incorporating programmable apertures, focal sweep cameras and other mask-based designs attempt to solve the low-resolution problem, but they either suffer from signal-to-noise limitations or require multiple images to be acquired and therefore are not suitable for recording dynamic scenes. Implementing a full-sensor-resolution, high SNR and real-time light field imaging system remains a challenging problem.
This section provides background information related to the present disclosure which is not necessarily prior art.
This section provides a general summary of the disclosure, and is not a comprehensive disclosure of its full scope or all of its features.
A light field imaging system with transparent photodetectors is presented. The light field imaging system includes: a stack of two or more detector planes, an imaging optic, and an image processor. Each of the two or more detector planes is arranged in a different geometric plane and the geometric planes are substantially parallel with each other. The detector planes include one or more transparent photodetectors, such that the transparent photodetectors have transparency greater than fifty percent (at one or more wavelengths) while simultaneously exhibiting responsivity greater than one amp per watt. The imaging optic is configured to receive light rays from a scene and refract the light rays towards the stack of two or more detector planes, such that the refracted light rays pass through the transparent detector planes and the refracted light rays are focused within the stack of detector planes. The image processor is in data communication with each of the photodetectors in the stack of two or more detector planes and operates to reconstruct a light field for the scene (at one or more wavelengths) using the light intensity distribution measured by each of the photodetectors.
In one embodiment, each detector plane includes an array of photodetectors and each photodetector in a given array of photodetectors aligns with a corresponding photodetector in each of the other arrays of photodetectors.
In one aspect, a method is provided for reconstructing a light field from data recorded by a light field imaging system. The method includes: determining a transformation matrix that relates a light field from a given scene to predicted light intensity distribution as would be measured by a stack of detector planes in the light field imaging system; measuring the light intensity distribution of light propagating from an unknown scene at each detector plane in the stack of detector planes; and reconstructing a light field for the unknown scene using the transformation matrix and the measured light intensity from the unknown scene.
Further areas of applicability will become apparent from the description provided herein. The description and specific examples in this summary are intended for purposes of illustration only and are not intended to limit the scope of the present disclosure.
The drawings described herein are for illustrative purposes only of selected embodiments and not all possible implementations, and are not intended to limit the scope of the present disclosure.
Corresponding reference numerals indicate corresponding parts throughout the several views of the drawings.
Example embodiments will now be described more fully with reference to the accompanying drawings.
The key technology at the center of the proposed light field imaging system is a transparent photodetector. Present imaging systems employ a single optical sensor (photodetector array), usually at the focus of a lens system. Being made of silicon or similar material, the sensor is opaque, resulting in the loss of directional information on the light ray. If one could make highly sensitive but nearly transparent sensors, then multiple sensor arrays could be stacked along the path of the light rays, enabling directional information to be retained, and thus enabling computational reconstruction of the light field from data recorded by a single exposure. Recent breakthroughs in optoelectronic materials enable this new imaging paradigm to be realized. In particular, the discovery of graphene has sparked interest in a whole class of atomic layer crystals including graphene, hexagonal boron nitride, molybdenum disulphide and other transition metal dichalcogenides (TMDCs). Unlike conventional semiconductors, their ultimate thinness (one or a few atomic layers) makes them nearly transparent across the electromagnetic spectrum. Despite this, the strong light-matter interaction in these 2D atomic crystals makes them sensitive light sensors at the same time. It may seem a paradox that a sensitive detector could also be nearly transparent, but this combination is a special property of these 2D electronic materials. For simplicity, the remaining description focuses on a single wavelength, but the extension of the teachings in this disclosure to color images is readily understood by those knowledgeable in the art.
The imaging optic 11 is configured to receive light rays from a scene and refract the light rays towards the stack of detector planes 12, such that the refracted light rays pass through the detector planes 12. The refracted light rays are focused within the stack of detector planes 12. In one embodiment, the imaging optic 11 focuses the refracted light rays onto one of the detector planes. In other embodiments, the imaging optic 11 may focus the refracted light rays in between two of the detector planes 12. In the example embodiment, the imaging optic 11 is implemented by an objective lens and the light is focused onto the final detector plane 13. Other types of imaging optics are contemplated including but not limited to camera lens, metalens, microscope lens and zoom lens.
The detector planes 12 include one or more transparent photodetectors. In an example embodiment, the transparent photodetectors include a light absorbing layer and a substrate, where the light absorbing layer is comprised of a two-dimensional material and the substrate is comprised of a transparent material. As a result, the transparent photodetectors have transparency greater than fifty percent (and preferably >85%) while simultaneously exhibiting responsivity greater than one amp per watt (and preferably >100 amps per watt). Example constructs for transparent photodetectors are further described below. Examples of other suitable photodetectors are described by Seunghyun Lee, Kyunghoon Lee, Chang-Hua Liu and Zhaohui Zhong in “Homogeneous bilayer graphene film based flexible transparent conductor” Nanoscale 4, 639 (2012); by Seunghyun Lee, Kyunghoon Lee, Chang-Hua Liu, Girish S. Kulkarni and Zhaohui Zhong, “Flexible and transparent all-graphene circuits for quaternary digital modulations” Nature Communications 3, 1018 (2012); and by Chang-Hua Liu, You-Chia Chang, Theodore B. Norris and Zhaohui Zhong, “Graphene photodetectors with ultra-broadband and high responsivity at room temperature” Nature Nanotechnology 9, 273-278 (2014). Each of these articles is incorporated in its entirety herein by reference.
In the example embodiment, each detector plane 12 includes an array of photodetectors. In some embodiments, each photodetector in a given array of photodetectors aligns with a corresponding photodetector in each of the other arrays of photodetectors. In other embodiments, photodetectors across different arrays do not necessarily align with each other. In any case, the light field imaging system records information related to the direction of propagation because rays are incident upon photodetectors across the stack of detector planes.
In operation, a bundle of light rays emitted from an object point is collected by the imaging optic 11. The imaging optic 11 refracts the light rays towards the stack of detector planes 12, such that the refracted light rays pass through at least one of the detector planes and are focused at some point within the stack of detector planes. Some of the light is absorbed by photodetectors in each of the intermediate detector planes. The sensors must absorb some of the light to obtain the intensity distribution in each (x,y) plane, but must pass sufficient light so that several detector planes can be positioned in front of the final detector plane 13.
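The trade-off between absorption and transmission can be illustrated with a short calculation. The sketch below is an idealization, not a device model: it assumes every detector plane transmits the same fraction `gamma` of the light incident on it and ignores reflection, scattering, and substrate losses. The function name `incident_fractions` and the sample values are hypothetical.

```python
def incident_fractions(gamma, n_planes):
    """Fraction of the input light that reaches each detector plane.

    Assumption (illustrative only): every plane transmits the same
    fraction `gamma` of the light incident on it, so plane d (counting
    from 1) receives gamma ** (d - 1) of the input.
    """
    return [gamma ** d for d in range(n_planes)]


# Hypothetical example: five planes, each 85% transparent.
fractions = incident_fractions(gamma=0.85, n_planes=5)
for d, frac in enumerate(fractions, start=1):
    print(f"plane {d}: {frac:.3f} of the input light")
```

Under these assumptions, roughly half of the input light still reaches the fifth plane at 85% transparency, whereas at 50% transparency only about 6% would.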
The image processor 14 is in data communication with each of the photodetectors in the stack of two or more detector planes and is configured to receive light intensity measured by each of the photodetectors. The image processor 14 in turn reconstructs a light field for the scene using the light intensity measured by each of the photodetectors. An example method for reconstructing the light field is further described below. The image processor may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored on a computer readable medium that can be accessed by the computer. Such a computer program may be stored in a tangible computer readable storage medium, such as, but not limited to, any type of disk including optical disks, CD-ROMs, magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, application specific integrated circuits (ASICs), or any type of media suitable for storing electronic instructions, each coupled to a computer system bus. Furthermore, the computers referred to in the specification may include a single processor or may be architectures employing multiple processor designs for increased computing capability.
To form a detector plane, the photodetectors are arranged in an array on a transparent substrate.
When the point source is very far from the imaging lens, the real image is completely out of focus on both of the front and the back graphene detector sheets. The point source is then moved towards the imaging lens with a linear stage. At some point, the real image of the point source will be perfectly focused on the front detector sheet while staying out of focus on the back detector sheet. Referring to
Next, a method is presented for reconstructing a light field from data recorded by the light field imaging system 10. Referring to
The reconstruction process then corresponds to an inversion of the forward model to determine the lightfield of an unknown object or scene. That is, the light intensity profile for an unknown scene is measured at 62 by the light field imaging system. The light field for the unknown scene can be reconstructed at 63 using the measured light intensity profile and the forward model. In an example embodiment, the reconstruction is cast in the form of a least-squares minimization problem. The key subtlety in the solution is that there is a dimensionality gap: the light field is a 4D entity, while the focal stack array produces only 3D data. The proposed method accounts for this dimensionality gap as explained below.
For simplicity, an analysis is presented in a 2D space (2D lightfield). Consider a linear scene corresponding to a 1D object, with a reflectance pattern
using plane-plane parameterization. The lightfield first travels a distance of z to a lens with focal length f and is then imaged onto a 1D sensor that is distance F behind the lens as seen in
Next, consider the forward model. Under the paraxial approximation, the effect of lightfield propagation and imaging lens refraction can both be modeled in the form of
where A is the 2×2 optical transfer matrix. By the serial application of optical transfer matrices, the lightfield on the sensor is:
A camera coordinate re-parameterization is performed, making the x-axis be on the sensor (spatial axis) and the u-axis be on the lens (angular axis). The resulting lightfield in the camera now becomes
The formulation can be directly generalized to the 3D space (4D lightfield) with planar scene object (2D):
where H is defined as above and this can be rewritten in the form of simpler notations:
where the bold characters
represent the 2D spatial and angular vector, respectively.
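The serial application of optical transfer matrices described above (travel a distance z, refract at a lens of focal length f, travel a distance F to the sensor) can be sketched with the standard paraxial ray-transfer matrices. The (position, angle) ray convention, the function names, and the sample distances are assumptions made for illustration; the parameterization used in the disclosure may differ.

```python
import numpy as np


def propagate(d):
    """Ray-transfer matrix for free-space propagation over distance d."""
    return np.array([[1.0, d], [0.0, 1.0]])


def thin_lens(f):
    """Ray-transfer matrix for a thin lens of focal length f."""
    return np.array([[1.0, 0.0], [-1.0 / f, 1.0]])


def scene_to_sensor(z, f, F):
    """Compose the matrices; the rightmost factor acts on the ray first."""
    return propagate(F) @ thin_lens(f) @ propagate(z)


# Hypothetical numbers (mm): object at 300, lens focal length 50.
z, f = 300.0, 50.0
F = 1.0 / (1.0 / f - 1.0 / z)   # sensor distance that satisfies the lens law
M = scene_to_sensor(z, f, F)
print(M)
```

When the sensor sits at the in-focus distance, the upper-right entry of `M` vanishes, so the position on the sensor no longer depends on the ray angle, and the upper-left entry equals the magnification −F/z. Moving the sensor away from that distance makes the upper-right entry nonzero, which is the shear that encodes depth in the lightfield.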
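The image formation $i(x)$ is the integration of the camera-parameterized sensor lightfield $\ell_{\mathrm{sensor}}^{\mathrm{cam}}(x, u)$ over the aperture plane:
$$i(x) = \int \ell_{\mathrm{sensor}}^{\mathrm{cam}}(x, u)\, du$$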
To track the computation, the lightfield is discretized as
where m and p are index vectors of dimension 2 that correspond to x and u, respectively, and the discrete image formation process now becomes
where g[m, p]=(s*t)(mΔx, pΔu),
s(x, u)=rectΔ
It is directly seen that each discrete focal stack image can be computed in linear closed form; hence the complete process of 3D focal stack formation from the 4D lightfield can be modeled by the linear operation $\mathbf{y} = A\boldsymbol{\ell} + \mathbf{n}$, where $A$ is the forward model, $\boldsymbol{\ell}$ is the discretized lightfield, $\mathbf{n}$ is the detection noise, and $\mathbf{y}$ is the resulting measured focal stack.
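A toy version of this linear model can be assembled explicitly for a 2D lightfield (one spatial axis, one angular axis). The sketch below is not the discretization kernel of the disclosure: it assumes that each detector plane sees the lightfield sheared by an integer number of samples per angular step and then summed over the angular axis. The function name `focal_stack_matrix` and the `shears` values are hypothetical.

```python
import numpy as np


def focal_stack_matrix(n_x, n_u, shears):
    """Assemble A so that stack = A @ lightfield.ravel().

    `lightfield` has shape (n_x, n_u). For detector d, angular sample p
    is shifted by shears[d] * (p - n_u // 2) spatial samples and the
    result is summed over p (samples shifted outside the array are dropped).
    """
    A = np.zeros((len(shears) * n_x, n_x * n_u))
    for d, k in enumerate(shears):
        for m in range(n_x):
            for p in range(n_u):
                src = m + k * (p - n_u // 2)
                if 0 <= src < n_x:
                    A[d * n_x + m, src * n_u + p] = 1.0
    return A


rng = np.random.default_rng(0)
n_x, n_u, shears = 16, 5, [-1, 0, 1]       # three detector planes
A = focal_stack_matrix(n_x, n_u, shears)
lightfield = rng.random((n_x, n_u))
noise = 0.01 * rng.standard_normal(A.shape[0])
stack = A @ lightfield.ravel() + noise     # measured focal stack
print(A.shape)                              # rows: measurements, columns: unknowns
```

With three planes and five angular samples, the matrix has fewer rows than columns, which is a small-scale picture of the dimensionality gap between the focal stack and the lightfield.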
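The problem of reconstructing the scene lightfield from the focal stack data can then be posed as the least-squares minimization problem $\hat{\boldsymbol{\ell}} = \arg\min_{\boldsymbol{\ell}} \|\mathbf{y} - A\boldsymbol{\ell}\|_2^2$, where $\mathbf{y}$ is the measured focal stack, $A$ is the forward model, and $\boldsymbol{\ell}$ is the lightfield. While reference is made to least-squares minimization, other optimization techniques for solving this problem are also contemplated by this disclosure, such as those that include regularization.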
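One common way to solve such a least-squares problem is to run the conjugate gradient iteration on the normal equations. The sketch below is a generic textbook routine written with NumPy, not the implementation used in the disclosure; the function name `cg_least_squares`, the random test matrix, and the stopping tolerance are assumptions for illustration.

```python
import numpy as np


def cg_least_squares(A, y, x0=None, n_iter=200, tol=1e-12):
    """Minimize ||y - A x||^2 by conjugate gradient on A^T A x = A^T y."""
    x = np.zeros(A.shape[1]) if x0 is None else np.array(x0, dtype=float)
    r = A.T @ (y - A @ x)          # residual of the normal equations
    p = r.copy()                   # first search direction
    rs = r @ r
    for _ in range(n_iter):
        if rs < tol:               # converged (also avoids dividing by zero)
            break
        Ap = A @ p
        step = rs / (Ap @ Ap)      # exact line search along p
        x += step * p
        r -= step * (A.T @ Ap)
        rs_new = r @ r
        p = r + (rs_new / rs) * p  # next conjugate direction
        rs = rs_new
    return x


# Toy, well-posed check (more measurements than unknowns, no noise).
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
ell_true = rng.standard_normal(10)
y = A @ ell_true
ell_hat = cg_least_squares(A, y)
print(np.linalg.norm(ell_hat - ell_true))
```

The toy system here has a unique minimizer. When the system has fewer measurements than unknowns, as with focal stack data, the same routine returns a minimizer that depends on the starting point `x0`.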
Two directions are explored for such a camera system. First, by investigating the forward model A, the system is evaluated quantitatively with respect to design parameters such as the number of detectors and their placement. Second, because the reconstruction is an ill-posed problem due to the dimensionality gap between 3D and 4D, a proper choice of regularization is required. A qualitative study of reconstructed epipolar images for some simple scenes shows that a four-dimensional total variation regularization can reduce the cross-epipolar interference, and therefore further decreases the reconstruction error $\|\boldsymbol{\ell}_{\mathrm{recon}} - \boldsymbol{\ell}_{\mathrm{true}}\|_2^2$ and improves the visual quality of the image rendered from the reconstructed lightfield.
While the spatial relationship between photographs and lightfields can be understood intuitively, analyzing in the Fourier domain presents an even simpler view of the process of photographic imaging—a photograph is simply a 2D slice of the 4D light field. The proposed light-field imaging system is therefore analyzed in the Fourier domain. This is stated formally in the following Fourier slice photography theorem based on Lambertian scene and full aperture assumptions; the measurement at the dth detector is given by
where $\mathcal{F}_{4D}\{\ell(x, y, u, v)\} = L(f_x, f_y, f_u, f_v)$ and $\mathcal{F}_{2D}\{\ell(x, y)\} = L(f_x, f_y)$ are the 4D and 2D Fourier transforms, respectively, and the slicing operator $S_d\{\cdot\}$ is defined by
$$S_d\{F\}(f_x, f_y) := F\big(\alpha_d f_x,\ \alpha_d f_y,\ (1-\alpha_d) f_x,\ (1-\alpha_d) f_y\big) \qquad (2)$$
Here $\{\alpha_d \beta : \alpha_d \in (0, 1],\ d = 1, \ldots, D\}$ denotes the set of distances between the lens and the $d$th detector, $\beta$ is the distance between the lens and the detector furthest from the lens (i.e., the $D$-th detector with $\alpha_D = 1$), $\gamma \in [0, 1]$ is the transparency of the light detectors, $f_x, f_y, f_u, f_v \in \mathbb{R}$ are frequencies, and $D$ is the number of detectors.
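The slicing operator of equation (2) maps a function of four frequency variables to a function of two. A minimal sketch follows, treating the 4D spectrum as a Python callable; the name `slice_operator` and the Gaussian test spectrum are hypothetical and serve only to show how the arguments are routed.

```python
import math


def slice_operator(F4, alpha):
    """Return the 2D function S_d{F}(fx, fy) for a 4D spectrum F4 and alpha_d."""
    def sliced(fx, fy):
        return F4(alpha * fx, alpha * fy, (1 - alpha) * fx, (1 - alpha) * fy)
    return sliced


def gaussian_spectrum(fx, fy, fu, fv):
    """Hypothetical 4D spectrum used only to exercise the operator."""
    return math.exp(-(fx ** 2 + fy ** 2 + fu ** 2 + fv ** 2))


# The last detector (alpha = 1) samples the plane fu = fv = 0.
last_plane = slice_operator(gaussian_spectrum, alpha=1.0)
mid_plane = slice_operator(gaussian_spectrum, alpha=0.9)
print(last_plane(0.5, 0.0), mid_plane(0.5, 0.0))
```

Each value of `alpha` selects a different 2D plane through the origin of the 4D frequency space, which is why additional detector planes contribute additional Fourier samples of the lightfield.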
As noted above, the reconstruction problem $\hat{\boldsymbol{\ell}} = \arg\min_{\boldsymbol{\ell}} \|\mathbf{y} - A\boldsymbol{\ell}\|_2^2$ (with $\mathbf{y}$ the measured focal stack and $\boldsymbol{\ell}$ the lightfield) is ill-posed, and a proper regularization is sought that helps further decrease the reconstruction error $\|\boldsymbol{\ell}_{\mathrm{recon}} - \boldsymbol{\ell}_{\mathrm{true}}\|_2^2$. With the proposed lightfield camera, now consider different positions of a planar scene object relative to the imaging system. One extreme case is that the object happens to be sharply imaged on one of the focal stack sheets (say the d-th detector). This is regarded as optimal detection, since the light from the object would disperse along the line in the frequency domain with slope $\alpha_d/(1-\alpha_d)$ and be completely sampled by the d-th detector. The lightfield can be reconstructed with high quality using standard least-squares minimization methods such as conjugate gradient (CG) descent even without any regularization. More commonly, in normal operation (regarded as the typical case), the frequency-domain light distribution will fall somewhere between the sampling lines. In this case, minimizing the cost function without regularization would create artifacts in the images rendered from the reconstructed lightfield.
Because of the known “dimensionality gap” of 4D light field reconstruction from focal-stack sensor data, the ordinary least-squares approach will typically have multiple solutions. In such cases, the output of the CG algorithm can depend on the value used to initialize the iteration. To improve the quality of the reconstructed light field and to help ensure a unique solution to the minimization problem, it is often preferable to include a regularization term and use a regularized least-squares minimization approach, also known as a Bayesian method. In regularized least squares, one estimates the light field by $\hat{\boldsymbol{\ell}} = \arg\min_{\boldsymbol{\ell}} \|\mathbf{y} - A\boldsymbol{\ell}\|_2^2 + R(\boldsymbol{\ell})$, where $\mathbf{y}$ is the measured focal stack, $A$ is the forward model, and $R$ denotes a regularization function. One simple choice of regularizer is a 4D total variation (TV) approach that sums the absolute differences between neighboring pixels in the light field, where the “neighbors” can be defined between pixels in the same x-y or u-v planes, and/or in the same epipolar images defined as 2D slices in the x-u or y-v planes.
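The 4D total variation term can be sketched in a few lines of NumPy. The version below is the anisotropic form (a plain sum of absolute finite differences along each of the four axes), which is one reading of the description above; the explicit `weight` on the regularizer and the function names are assumptions for illustration.

```python
import numpy as np


def total_variation_4d(lightfield):
    """Sum of absolute differences between neighbors along x, y, u and v."""
    return sum(np.abs(np.diff(lightfield, axis=ax)).sum() for ax in range(4))


def regularized_cost(A, y, lightfield, weight):
    """Data-fit term plus weighted TV; `weight` is a hypothetical tuning knob."""
    residual = y - A @ lightfield.ravel()
    return residual @ residual + weight * total_variation_4d(lightfield)


# A piecewise-constant lightfield has a small TV; noise increases it.
rng = np.random.default_rng(0)
flat = np.zeros((4, 4, 3, 3))
flat[2:] = 1.0                                   # single step along x
noisy = flat + 0.1 * rng.standard_normal(flat.shape)
print(total_variation_4d(flat), total_variation_4d(noisy))
```

Because the absolute value is not differentiable at zero, minimizing this cost in practice usually relies on a smoothed approximation or a proximal method rather than plain CG.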
TV regularization is based on the implicit model that the light field is piecewise smooth, meaning that its gradient (via finite differences) is approximately sparse. There are other, more sophisticated sparsity models that are also suitable for 4D light field reconstruction. For example, one can use known light fields (or patches thereof) as training data to learn the atoms of a dictionary $D$ for a sparse synthesis representation $\boldsymbol{\ell} = D\mathbf{z}$, where $\mathbf{z}$ denotes a vector of sparse coefficients. Alternatively, one could use training data to learn a sparsifying transform matrix $W$ such that $W\boldsymbol{\ell}$ denotes a sparse vector of transform coefficients. Both of these sparsifying methods can be implemented in an adaptive formulation where one learns $D$ or $W$ simultaneously with the reconstruction without requiring training data. Other sparsity models such as convolutional dictionaries can be used to define the regularizer $R$. Typically the effect of such regularizers is to reduce the number of degrees of freedom of the lightfield to enable 4D reconstruction from focal stack data.
The ability to reconstruct images with high fidelity will depend on having sufficient 3D data obtained from the focal stack; hence the image reconstruction quality will improve with a larger number of detector planes. This may be seen from the Fourier slice photography theorem, which suggests that the number of 4D Fourier samples of the light-field image x increases as the number of detectors D increases, yielding improvement of reconstruction quality with D. The optimal positions of the D detector planes are not necessarily equally spaced along the z axis. In fact, an unequal spacing can be optimal, as seen in the following calculation. A direct consequence of the Fourier slice photography theorem is that the dth focal stack sheet will radially sample the vx−vu and the vy−vv Fourier domains along a line with slope αd/(1−αd). Now consider a design of a lightfield camera of size β (distance between the last focal stack sheet and the imaging lens) for a working range between w1 and w2 (w1>w2) from the imaging lens. Point sources at w1,2 will be sharply imaged at αw
These confine a sector between θw
Without prior information about the scene (and therefore its 4D Fourier spectrum), one can arrange the N focal stack sheet locations such that they correspond to the radial sampling lines that have equal angular spacing by
within the sector (confined by the designed working range) as seen in
As illustrated in
1. Numerically solve β from the constraint of
2. Calculate θw
3. Determine the position of the d-th focal stack sheet by
As a numerical example, if the lightfield camera is expected to work in the range of 30 cm to 3 m, the imaging lens has a focal length of 50 mm, and the stack has 5 transparent detectors, then the detector planes would be placed at [51.6, 53.3, 55.0, 56.9, 58.9] (mm) according to the above design guidelines. Note that the detector planes are unequally spaced along the optical axis. Nevertheless, this spacing will provide optimal sampling of the light field for a fixed number of detector planes and the working range (scene depth) in this design example. This numerical example is presented to describe one particular embodiment that has proven effective and should be viewed as illustrating, rather than limiting, the present disclosure.
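The placement in this example can be approximated with a short script. The sketch below is one reading of the design guidelines and rests on several assumptions that are not spelled out in the surviving text: the thin-lens equation gives the image distance for each end of the working range; the sampling line for a plane at distance αβ makes the angle atan2(α, 1−α) in the Fourier domain; the sector between the two extreme lines is divided into as many equal angular steps as there are planes, with each plane at the center of its step; and β is solved numerically so that the last plane (α = 1) lands on the final step center. All function names are hypothetical.

```python
import math


def image_distance(obj_dist, focal):
    """Thin-lens image distance (assumed model): 1/s = 1/f - 1/w."""
    return 1.0 / (1.0 / focal - 1.0 / obj_dist)


def line_angle(s, beta):
    """Angle of the sampling line with slope alpha/(1 - alpha), alpha = s/beta."""
    return math.atan2(s, beta - s)


def design_stack(w_far, w_near, focal, n_det):
    """Return (beta, positions) for n_det planes; all lengths in one unit."""
    s_far = image_distance(w_far, focal)
    s_near = image_distance(w_near, focal)

    def sector(beta):
        t_far = line_angle(s_far, beta)
        t_near = line_angle(s_near, beta)
        return t_far, (t_near - t_far) / n_det   # start angle, angular step

    def residual(beta):
        # The last plane (alpha = 1, angle pi/2) sits half a step inside the sector.
        t_far, step = sector(beta)
        return t_far + (n_det - 0.5) * step - math.pi / 2

    lo, hi = s_far, s_near          # residual is positive at lo, negative at hi
    for _ in range(100):            # bisection for beta
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)

    t_far, step = sector(beta)
    positions = []
    for d in range(1, n_det + 1):
        theta = t_far + (d - 0.5) * step
        alpha = math.sin(theta) / (math.sin(theta) + math.cos(theta))
        positions.append(alpha * beta)
    return beta, positions


# Working range 3 m to 30 cm, 50 mm lens, 5 planes (all lengths in mm).
beta, positions = design_stack(w_far=3000.0, w_near=300.0, focal=50.0, n_det=5)
print([round(z, 2) for z in positions])
```

Under these assumptions the computed positions agree with the values quoted above to within about 0.1 mm, and the spacing between adjacent planes grows toward the back of the stack.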
An example test of the normal case consists of two patterned disks at different depths, and the front disk transversely occludes part of the other. The lightfield camera is designed in accordance with the numerical example set forth above, and the two scene disks are placed accordingly as discussed below. Since lightfield is 4D data, it is challenging to visualize the data as a whole. One way to look at the lightfield data is through epipolar images, which are 2D slices of the 4D lightfield on either x−u or y−v directions. Due to the Lambertian/near-Lambertian nature of the scene, epipolar images usually consist of linear stripes with different slopes, in which the depth information is coded. First, one can minimize the cost function using conjugate gradient descent without regularization and then inspect the reconstructed epipolar images. The resulting images appear to be blurry, as can be seen in
It is interesting to see how the 4D TV works with another extreme case, which can be considered as the baseline case, and can be generated as the following: place the planar object at wb, which is sharply imaged at βαb such that the spectral line with slope αb/(1−αb) would lie in the middle of the two spectral sampling lines imposed by the d-th and (d+1)-th detectors, i.e., θw
Some portions of the above description present the techniques described herein in terms of algorithms and symbolic representations of operations on information. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. These operations, while described functionally or logically, are understood to be implemented by computer programs. Furthermore, it has also proven convenient at times to refer to these arrangements of operations as modules or by functional names, without loss of generality.
Unless specifically stated otherwise as apparent from the above discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system memories or registers or other such information storage, transmission or display devices.
Certain aspects of the described techniques include process steps and instructions described herein in the form of an algorithm. It should be noted that the described process steps and instructions could be embodied in software, firmware or hardware, and when embodied in software, could be downloaded to reside on and be operated from different platforms used by real time network operating systems.
The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting. As used herein, the singular forms “a,” “an,” and “the” may be intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms “comprises,” “comprising,” “including,” and “having,” are inclusive and therefore specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The method steps, processes, and operations described herein are not to be construed as necessarily requiring their performance in the particular order discussed or illustrated, unless specifically identified as an order of performance. It is also to be understood that additional or alternative steps may be employed.
When an element or layer is referred to as being “on,” “engaged to,” “connected to,” or “coupled to” another element or layer, it may be directly on, engaged, connected or coupled to the other element or layer, or intervening elements or layers may be present. In contrast, when an element is referred to as being “directly on,” “directly engaged to,” “directly connected to,” or “directly coupled to” another element or layer, there may be no intervening elements or layers present. Other words used to describe the relationship between elements should be interpreted in a like fashion (e.g., “between” versus “directly between,” “adjacent” versus “directly adjacent,” etc.). As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.
Although the terms first, second, third, etc. may be used herein to describe various elements, components, regions, layers and/or sections, these elements, components, regions, layers and/or sections should not be limited by these terms. These terms may be only used to distinguish one element, component, region, layer or section from another region, layer or section. Terms such as “first,” “second,” and other numerical terms when used herein do not imply a sequence or order unless clearly indicated by the context. Thus, a first element, component, region, layer or section discussed below could be termed a second element, component, region, layer or section without departing from the teachings of the example embodiments.
Spatially relative terms, such as “inner,” “outer,” “beneath,” “below,” “lower,” “above,” “upper,” and the like, may be used herein for ease of description to describe one element or feature's relationship to another element(s) or feature(s) as illustrated in the figures. Spatially relative terms may be intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures. For example, if the device in the figures is turned over, elements described as “below” or “beneath” other elements or features would then be oriented “above” the other elements or features. Thus, the example term “below” can encompass both an orientation of above and below. The device may be otherwise oriented (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein interpreted accordingly.
The foregoing description of the embodiments has been provided for purposes of illustration and description. It is not intended to be exhaustive or to limit the disclosure. Individual elements or features of a particular embodiment are generally not limited to that particular embodiment, but, where applicable, are interchangeable and can be used in a selected embodiment, even if not specifically shown or described. The same may also be varied in many ways. Such variations are not to be regarded as a departure from the disclosure, and all such modifications are intended to be included within the scope of the disclosure.
This application claims the benefit of U.S. Provisional Application No. 62/294,386 filed on Feb. 12, 2016. The entire disclosure of the above application is incorporated herein by reference.
This invention was made with government support under Grant Nos. ECCS0125446 and DMR1120923 awarded by the U.S. National Science Foundation. The Government has certain rights in this invention.
Number | Date | Country
---|---|---
62294386 | Feb 2016 | US