The following relates to systems and methods for solving inverse imaging problems.
Over the past few decades, camera optics have become increasingly complex. For example, the lenses of modern single lens reflex (SLR) cameras may contain a dozen or more individual lens elements, which are used to optimize light efficiency of the optical system while minimizing aberrations, i.e., non-linear deviations from an idealized thin lens model.
Optical aberrations include effects such as geometric distortions, chromatic aberration (wavelength-dependent focal length), spherical aberration (where focal length depends on the distance from the optical axis), and coma (angular dependence on focus). Single lens elements with spherical surfaces typically suffer from these artifacts, and as a result may not be used in high-resolution, high-quality photography. Instead, modern optical systems often feature a combination of different lens elements with the intent of canceling out aberrations. For example, an achromatic doublet is a compound lens made from two glass types of different dispersion, i.e., their refractive indices depend on the wavelength of light differently. The result is a lens that is (in the first order) compensated for chromatic aberration, while still suffering from the other artifacts mentioned above.
Despite typically having better geometric imaging properties, modern lens designs are often not without disadvantages, including a significant impact on the cost and weight of camera objectives, as well as increased lens flare.
That is, modern imaging optics can be highly complex systems with up to two dozen individual optical elements. This complexity is normally required in order to compensate for the geometric and chromatic aberrations of a single lens, including geometric distortion, field curvature, wavelength-dependent blur, and color fringing.
There is provided a method for solving inverse imaging problems to compensate for distortions in an image, the method comprising: minimizing a cost objective function containing a data fitting term and one or more image prior terms to each of the plurality of channels, the one or more image prior terms comprising cross-channel information for a plurality of channels derived from the image.
There is also provided a computer readable medium comprising computer executable instructions for solving inverse imaging problems to compensate for distortions in an image, comprising computer executable instructions for: minimizing a cost objective function containing a data fitting term and one or more image prior terms to each of the plurality of channels, the one or more image prior terms comprising cross-channel information for a plurality of channels derived from the image.
There is also provided an electronic device comprising a processor and memory, the memory comprising computer executable instructions for solving inverse imaging problems to compensate for distortions in an image, comprising computer executable instructions for: minimizing a cost objective function containing a data fitting term and one or more image prior terms to each of the plurality of channels, the one or more image prior terms comprising cross-channel information for a plurality of channels derived from the image.
Embodiments will now be described by way of example with reference to the appended drawings wherein:
The following describes a system and method utilizing computational manipulations of images captured from a camera to remove artifacts and allow for post-capture correction of the images captured through uncompensated, simple optics which are lighter and typically less expensive.
It has been found that while the focusing of different colors through a lens typically results in focusing at different locations, a particular color channel typically focuses in the correct location and has commonality with at least one other color channel. By sharing information between color channels, images can be improved using computational optics, even using simpler lens designs.
In one example, the system can estimate per-channel, spatially-varying point spread functions, and solve non-blind inverse imaging problems (e.g., de-convolution) with a cross-channel analysis. The system may also be configured to jointly apply inverse imaging while considering image information of other channels when modifying only one of the colour channels. The method of jointly solving inverse imaging problems can be advantageous as it is designed to specifically eliminate color fringing. It can be appreciated that for the purposes of illustrating the principles herein, the terms “de-convolution”, “inverse imaging”, and “solving inverse imaging problems” may be used interchangeably and that other inverse imaging problems (other than de-convolution) are applicable. It can also be appreciated that while the examples described herein refer to inverse imaging of images, the principles equally apply to frames of a video, which may also be considered “images” for this purpose.
There is also provided a computational approach for post correction of images captured from a camera, the images being corrected for aberrations occurring for example during the image capturing process due to limitations from the camera or lens. Accordingly, the proposed system and method presents an alternative approach to enabling high-quality photography. That is, instead of ever more complex optics, the system and method allows computational and automatic or semi-automatic (e.g. with some user input) correction of aberrations thereby allowing the capturing of high quality images, even with poorly performing lenses.
The methods and systems described herein, which exploit cross-channel information, can be configured to be more robust than existing correction methods. The systems and methods according to one example are able to handle much larger and denser blur kernels, such as disk-shaped kernels with diameters of 50 pixels or more, which occur frequently in uncorrected optics unless the apertures are stopped down to an impractical size. The present system and method may also include a convex solver with convergence properties that can minimize the resulting cross-channel inverse imaging problem (e.g. utilization of channel information from other channels while adjusting one channel).
In one aspect, the components used enable the use of simple lens designs for high-quality photography, for example: a new cross-channel prior for color images, a modification of the cross-channel prior for dark image regions, an efficient determination of the cross-channel prior in an inverse imaging framework and a convex solver for converging to a global optimum, a robust approach for per-channel spatially-varying PSF estimation using a total variation (TV) prior based on the same optimization framework, noise patches in a grid with padding for spatially varying PSF estimation, and the extension and application of the cross-channel prior to multi-spectral images.
Turning now to the figures,
The electronic device 10 may include or otherwise have access to image data 18, e.g., a memory card or other data storage mechanism, which can store captured or received images prior to processing, as well as processed images prior to displaying or sending. The image correction module 12 is configured to receive or obtain a captured image 14 and generate a corrected image 20. The corrected image 20 can be sent to another device, e.g., via a communication interface 22 such as a physical or wireless connection, or can be displayed on a display 24 when the electronic device 10 has such capabilities as shown in dashed lines in
The imaging device 16, 16′ may be a camera having a single lens element which suffers from various artifacts. However, it can be appreciated that the principles discussed herein may equally apply to other lens and camera configurations that result in aberrations.
The image correction module 12 includes or otherwise has access to programmed instructions or modules for performing a joint inverse imaging of image channels. Such a joint inverse imaging functionality may comprise computer executable instructions that are configured to obtain a digital image 14 captured from the imaging device 16 or stored in the image data 18. Further information can be stored on the electronic device 10 and be made accessible to the image correction module 12 (and inverse imaging functionality) such as the type of camera used, lens characteristics, and sensor characteristics, to name a few.
Turning now to
Turning now to
As illustrated in
As shown in
Once the iterative process is repeated for all channels, the restored channel information from all channels is therefore recomposed to output a reconstructed image having minimized distortions. For example, the resultant corrected image 20 may be sent to an external device for displaying, or be displayed directly by the electronic device on the display 24. The joint inverse imaging functionality is advantageously based upon the minimization values discussed further below, with respect to the image prior and cross-channel prior terms. That is, the image prior and cross-channel prior terms can be pre-stored or be otherwise accessible to the electronic device 10. Typically, the image prior information 114 provides a representation that captures prior knowledge about image statistics and patterns. The image prior information 114 can also include camera specific information regarding, for example, types of aberrations and location of aberrations that are expected with a specific camera or associated sensors.
Making reference now to
In one example, to solve both the inverse imaging problem and the PSF estimation problem for working with simple lenses, optimization methods may be derived based on the optimal first-order primal-dual framework (e.g., Chambolle, A. and Pock, T. 2011, “A first-order primal-dual algorithm for convex problems with applications to imaging,” J. Math. Imaging Vis. 40, 120-145).
The optimization framework considers general problems of the form:

$$x_{\mathrm{opt}} = \operatorname*{argmin}_{x} \; F(K(x)) + G(x) \qquad (1)$$
For example, in an inverse problem with a TV regularizer, the first term in Eq. (1) is the regularizer (that is K(x)=∇x, F(y)=∥y∥1 for TV), while the second term is the data fitting term (some residual norm).
Let X, Y be finite-dimensional real vector spaces for the primal and dual space, respectively. The operators from Eq. (1) are then formally defined as:
K: X→Y is a linear operator from X to Y
G: X→[0, +∞) is a proper, convex, (l.s.c.) function.
F: Y→[0, +∞) is a proper, convex, (l.s.c.) function.
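where l.s.c. stands for lower-semicontinuous. The dual problem of Eq. (1) is given as

$$y_{\mathrm{opt}} = \operatorname*{argmax}_{y} \; -\left( G^*(-K^* y) + F^*(y) \right) \qquad (2)$$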
where * denotes the convex conjugate. To solve the above (primal and dual) problem, the following algorithm is proposed by Chambolle and Pock (2011):
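Algorithm 1
Choose initial iterates $(x_0, y_0) \in X \times Y$, step sizes $\tau, \sigma > 0$ and $\theta \in [0, 1]$, and set $\bar{x}_0 = x_0$. For $n \geq 0$, iterate:

$$\begin{aligned} y_{n+1} &= \mathrm{prox}_{\sigma F^*}(y_n + \sigma K \bar{x}_n) \\ x_{n+1} &= \mathrm{prox}_{\tau G}(x_n - \tau K^* y_{n+1}) \\ \bar{x}_{n+1} &= x_{n+1} + \theta\,(x_{n+1} - x_n) \end{aligned}$$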
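The resolvent or proximal operator with respect to G is defined as:

$$\mathrm{prox}_{\tau G}(\tilde{x}) := (\mathbb{I} + \tau \partial G)^{-1}(\tilde{x}) = \operatorname*{argmin}_{x} \; \frac{1}{2\tau}\|x - \tilde{x}\|_2^2 + G(x) \qquad (4)$$

and analogously for $\mathrm{prox}_{\sigma F^*} := (\mathbb{I} + \sigma \partial F^*)^{-1}$, where $\mathbb{I}$ denotes the identity. The parameter L, which is necessary to compute valid τ, σ, is defined as the operator norm $L = \|K\|_2$.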
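As an illustration only, the primal-dual iteration of Chambolle and Pock can be sketched in Python for a toy one-dimensional TV problem with K = λ∇, F(y) = ‖y‖₁ and G(x) = ‖x − j‖₂² (that is, without any blur). The function names, the test signal, the weight `lam` and the step sizes are assumptions chosen for this example and are not taken from the description above.

```python
import numpy as np

def grad(x):
    # Forward differences; the last entry is left at zero.
    g = np.zeros_like(x)
    g[:-1] = x[1:] - x[:-1]
    return g

def grad_T(y):
    # Adjoint (transpose) of grad.
    d = np.zeros_like(y)
    d[:-1] -= y[:-1]
    d[1:] += y[:-1]
    return d

def energy(x, j, lam):
    # lam * ||grad x||_1 + ||x - j||_2^2
    return lam * np.abs(grad(x)).sum() + np.sum((x - j) ** 2)

def primal_dual_tv(j, lam=0.5, n_iter=300, theta=1.0):
    L = 2.0 * lam                  # upper bound on ||K|| for K = lam * grad
    sigma = 1.0
    tau = 0.9 / (sigma * L ** 2)   # guarantees tau * sigma * L^2 < 1
    x, x_bar, y = j.copy(), j.copy(), np.zeros_like(j)
    for _ in range(n_iter):
        # Dual step: prox of sigma*F^* is a projection onto [-1, 1].
        y = np.clip(y + sigma * lam * grad(x_bar), -1.0, 1.0)
        # Primal step: closed-form prox of tau*G for G(x) = ||x - j||^2.
        v = x - tau * lam * grad_T(y)
        x_new = (v + 2.0 * tau * j) / (1.0 + 2.0 * tau)
        # Over-relaxation.
        x_bar = x_new + theta * (x_new - x)
        x = x_new
    return x
```

For a noisy step signal `j`, `primal_dual_tv(j)` returns a signal with lower energy and lower total variation than `j` itself.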
Now addressed is a specific and efficient inverse imaging method based on the general optimization framework. There is provided an image formation model, and a cross-channel prior for de-convolving multichannel images. In one aspect, there is provided an optimization method for inverse imaging with this term, and casting it into the framework from the previous section. In one aspect, the method further comprises a process for dealing with dark image regions. The method allows integration into an efficient scale-space inverse imaging method.
Consider a grayscale image tile captured with n×m resolution. Let $J, I, N \in \mathbb{R}^{n \times m}$ be the observed captured image, the underlying sharp image and additive image noise, respectively. The formation of the blurred observation with the blur kernel B can then be formulated as:

$$J = B \otimes I + N \qquad (6)$$

$$\mathbf{j} = \mathbf{B}\mathbf{i} + \mathbf{n} \qquad (7)$$
In the second form, B, j, i and n are the corresponding quantities in matrix-vector form. As mentioned above, an actual digital image 14 will be composed of many tiles 60, each with a PSF that is assumed constant over the tile 60.
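The equivalence of the two forms can be checked numerically on a small tile. The following is a minimal sketch, assuming circular boundary conditions (an assumption made here so that the convolution can be evaluated with the FFT); the helper names `blur_fft`, `blur_matrix` and `observe` are hypothetical.

```python
import numpy as np

def blur_fft(I, B):
    """Circular convolution of tile I with kernel B (kernel centered at the origin)."""
    Bp = np.zeros_like(I)
    kh, kw = B.shape
    Bp[:kh, :kw] = B
    Bp = np.roll(Bp, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(I) * np.fft.fft2(Bp)))

def blur_matrix(B, shape):
    """Dense (nm x nm) matrix whose product with vec(I) equals vec(B conv I)."""
    n, m = shape
    cols = []
    for k in range(n * m):
        e = np.zeros(n * m)
        e[k] = 1.0                      # k-th unit image
        cols.append(blur_fft(e.reshape(n, m), B).ravel())
    return np.stack(cols, axis=1)

def observe(I, B, noise_sigma, rng):
    """Simulated observation: blurred tile plus additive Gaussian noise."""
    return blur_fft(I, B) + noise_sigma * rng.standard_normal(I.shape)
```

For an energy-preserving kernel (entries summing to 1), every row of the resulting matrix also sums to 1. The dense matrix is only practical for tiny tiles and is built here purely to illustrate the matrix-vector form.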
2.2 Inverse Imaging with Cross-Channel Information
Real optical systems typically suffer from dispersion in the lens elements, leading to a wavelength dependency of the PSF known as chromatic aberration. While complex modern lens assemblies are designed to minimize these artifacts through the use of multiple lens elements that compensate for each other's aberrations, it may be noted that even very good lenses still have a residual amount of chromatic aberration. For the simple lenses considered herein, the chromatic aberrations can be very severe—i.e. where one color channel is focused significantly better (although never perfectly in focus) than the other channels, which are blurred beyond recognition (excluding achromatic lenses which compensate for chromatic aberrations).
Given individual PSFs B_{1…3} for each color channel J_{1…3} of an image J, attempting to independently de-convolve each color channel does not in general produce acceptable results, since frequencies in some of the channels may be distorted beyond recovery, as shown in FIG. 6.
The technique described herein allows for the sharing of information (e.g. effect of aberrations, amount of distortion, location of distortion, type of distortion, and channel of distortion) between the inverse imaging processes of the different channels, so that frequencies preserved in one channel can be used to help the reconstruction of the captured image in another channel. In one example, the cross-channel prior is based on the assumption that edges in the image appear in the same place in all channels, and that hue changes are sparse throughout the image (see also
These assumptions lead to the following prior for a pair of channels:
$$\nabla i_k / i_k \approx \nabla i_l / i_l \;\;\Leftrightarrow\;\; \nabla i_k \cdot i_l \approx \nabla i_l \cdot i_k \qquad (8)$$
which is enforced in a sparse (l1) fashion. It may be noted that the division and multiplication /, · are pixel-wise operators in this example.
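The behavior of this prior can be illustrated with a short Python sketch that evaluates the ℓ1 norm of ∇i_k · i_l − ∇i_l · i_k for a pair of channels. The periodic forward differences and the function names are assumptions made for the example.

```python
import numpy as np

def grad_xy(img):
    # Forward differences with periodic boundaries (horizontal, vertical).
    gx = np.roll(img, -1, axis=1) - img
    gy = np.roll(img, -1, axis=0) - img
    return gx, gy

def cross_channel_residual(i_k, i_l):
    """l1 norm of grad(i_k)*i_l - grad(i_l)*i_k, summed over both directions.

    Products are pixel-wise, as in Eq. (8).
    """
    total = 0.0
    for g_k, g_l in zip(grad_xy(i_k), grad_xy(i_l)):
        total += np.abs(g_k * i_l - g_l * i_k).sum()
    return total
```

When one channel is a scaled copy of the other (same edges, constant hue), the residual vanishes up to rounding; when the edges are displaced between channels, it becomes large.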
Using the cross-channel prior the problem of jointly solving the inverse imaging problem for all channels can be formulated as the optimization problem:
where the first term is a standard least-squares data fitting term, and the second term enforces a heavy-tailed distribution for both gradients and curvatures. The convolution matrices $H_{\{1,2\}}$ implement the first derivatives, while $H_{\{3 \ldots 5\}}$ correspond to the second derivatives. An ℓ1 norm is employed in this method rather than a fractional norm, which ensures that the problem is convex. The last term of Eq. (9) implements the cross-channel prior, again with an ℓ1 norm. $\lambda_c, \beta_{cl} \in \mathbb{R}$ with $c, l \in \{1 \ldots 3\}$ are weights for the image prior and cross-channel prior terms, respectively.
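For orientation, the three terms described above can be assembled into an objective of the following form. This is a reconstruction from the prose description and from Eq. (8), not the original typeset equation; in particular, the index ranges (five derivative filters in the image prior, the two first-derivative filters in the cross-channel prior) are assumptions inferred from the surrounding text.

```latex
(i_{1 \ldots 3})_{\mathrm{opt}} = \operatorname*{argmin}_{i_{1 \ldots 3}}
  \sum_{c=1}^{3} \Big[ \, \| B_c i_c - j_c \|_2^2
  + \lambda_c \sum_{a=1}^{5} \| H_a i_c \|_1
  + \sum_{l \neq c} \beta_{cl} \sum_{a=1}^{2}
      \| H_a i_c \cdot i_l - H_a i_l \cdot i_c \|_1 \Big]
```

Under this reading, each channel contributes 5 + 2(3 − 1) = 9 ℓ1 terms.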
The minimization from Eq. (9) can be implemented by alternately minimizing with respect to one channel while fixing all the other channels. To optimize for this single channel x=ic a first-order primal-dual algorithm is derived adopting the framework described in Sec. 1.0.
First, the optimization (which may be used in the process shown in
where D denotes the diagonal matrix with the diagonal taken from the subscript vector. S is a matrix consisting of the sequence of all t=5+2(3−1) matrices coming from the ℓ1 minimization terms in Eq. (9). By comparison with Eq. (1), the following can now be defined:

$$K(x) = Sx$$

$$F(y) = \|y\|_1$$

$$G(x) = \|B_c x - j_c\|_2^2 \qquad (11)$$
Given this structure, the following resolvent operators used to apply Algorithm 1 are:
where $\mathcal{F}(\cdot)$ in the last line denotes the Fourier transform and $\tilde{x}$, $\tilde{y}$ are the function variables of the proximal operators as defined in Eq. (4). The first proximal operator is a per-pixel projection operator. The second proximal operator is the solution to a linear system as shown in the second line. Since the system matrix is composed of convolution matrices with a large support, this linear system can be efficiently solved in the Fourier domain (last line).
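A minimal Python sketch of the two proximal operators described above follows. It assumes circular boundary conditions so that the blur is diagonalized by the FFT, and it uses the closed form that results from applying the definition of the proximal operator to G(x) = ‖B_c x − j_c‖₂², namely the linear system (2τ BᵀB + 𝕀) x = 2τ Bᵀ j + x̃. The function names are hypothetical.

```python
import numpy as np

def otf(B, shape):
    """Fourier transform of kernel B, zero-padded to `shape` and centered."""
    Bp = np.zeros(shape)
    kh, kw = B.shape
    Bp[:kh, :kw] = B
    Bp = np.roll(Bp, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(Bp)

def prox_F_conj(y_tilde):
    """Prox of sigma*F^* for F = ||.||_1: per-pixel projection onto [-1, 1]."""
    return y_tilde / np.maximum(1.0, np.abs(y_tilde))

def prox_G(x_tilde, j, B_hat, tau):
    """Prox of tau*G for G(x) = ||B x - j||_2^2, solved in the Fourier domain."""
    num = 2.0 * tau * np.conj(B_hat) * np.fft.fft2(j) + np.fft.fft2(x_tilde)
    den = 2.0 * tau * np.abs(B_hat) ** 2 + 1.0
    return np.real(np.fft.ifft2(num / den))
```

The division in `prox_G` is always well defined because the denominator is at least 1.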
The adjoint K* of the linear operator K is then given as follows:

$$K^* = S^T = [\, S_1^T \;\cdots\; S_t^T \,] \qquad (14)$$

where t is the number of matrices S is composed of. In summary, the matrix-vector multiplication K*(y) in Algorithm 1 can be expressed as the following sum:

$$K^*(y) = S^T y = \sum_{k=1}^{t} S_k^T \, y_{[(k-1) \cdot nm, \, \ldots, \, k \cdot nm - 1]} \qquad (15)$$

where each $S_k^T$ is just a sum of filtering operations and point-wise multiplications. Likewise, the resolvent operators given above can be implemented using small filtering operators or the FFT for larger filters.
Parameter Selection
Algorithm 1 converges to the global optimum of the convex functional if θ=1, $\tau \sigma L^2 < 1$ with τ,σ>0 and $L = \|K\|_2$.
θ=1,
are used for the inverse imaging method described here. Next, consider how to compute the operator norm L. Since K(x)=Sx where S is a matrix, $\|K\|_2$ is the square root of the largest eigenvalue of the symmetric matrix $S^T S$. The value L can be found by using power iteration, where again all matrix-vector multiplications with $S^T S$ can be decomposed into filtering operations.
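The power iteration can be sketched in Python with the operator supplied as a pair of functions, so that no matrix is ever formed. As a stand-in for S, the example uses a periodic two-dimensional gradient; this operator and the function names are assumptions made purely for illustration.

```python
import numpy as np

def K_grad(x):
    # Stand-in for K: periodic forward differences in both directions.
    return np.stack([np.roll(x, -1, axis=1) - x,
                     np.roll(x, -1, axis=0) - x])

def K_grad_T(y):
    # Adjoint of K_grad.
    return (np.roll(y[0], 1, axis=1) - y[0]) + (np.roll(y[1], 1, axis=0) - y[1])

def operator_norm(K, K_T, shape, n_iter=200, seed=0):
    """Estimate L = ||K||_2 by power iteration on K^T K."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)
    x /= np.linalg.norm(x)
    for _ in range(n_iter):
        x = K_T(K(x))            # one multiplication with K^T K
        x /= np.linalg.norm(x)
    return np.linalg.norm(K(x))  # sqrt of the Rayleigh quotient, since ||x|| = 1
```

For the periodic gradient on a grid with even side lengths, the largest eigenvalue of KᵀK is 8, so the estimate approaches √8. The estimate approaches L from below, so a small safety margin on τ and σ is advisable in practice.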
In yet another aspect, there is provided a modification of the basic cross-channel prior that produces improved results in dark regions, i.e., for pixels where all channel values approach zero. In these regions, the prior from Eq. (8) may not be effective, since the hue normalization reduces the term to zero. As a result, significant color artifacts (such as color ringing) can remain in dark regions. Note that this is an inherent design limitation of the original cross-channel prior, which allows luma gradients, rather than an optimization issue. In these regions, it is proposed to match absolute (rather than relative) gradient strengths between color channels. The operator G from Eq. (11) is modified as:

$$G(x) = \|B_c x - j_c\|_2^2 + \lambda_b \sum_{l \neq c} \sum_{a=1}^{2} \| D_w (H_a x - H_a i_l) \|_2^2 \qquad (16)$$
where Dw is a spatial mask that selects dark regions below a threshold ε. The mask is blurred slightly with a Gaussian kernel Kσ to avoid spatial discontinuities at the borders of regions affected by the additional regularization term:
with ε=0.05 and σ=3 in our implementation.
The resolvent operator with respect to G from Eq. (13) is replaced by the following derivation:
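$$\begin{aligned}
& u_{\mathrm{opt}} = \mathrm{prox}_{\tau G}(\tilde{u}) \\
\Leftrightarrow\; & \Big[ 2\tau B_c^T B_c + \mathbb{I} + 2\tau\lambda_b \sum_{l \neq c} \sum_{a=1}^{2} H_a^T D_w^2 H_a \Big] u_{\mathrm{opt}} = 2\tau B_c^T j_c + \tilde{u} + 2\tau\lambda_b \sum_{l \neq c} \sum_{a=1}^{2} H_a^T D_w^2 H_a i_l \\
\Leftrightarrow\; & A u_{\mathrm{opt}} = b
\end{aligned} \qquad (17)$$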
This expresses the solution of the resolvent operator as the matrix inversion problem $A u_{\mathrm{opt}} = b$. Since blur kernel sizes on the order of $10^2 \times 10^2$ can be expected for practical applications, A is very large and impractical to invert explicitly. The system is therefore solved using the Conjugate Gradient (CG) algorithm, which allows the matrix-vector multiplication to be expressed as a sequence of filtering operations as before.
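A matrix-free CG solve of this system can be sketched in Python as follows. The sketch assumes circular boundary conditions, periodic forward differences for H_1 and H_2, and a mask D_w supplied as an array of per-pixel weights; these choices and all function names are assumptions made for the example.

```python
import numpy as np

def conv(v, H):
    # Circular filtering with transfer function H.
    return np.real(np.fft.ifft2(np.fft.fft2(v) * H))

def dx(v):   return np.roll(v, -1, axis=1) - v
def dy(v):   return np.roll(v, -1, axis=0) - v
def dx_T(v): return np.roll(v, 1, axis=1) - v
def dy_T(v): return np.roll(v, 1, axis=0) - v

DERIVS = [(dx, dx_T), (dy, dy_T)]   # (H_a, H_a^T) for a = 1, 2

def make_system(B_hat, j_c, u_tilde, others, D_w, tau, lam_b):
    """Return the operator A (as a function) and right-hand side b of A u = b."""
    def reg(v):
        # sum over a of H_a^T D_w^2 H_a v
        return sum(HT(D_w ** 2 * H(v)) for H, HT in DERIVS)

    def A(u):
        return (2 * tau * conv(conv(u, B_hat), np.conj(B_hat)) + u
                + 2 * tau * lam_b * len(others) * reg(u))

    b = (2 * tau * conv(j_c, np.conj(B_hat)) + u_tilde
         + 2 * tau * lam_b * sum(reg(i_l) for i_l in others))
    return A, b

def conjugate_gradient(A, b, n_iter=200, tol=1e-10):
    """Solve A x = b for a symmetric positive definite operator A."""
    x = np.zeros_like(b)
    r = b - A(x)
    p = r.copy()
    rs = np.vdot(r, r).real
    for _ in range(n_iter):
        if np.sqrt(rs) < tol:
            break
        Ap = A(p)
        alpha = rs / np.vdot(p, Ap).real
        x += alpha * p
        r -= alpha * Ap
        rs_new = np.vdot(r, r).real
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x
```

Because A is the identity plus positive semi-definite terms, it is symmetric positive definite, which is the condition under which CG is applicable.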
In one embodiment, the method handles saturation in the captured image by removing the rows where j is saturated from the data fitting term. This is done by pre-multiplying the residual Bcx−jc with a diagonal weighting matrix whose diagonal is 0 for saturated rows and 1 else; the derivation from Eq. (17) is changed straightforwardly.
An iteration is shown in is the inverse imaging of the image to the left using the kernel to the right and a regularizer-weight λk. The algorithm may be performed in scale space for performance reasons.
To increase the performance of the algorithm by using good starting points for Eq. (10), the method is performed in scale space. The pyramids $\{J^z\}_{z=0}^{Z}$, $\{B^z\}_{z=0}^{Z}$ of the captured image/kernel pairs are computed by bicubic downsampling of J, B with the scale factor ½. The reconstruction pyramid $\{I^z\}_{z=0}^{Z}$ is progressively recovered from coarse (scale 0) to fine, where at each scale the initial iterate is the up-sampled result of the next coarser level. Note that our scale space implementation purely serves as a performance improvement. In particular, we do not need to overcome local minima in our minimization problem since it is convex. Contrary to other methods, we are not using any information from coarser scales in the image correction at a considered scale. Since the coarser-scale reconstructions can contain significantly less detail, we found that guiding fine scale image correction with coarser scale information is problematic in many cases.
As discussed above, to address low-intensity areas in an image, it has been found that an offset can be applied to bring the low-intensity values into a suitable range for processing.
Let B, as defined above in Eq. (6), be assumed to be energy preserving (i.e. the sum of all entries equals 1). Let $\hat{1}$ be defined as the constant image in $\mathbb{R}^{n \times m}$ containing the positive constant value α at every pixel. Therefore (letting the relevant terms be as defined in Eq. 6 and Eq. 7):
Step 2 follows under the standard convolution operator because B is normalized to 1 (energy preserving) and $\hat{1}$ is constant.
Section 2.5 above introduces modifications to the standard method above that deal with low-intensity image regions. The benefit of this formulation is that there are no longer any low-intensity regions to contend with. For example, assuming original image values between 0 and 1 and α=1, then after offsetting, all image values are now between 1 and 2, and therefore the standard method developed in section 2.4 can be applied with minimal modifications. The new procedure is
a. Let Ĵ=J+α
b. Apply the procedure from section 2.4 to Ĵ, generating intermediate output image Î
c. Subtract α to produce the final output image, I=Î−α
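Steps a to c can be sketched in Python as a thin wrapper around an arbitrary correction routine. The `deconvolve` argument is a placeholder for the procedure from section 2.4 and is not implemented here; the circular convolution used in `blur` is an assumption made so that the offset property can be checked numerically.

```python
import numpy as np

def blur(I, B_hat):
    # Circular convolution with a kernel given by its Fourier transform.
    return np.real(np.fft.ifft2(np.fft.fft2(I) * B_hat))

def correct_with_offset(J, alpha, deconvolve):
    """Offset the observation, correct it, then remove the offset again."""
    J_hat = J + alpha            # step a
    I_hat = deconvolve(J_hat)    # step b (placeholder routine)
    return I_hat - alpha         # step c
```

The wrapper relies on the fact that an energy-preserving kernel maps a constant image to itself, so blurring I + α gives the same result as blurring I and then adding α.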
The previous section assumes that the PSF of the optical system is given for each image tile 60. In one aspect, there is utilized a consumer laser-printer to make a target pattern. To estimate the PSFs from the target images, it is natural to apply the same optimization framework that was used for image correction also for the PSF estimation step. This method is detailed below.
The PSF estimation problem can be posed as an inverse imaging problem, where both a captured image 14 and a sharp image of the same scene are given. The captured image 14 is the scene imaged through the simple lens, with the aperture open, while the sharp image can be obtained by stopping the lens down to a small, almost pinhole aperture, where the lens aberrations no longer have an effect. By acquiring a sharp image this way (as opposed to a synthetic sharp image) we avoid both geometric and radiometric calibration issues in the sharp reference image.
Let J be an image patch in a considered blurred channel, I the corresponding sharp pinhole-aperture patch. We estimate a PSF Bopt describing the blur in J by solving the minimization problem:
where the first term is a linear least-squares data fitting term, and the scalar $s = \sum_{k,l} I(k,l) / \sum_{k,l} J(k,l)$ accounts for the difference in exposure between the blurred and pinhole image. The second term represents a standard TV prior on the gradients of the recovered PSF, and the third term enforces an energy conservation constraint, i.e., $\sum_{k,l} B(k,l) = 1$.
It may be noted that Eq. (18) is a convex optimization problem. We derive a first-order primal-dual algorithm adopting the framework described in Sec. 2. Specifically, Eq. (18) is expressed using the following operators adopting the notation from before:
The following resolvent operators and convex conjugates used to apply Algorithm 1 are then provided:
However, the resolvent operator with respect to G(u) has to be derived for this problem. This can be expressed as a sequence of frequency domain operations:
where O is a convolution matrix consisting only of ones.
Using these operators, Algorithm 1 can again be applied. The computation of L has been described in Sec. 2.4 and the same τ and σ can be used.
To allow for robust PSF estimation, the scene used for this purpose should have a broad spectrum. A white noise pattern can therefore be adopted. One specific pattern, shown in
To increase the efficiency of the PSF estimation, the above method can be applied in scale space by initializing the iterative minimization at each scale with the upsampled results from the next coarser scale, which yields a good starting point for the convex objective function from Eq. (18) and thus speeds up the minimization.
After the initial PSF estimation, an additional step of smoothing is performed by computing weighted averages of PSFs for a 3×3 set of neighboring tiles. Although the PSFs may contain high frequency features, these tend to change smoothly over the image plane. Combined with the relatively small tile size, it has been found that this spatial filtering does not cause feature loss, but can reduce noise significantly.
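The smoothing step can be sketched in Python as a weighted average over each tile's 3×3 neighborhood. The particular weights, the handling of tiles at the image border (averaging only over the neighbors that exist) and the final re-normalization are assumptions made for this example, since the description does not specify them.

```python
import numpy as np

def smooth_psfs(psfs, weights=None):
    """Weighted 3x3 neighborhood average of per-tile PSFs.

    psfs: array of shape (rows, cols, kh, kw), one PSF per image tile.
    """
    if weights is None:
        # Hypothetical weights; the center tile counts most.
        weights = np.array([[1.0, 2.0, 1.0],
                            [2.0, 4.0, 2.0],
                            [1.0, 2.0, 1.0]])
    rows, cols = psfs.shape[:2]
    out = np.zeros_like(psfs, dtype=float)
    for r in range(rows):
        for c in range(cols):
            acc = np.zeros(psfs.shape[2:], dtype=float)
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += weights[dr + 1, dc + 1] * psfs[rr, cc]
            out[r, c] = acc / acc.sum()   # keep each PSF energy preserving
    return out
```

If all tiles share the same PSF, the output equals the input, and every smoothed PSF still sums to 1.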
The aberration correction method described herein can be used not only for regular RGB cameras but also for multispectral imagers. In this case, the cross-channel prior is applied for all pairs of frequency bands. As with the conventional cameras, the method can successfully remove chromatic aberration and restore lost image detail in the blurred channels. Considering the wavelength-dependent PSFs here, it should be noted that the assumption of fixed PSFs for each color channel of an RGB sensor is often violated. This assumption is made for all the RGB sensor results described herein and is a classical assumption in the inverse imaging literature. However, one typically cannot tell from a tri-chromatic sensor the exact wavelength distribution.
Metamers (different spectra that produce the same tri-stimulus response) will have different blur kernels, so there can be situations where the assumption of fixed per-channel PSFs fails, for example with light sources having multiple narrow spectra or albedos with very narrow color tuning. This introduces errors in the data-fitting term of the present objective function. Since the system has strong cross-channel and image priors, images can still be constructed with high quality.
It will be appreciated that any module or component exemplified herein that executes instructions may include or otherwise have access to computer readable media such as storage media, computer storage media, or data storage devices (removable and/or non-removable) such as, for example, magnetic disks, optical disks, or tape. Computer storage media may include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data. Examples of computer storage media include RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by an application, module, or both. Any such computer storage media may be part of the electronic device 10, any component of or related to the electronic device 10 (e.g. image correction module 12), etc., or accessible or connectable thereto. Any application or module herein described may be implemented using computer readable/executable instructions that may be stored or otherwise held by such computer readable media.
It will also be appreciated that the examples and corresponding diagrams used herein are for illustrative purposes only. Different configurations and terminology can be used without departing from the principles expressed herein. For instance, components and modules can be added, deleted, modified, or arranged with differing connections without departing from these principles.
The steps or operations in the flow charts and diagrams described herein are just for example. There may be many variations to these steps or operations without departing from the spirit of the invention or inventions. For instance, the steps may be performed in a differing order, or steps may be added, deleted, or modified.
Although the above has been described with reference to certain specific examples, various modifications thereof will be apparent to those skilled in the art.