This invention relates generally to image processing, and more particularly to processing images using quadratic programming (QP).
As known in the art, a non-negative quadratic program (NNQP)

min_{x ≥ 0} (1/2)xᵀQx − hᵀx, (1)

with vectors x, h ∈ ℝⁿ and a symmetric positive semi-definite matrix Q ∈ ℝⁿˣⁿ, where ᵀ is the transpose operator, is ubiquitous in image processing applications, where x is a vector of non-negative quantities, such as light intensities, to be estimated under a squared-error norm.
For example, Eq. (1) subsumes the non-negative least squares (NNLS) problem encountered in deblurring and super-resolution applications,

min_{x ≥ 0} ‖Ax − b‖², (2)

with Q = AᵀA and h = Aᵀb, where A is a blurring or downsampling matrix, and b is a vector of pixel values. Eq. (1) is also the dual of a broader class of convex quadratic programs of the form
min_{y} (1/2)yᵀHy + fᵀy subject to Ay ≥ b, (3)

where Q and h of the corresponding dual NNQP are derived from the vectors f and b and the matrices A and H via a dual transform.
Such quadratic programs are used in image labeling and image segmentation applications, and in image operations that involve solving the Poisson equation, e.g., matting. Solutions of Eq. (1) yield solutions of Eq. (3).
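To make the NNLS reduction concrete, the following sketch (ours, not from the specification; the 1-D blur kernel and test signal are toy assumptions) builds Q = AᵀA and h = Aᵀb for a small deblurring instance:

```python
import numpy as np

# Toy 1-D deblurring instance: b = A f, recover non-negative f from b.
n = 8
kernel = [0.25, 0.5, 0.25]                      # assumed blur kernel (toy)
A = np.zeros((n, n))
for i in range(n):
    for j, w in zip(range(i - 1, i + 2), kernel):
        if 0 <= j < n:
            A[i, j] = w                         # banded blurring matrix

f_true = np.maximum(np.sin(np.linspace(0, np.pi, n)), 0)  # non-negative signal
b = A @ f_true                                  # observed blurred pixels

# NNLS min ||A x - b||^2, x >= 0, in the form of Eq. (1):
Q = A.T @ A                                     # symmetric positive semi-definite
h = A.T @ b
```

Because Q = AᵀA, the resulting matrix is positive semi-definite by construction, as Eq. (1) requires.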
Eqs. (1-3) are well known in the prior art. State-of-the-art interior-point, active-set, and primal-dual algorithms offer linear-to-quadratic rates of convergence, but each iteration requires the solution of a linear system at least as large as Qy = z, which can take O(n³) time for a vector y of n values. This is impractical for image processing, where the number n of pixels in a single image can be 10⁷ or larger, making such methods prohibitively expensive.
Multiplicative update methods are suitable for very large problems. They allow rough estimates of the solution vector to be improved without solving linear equations. Some well-known examples are the Richardson-Lucy deblurring algorithm, the image-space reconstruction algorithm, and the Lee-Seung non-negative matrix factorization algorithm. These turn out to be special cases of the invention presented here.
However, those methods rely on some combination of strictly non-negative coefficients, positive definiteness, or favorable initialization for convergence, if convergence is provable at all.
Many image processing and computer vision applications can be solved as a non-negative quadratic program.
The embodiments of the invention provide a convergent multiplicative update that has a simple form and is amenable to fine-grained data parallelism.
Conventional methods for deblurring, matrix factorization, and tomography are covered as special cases. The method can be used for image super-resolution, labeling and segmentation applications.
Particularly, a method performs an image processing application by expressing the image processing application as a non-negative quadratic program (NNQP) with a quadratic objective, and nonnegativity constraints, and expressing a Karush-Kuhn-Tucker condition of the NNQP as a fixpoint ratio.
Then, the fixpoint ratio is determined iteratively until a solution is reached with a desired precision.
We provide a multiplicative update method derived from the Karush-Kuhn-Tucker (KKT) first-order optimality conditions of the Lagrangian function of Eq. (1),

L(x, λ) = (1/2)xᵀQx − hᵀx − λᵀx,

with λ ∈ ℝⁿ, λ ≥ 0.
These are

x ∘ ∇ₓL = x ∘ (Qx − h − λ) = 0,
λ ∘ ∇_λL = −λ ∘ x = 0,
with ∘ denoting the Hadamard element-wise product.
The update arises from a variational treatment of the above equations. We define the matrix split Q = Q⁺ − Q⁻ with Q⁺ = max(Q, 0) + diag(r) and Q⁻ = max(−Q, 0) + diag(r), for some non-negative vector r, where max is taken element-wise. Similarly, h = h⁺ − h⁻ with h⁺ = max(h, 0) + s and h⁻ = max(−h, 0) + s, for some non-negative s.
Using the split, we can write
x ∘ ∇ₓL = x ∘ ((Q⁺x + h⁻) − (Q⁻x + h⁺ + λ)) = 0.
If all xᵢ > 0, possibly infinitesimally, then λᵢ = 0 and, element-wise,

xᵢ(Q⁺x + h⁻)ᵢ = xᵢ(Q⁻x + h⁺)ᵢ.
Rearranging terms gives the multiplicative update rule

xᵢ ← xᵢ (Q⁻x + h⁺)ᵢ / (Q⁺x + h⁻)ᵢ,

which by construction is stationary at an optimum x*. The right-hand side of the rule can be determined in parallel for all xᵢ by matrix-vector multiplies and a vectorized divide.
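The update can be sketched serially in NumPy as follows (our illustration, not the patented implementation; the function name, the choice s = 0, and the shift rᵢ = max(Σⱼ Q⁻ᵢⱼ, Qᵢᵢ) suggested in the analysis below are our assumptions):

```python
import numpy as np

def nnqp_solve(Q, h, iters=500, eps=1e-12):
    """Multiplicative fixpoint for min_x (1/2) x'Qx - h'x subject to x >= 0.

    Uses the split Q = Q+ - Q-, h = h+ - h- (with s = 0) and iterates
        x_i <- x_i * (Q- x + h+)_i / (Q+ x + h-)_i.
    """
    Qneg = np.maximum(-Q, 0.0)                    # element-wise negative part
    r = np.maximum(Qneg.sum(axis=1), np.diag(Q))  # keeps the map a contraction
    Qp = np.maximum(Q, 0.0) + np.diag(r)
    Qm = Qneg + np.diag(r)
    hp, hm = np.maximum(h, 0.0), np.maximum(-h, 0.0)
    x = np.ones_like(h, dtype=float)              # any positive start converges
    for _ in range(iters):
        x = x * (Qm @ x + hp) / (Qp @ x + hm + eps)
    return x

# Small NNQP whose optimum has an active constraint: x* = (1.5, 0).
Q = np.array([[2.0, 1.0], [1.0, 2.0]])
h = np.array([3.0, -1.0])
x = nnqp_solve(Q, h)
```

With Q = AᵀA, h = Aᵀb ≥ 0, and r = 0, the ratio reduces to the Lee-Seung style update x ← x ∘ (Aᵀb) / (AᵀAx), illustrating how such algorithms arise as special cases.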
Image Processing Applications
Image Super-Resolution
Image super-resolution (SR) refers to an image processing application in which a set of low-resolution images is combined into a single high-resolution image. Each low-resolution image g_k is assumed to be generated from an ideal high-resolution image f via a displacement matrix S_k, a decimation matrix D_k, and a noise process n_k:

g_k = D_k S_k f + n_k, for k = 1, 2, …, K.
We use image registration to estimate the displacement matrices Sk. Then, we reconstruct the high-resolution image by iteratively solving the NNQP
This NNQP is converted into the form of Eq. (1) and then solved using the above fixpoint ratio.
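A toy 1-D analogue of this model can be assembled as follows (our sketch; the circular-shift S_k and pair-averaging D_k are stand-ins for the registration-estimated displacement and decimation matrices):

```python
import numpy as np

n, factor, K = 8, 2, 3                 # toy sizes: 8-sample f, 4-sample g_k

def shift_matrix(n, t):
    # circular shift by t samples; stand-in for the registration-estimated S_k
    return np.roll(np.eye(n), t, axis=1)

def decimation_matrix(n, factor):
    # average non-overlapping groups of `factor` samples; stand-in for D_k
    D = np.zeros((n // factor, n))
    for i in range(n // factor):
        D[i, factor * i:factor * (i + 1)] = 1.0 / factor
    return D

f = np.array([0.0, 1.0, 2.0, 3.0, 3.0, 2.0, 1.0, 0.0])   # ideal high-res signal
D = decimation_matrix(n, factor)
ops = [D @ shift_matrix(n, k) for k in range(K)]          # the products D_k S_k
g = [op @ f for op in ops]                                # low-res observations

# Stacking the K observation models g_k = D_k S_k f gives Eq. (1) with:
Q = sum(op.T @ op for op in ops)
h = sum(op.T @ gk for op, gk in zip(ops, g))
```

Q is again symmetric positive semi-definite by construction, so the resulting NNQP can be handed to the multiplicative fixpoint.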
Image Labeling
In Markov random fields (MRF)-based interactive image segmentation applications, a small subset of pixels is labeled, and the MRF propagates the labels across the image, typically finding high-gradient contours as segmentation boundaries where the labeling changes.
This problem is combinatorial when there are more than two kinds of labels, and is usually approximately optimized by network-flow methods. The problem can also be expressed as an NNQP. Let r be a pixel in the image or the region of interest Ω = {rᵢ : i = 0, 1, …, N}. The class set is denoted by K = {1, 2, …, K}. Then, probabilistic segmentation approaches compute a probability measure field x = {x_k(r) : r ∈ Ω, k ∈ K}, such that
Σ_{k=1}^{K} x_k(r) = 1,  x_k(r) ≥ 0.
The set of neighbors of pixel r is denoted by N_r = {s ∈ Ω : |r − s| = 1}.
The QP has the form
where the cost d_rk of assigning label k to pixel r is obtained from a classifier trained with a small set of labeled pixels, using e.g., a support vector machine (SVM), which is itself trainable using the multiplicative update rule described below.
The quadratic term controls the granularity of the regions, i.e., it promotes smooth regions. The spatial smoothness is globally controlled by the positive parameter η, and locally controlled by the weight ω_rs ≈ 1 when the neighboring pixels r and s are likely to belong to the same class, and ω_rs ≈ 0 otherwise. We set ω_rs to the cosine of the angle between the vectors of the two pixels in the LAB color space, which approximates the color response of human vision.
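The weight computation can be sketched as follows (our illustration; the LAB triples are assumed inputs, and RGB-to-LAB conversion is omitted):

```python
import numpy as np

def omega(lab_r, lab_s, eps=1e-12):
    """Weight w_rs: cosine of the angle between two pixels' LAB vectors.

    Near 1 when the colors point in nearly the same direction (likely the
    same class), and near 0 when they are close to orthogonal.
    """
    lab_r = np.asarray(lab_r, dtype=float)
    lab_s = np.asarray(lab_s, dtype=float)
    denom = np.linalg.norm(lab_r) * np.linalg.norm(lab_s) + eps
    return float(lab_r @ lab_s / denom)
```

For example, two similar colors such as omega((52.0, 10.0, 8.0), (50.0, 11.0, 9.0)) yield a weight close to 1, while markedly different colors yield a smaller weight.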
This QP is a convex relaxation of a combinatorial {0,1}-labeling problem. We solve this QP by putting it in dual form to obtain an NNQP of the form of Eq. (1), then iteratively applying the fixpoint to find the optimal dual vector. The dual vector is then transformed back into the primal vector x using the standard Lagrange dual transformation for quadratic programs. As an alternate solution method, one can derive multiplicative update rules directly from a Lagrangian function, as shown in the next paragraph.
Here the first-order KKT optimality conditions
yield a two-step update for primal and dual parameters
Depending on the size of the image and the number of classes, the MRF QP has 1-2 million variables and converges in 5-6 seconds on a serial processor.
Analysis
We consider here the behavior of the NNQP fixpoint ratio
as a multiplicative update for positive vectors x≠x*.
The bracketed fixpoint ratio is always non-negative, and positive if sᵢ > 0 or rᵢxᵢ > 0. Thus, the update preserves the feasibility of x, implicitly satisfying the second KKT condition.
As is the case for gradient descent methods, the fixpoint moves xᵢ in the direction opposite the partial derivative ∂L(x, λ)/∂xᵢ.
Indeed, iteration of the rule from any x > 0 solves Eq. (1) and, by extension, Eqs. (2-3). More precisely, for any Q ⪰ 0 (positive semi-definite), there is a vector r for which the fixpoint rule converges monotonically and asymptotically from any positive x > 0 to the optimum of Eq. (1).
The value of r guarantees that the update gives a contraction even in the nullspace of a semi-definite Q. There are many satisfactory choices of r, for example,

rᵢ = max(Σⱼ Q⁻ᵢⱼ, Qᵢᵢ).
For many strictly positive or strictly convex image processing problems, it suffices to have r = 0. Convergence is asymptotic; if x*ᵢ = 0, we obtain xᵢ → 0 only on an ideal infinite-precision computer. On real computers, which use floating-point numbers, xᵢ rapidly converges to x*ᵢ, regardless of its value.
The asymptotic convergence rate is linear: asymptotically, each iteration of the update rule improves the estimate x̂ by a constant number of bits of precision. For the sparse Q matrices typically encountered in image processing applications, each update takes linear serial time or constant parallel time, and a linear number of updates suffices to achieve any fixed-precision result. Thus, the multiplicative update rule combines the simplicity and processing speed of gradient descent with the feasibility and convergence guarantees of modern quadratic programming algorithms.
As such, it is particularly well suited for implementation on low-cost parallel processing arrays, such as a single instruction, multiple data (SIMD) processor or a graphics processing unit (GPU) in a cellular telephone, especially because Q typically has low bandwidth in image processing applications, so the update requires very limited communication between processors. One GPU implementation solves quadratic programs of 10⁵ variables in a fraction of a second.
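The per-iteration cost claim can be illustrated with a sparse Q (our sketch; the tridiagonal Laplacian-plus-identity matrix is a toy stand-in for a low-bandwidth image operator, and r = 0 suffices here because Q's diagonal is positive and its off-diagonal part is wholly negative):

```python
import numpy as np
from scipy.sparse import diags

n = 1000
# Low-bandwidth toy operator: tridiagonal Laplacian plus identity (strictly PD).
Q = diags([-1.0, 3.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
h = np.ones(n)

# Split with r = 0: the positive part of Q is its diagonal, the negative
# part is the negated off-diagonal band; h is already non-negative.
Qp = diags(Q.diagonal(), 0, format="csr")
Qm = Qp - Q
hp, hm = np.maximum(h, 0.0), np.maximum(-h, 0.0)

x = np.ones(n)
for _ in range(200):
    # two sparse mat-vecs plus a vectorized divide: O(nnz(Q)) work per sweep
    x = x * (Qm @ x + hp) / (Qp @ x + hm + 1e-12)
```

Because each component of the update touches only the nonzeros in its row of Q, the same sweep maps directly onto one-thread-per-pixel SIMD or GPU execution.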
Image Processing Applications
General
As shown in
Super-Resolution Image Processing Application
As shown in
Image Processing Application for Labeling
As shown in
Image Processing Application for Deblurring
As shown in
By expressing the Karush-Kuhn-Tucker (KKT) first-order optimality conditions as the fixpoint ratio, we obtained a multiplicative fixpoint that can solve very large quadratic programs on low-cost parallel processors.
The update converges from all positive initial guesses. We also proved an asymptotically linear convergence rate. Although QP problems are ubiquitous in image processing, QP methods are infrequently used because the QPs themselves are prohibitively large.
Our multiplicative update makes the QP approach viable, with one method solving many problems. Moreover, the method is well suited for single instruction, multiple data (SIMD), and multiple instruction, multiple data stream (MIMD) parallel processors.
Our method can be used for super-resolution, image segmentation, deblurring, matting, denoising, synthesis, and tomography applications.
Although the invention has been described by way of examples of preferred embodiments, it is to be understood that various other adaptations and modifications can be made within the spirit and scope of the invention. Therefore, it is the object of the appended claims to cover all such variations and modifications as come within the true spirit and scope of the invention.