Super resolution using gaussian regression

Description

FIELD OF THE INVENTION

This invention relates generally to the field of image processing and in particular to a computer implemented method for converting a low resolution image into a high(er) resolution image using sparse approximate Gaussian process regression.

BACKGROUND OF THE INVENTION

Super-resolution is a technique used to convert a low resolution image into a higher resolution image. Applications of super-resolution include, for example, improving low resolution images/videos produced from devices such as cell phone cameras or webcams as well as converting normal quality video programs (i.e., NTSC) into higher quality (i.e., HDTV) programs.

Contemporary approaches to image super-resolution may be broadly divided into three categories namely: 1) functional interpolation; 2) reconstruction-based; and 3) learning-based.

Functional interpolation approaches apply an existing function to an image thereby obtaining a processed image. Functional interpretation approaches often blur discontinuities and do not satisfy reconstruction constraint(s). Reconstruction-based approaches generate a high resolution image from a low resolution image sequence. The generated image—while generally meeting reconstruction constraints—does not guarantee contour smoothness. Finally, learning-based approaches use high frequency details from training images to improve a low resolution image.

In a paper entitled “Video Super-Resolution Using Personalized Dictionary”, which appeared in Technical Report, 2005-L163, NEC Laboratories America, Inc., 2005, the authors M. Han, H. Tao, and Y. Gong described a learning-based super-resolution method namely an image hallucination approach consisting of three steps. In step 1, a low frequency image I_H^lis interpolated from a low resolution image I_L. In step 2, a high frequency primitive layer I_H^pis hallucinated or inferred from I_H^lbased on primal sketch priors. In step 3, reconstruction constraints are enforced thereby producing a final high resolution image I_H

According to the method disclosed, for any low resolution test image I_L, a low frequency image I_H^lis interpolated from I_Lat first. It is assumed that the primitive layer I_H^pto be inferred is a linear sum of a number of N high frequency primitives {B_n^h, n=1, . . . , N}. The underlying low frequency primitives are {B_n^l, n=1, . . . , N} in the I_H^l. Note that the center of each image patch is on the contours extracted in I_H^land the neighboring patches are overlapped.

A straightforward nearest neighbor algorithm is used for this task. For each low frequency primitive B_n^l, its normalized {circumflex over (B)}_n^lis obtained, then the best matched normalized low frequency primitive to {circumflex over (B)}_n^lin the training data is found and pasted to the corresponding high frequency primitive.

This method tends to encounter two undesirable issues. First, this approach is—in essence—a one-nearest-neighbor (1-NN) approach, which likely leads to an over-fit. When a similar primitive patch pair exists in the training data, the approach generates a very good result. Otherwise, the approach often generates a lot of artifacts.

Second, this method is not scalable. Though an AAN tree algorithm theoretically takes O(log(n)) time to find a best match, it unfortunately takes O(n) space. Cache size of modern computing machines is insignificant as compared to the size of an entire training data set. Consequently, video super resolution employing this method does not run satisfactorily in on-line environments.

SUMMARY OF THE INVENTION

An advance is made in the art according to the principles of the present invention directed to a computer implemented method of producing a super-resolution image from a lower resolution image. In sharp contrast to prior-art methods that employed nearest neighbor methodologies, the present invention advantageously utilizes a Gaussian regression methodology thereby eliminating two noted problems associated with the prior art namely artifacts and lack of scalability.

According to an aspect of the invention, a number of non-smooth low resolution patches comprising an image are found using edge detection methodologies. The low resolution patches are then transformed using selected basis of a Radial Basis Function (RBF) and Gaussian process regression is used to generate high resolution patches using a trained model. The high resolution patches are then combined into a high resolution image or video.

BRIEF DESCRIPTION OF THE DRAWING

A more complete understanding of the present invention may be realized by reference to the accompanying drawings in which:

FIG. 1 is a flowchart showing the steps associated with producing a super resolution image according to the present invention;

FIG. 2 is a flowchart showing the steps associated with producing a trained model used in the super resolution image production of FIG. 1;

FIG. 3 is an exemplary image showing super resolution according to the present invention;

FIG. 4 is an exemplary image showing super resolution according to the present invention;

FIG. 5 is an exemplary image showing super resolution according to the present invention;

FIG. 6 is an exemplary image showing super resolution according to the present invention; and

FIG. 7 shows details of the super resolution image of FIG. 5.

DETAILED DESCRIPTION

The following merely illustrates the principles of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements which, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope.

Furthermore, all examples and conditional language recited herein are principally intended expressly to be only for pedagogical purposes to aid the reader in understanding the principles of the invention and the concepts contributed by the inventor(s) to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions.

Moreover, all statements herein reciting principles, aspects, and embodiments of the invention, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.

Thus, for example, the diagrams herein represent conceptual views of illustrative structures embodying the principles of the invention.

Advantageously, the method of the present invention employs a regression approach instead of nearest neighbor to overcome the above mentioned issues.

Linear Regression

Instead of searching for Φ(B_i^h,B_i^l), we can approximate Pr(B_i^h|B_i^l) with a Gaussian distribution, N(f(B_i^l),Σ(B_i^l)) To simplify the notation, we only use one component of f. For the other component, they use the same formula.

Let x be the column vector of a B_i^l, y be the one component of B_i^h. If we assume ƒ(x)=w^Tx is a linear function,

y=ƒ(x)+ε=w^Tx+ε, (1)

where w is a parameter vector, and ε is iid noise, having distribution of N(ε|0,σ²).

We can add a constant component in vector x, if the bias is considered. We could then use the maximum likelihood principle to find the solution to the problem.

$\begin{matrix} \begin{matrix} \hat{w} = \underset{w}{argmin} \sum_{i} \frac{1}{2 σ^{2}} { y_{i} - w^{⊤} x_{i} }^{2} \\ = \underset{w}{argmin} \frac{1}{2 σ^{2}} { y^{⊤} - w^{⊤} X }^{2} \\ = {({XX}^{⊤})}^{- 1} Xy, \end{matrix} & (2) \end{matrix}$

where X is the matrix consisting of all column vector x, y is the vector consisting of corresponding y.

Those skilled in the art will readily appreciate that linear regression is not robust in some cases. To avoid over-fitting, we assume that the prior distribution, Pr(w), is a Gaussian, N(w|0,I_d), where d is the dimensionality of the feature space. Then the posterior distribution, Pr(w|X,y), is also a Gaussian, N(w|μ_w,Σ_w) where

$\begin{matrix} μ_{w} = {({XX}^{⊤} + σ^{2} I_{d})}^{- 1} Xy, Σ_{w} = {(\frac{1}{σ^{2}} {XX}^{⊤} + I_{d})}^{- 1} . & (3) \end{matrix}$

Therefore, the posterior distribution, Pr(y|x,X,y), is N(μ_y,Σ_y), where

μ_y=μ_w^Tx,
Σ_y=x^TΣ_wx+σ². (4)

The linear model can be implemented very efficient. At running time, only a matrix-vector multiplication (matrix-matrix multiplication in batch) is needed. However, the dimensionality of patches limits the capacity of learning, which means that even sufficient training patch peers are provided this method can not store enough patterns.

Gaussian Process Regression

Advantageously, to increase the learning capacity of the method, according to the present invention we employ Gaussian Processes Regression.

Let ƒ(x) is a Gaussian process having cov(ƒ(x), ƒ(x′))=k(x,x′), where k(x, x′) is a kernel function. Commonly used kernel functions include Radial Basis Function (RBF), where

$k (x, x^{'}) = \exp (- \frac{{ x - x^{'} }^{2}}{2 σ^{2}}) .$

Intuitively, we can treat ƒ(x) as a random variable, {ƒ(x_i)} as a multivariate random variable, exception that Gaussian process ƒ is an infinite dimensional one.

We use matrix X to represent N data points {x_i}, where each column of X is a data point. Let ƒ(X) be the vector of random variables {ƒ(x_i)}. We have cov(ƒ(X))=K, where K is the N×N Gram matrix of k and K_ij=k(x_i,x_j). We define α as a N dimension random vector such that i

ƒ(X)=α^TK (5)

Then Pr(α|X) follows Gaussian distribution, N(α|0,K⁻¹)

The posterior distribution of Pr(α|X,y) is also a Gaussian, N(μ_α,Σ_α), where

$\begin{matrix} μ_{α} = {(K + σ^{2} I)}^{- 1} y Σ_{α} = {(K + \frac{1}{σ^{2}} {KK}^{⊤})}^{- 1} . & (6) \end{matrix}$

Therefore, the distribution Pr(y|x,X,y) is N(μ_y,Σ_y), where

μ_y=y(K+σ²I)⁻¹k(X,x),
Σ_y=k(x,x)−k(X,x)^T(K+σ²I)⁻¹k(X,x)+σ². (7)

Sparse Approximate Gaussian Process Regression

When N is large, it is time consuming to compute k(X,x). Therefore, we use Sparse Approximate Gaussian Process Regression. In Sparse Approximate Gaussian Process Regression approaches, we are training to training ƒ(x) using a small number, M, of bases. Similar to Eq. 5, we let ƒ(X_M)=α^TK_MM. Then Pr(α|X) is Gaussian, N(α|0,K_MM⁻¹).

Then, the posterior distribution of Pr(α|X,y) is N(μ_α,Σ_α), where

$\begin{matrix} μ_{α} = {(σ^{2} K_{MM} + K_{MN} K_{MN}^{⊤})}^{- 1} K_{MN} y, Σ_{α} = {(K_{MM} + \frac{1}{σ^{2}} K_{MN} K_{MN}^{⊤})}^{- 1} . & (8) \end{matrix}$

We use M latent variable of ƒ(X_M) to infer the posterior distribution of ƒ(x). Therefore, the distribution Pr(y|x,μ_α,Σ_α) is N(μ_y,Σ_y), where

$\begin{matrix} μ_{y} = μ_{α}^{⊤} k (X_{M}, x), Σ_{y} = k (x, x) - {k (X, x)}^{⊤} (K_{MM}^{- 1} - Σ_{α}) k (X, x) + σ^{2} . & (9) \end{matrix}$

Note that

$K_{MM}^{- 1} - Σ_{α} = \frac{1}{σ^{2}} K_{MN} K_{MN}^{⊤} Σ_{α} .$

When M=N, the sparse result is identical to the ordinary Gaussian Process Regression.

Implementation—Training Phase

The selecting optimal bases is an NP problem. We can use any of a number of methods to select bases in a greedy way. Actually, a random selection of bases works well enough. To estimate the model of α as Eq. 8, we need to compute K_MMand K_MN. Since we can select M to meet the limitation of memory and CPU time, computing K_MMis not difficult. However, the number of training data, N, could be large, therefore storing K_MNcould be a problem. Instead of computing K_MN, we efficiently update K_MNK_MN^Tand K_MNy using rank-one update for each training example, or rank-k update for each batch of examples.

Implementation—Predicting Phase

The computation during the predicting phase includes kernel computation for k(x,X), and regression computation for μ_yand Σ_y. The operation during kernel computation depends on the type of the selected kernel. For RBF kernel, the main computation cost is exponential operation. The operation during regression computation is mainly the matrix-vector multiplication. We can replace a batch of matrix-vector multiplication with matrix-matrix multiplication, which usually speeds up the computation at the cost of memory.

In computing ƒ(x), the variance computation is relatively time-consuming, especially when M is greater than 2000. However, in our current approach, we don't need precise value of ƒ(x). To speedup the calculation, we can use low rank matrices to approximate K_MM⁻¹−Σ_α by C^TDC, where C is L×M matrix (L<M). The variance is computed as ∥D^1/2Ck(X_m,x)∥².

In predicting phase, the memory is mainly used for holding the model (selected bases and regression matrices) and the batch data. The size of the model depends on the number of bases we selected. For the 256 bases, the size is about 500 KB to 2 MB. When we set the batch size to 1024, the memory used is about 10 MB.

With these principles in place, we may now examine the overall steps in producing a super resolution image according to the present invention. With initial reference to FIG. 1, there is shown a flow chart depicting the overall method. More particularly, an image—which may advantageously be a still image or one or more frames of a video set of images, is input (step 101). A number of non-smooth low resolution patches comprising the image are found using edge detection methodologies step 102).

The low resolution patches are then transformed using selected basis of a Radial Basis Function (RBF) (step 103). A gaussian process regression is used to generate high resolution patches (step 104) using a trained model (step 106). As can be readily appreciated by those skilled in the art, our inventive method may be advantageously performed using multiple CPUs on a parallel system to process a number of patches in parallel (step 107) Finally, the high resolution patches are combined into a high resolution image or video (sep 108).

FIG. 2 is a flowchart depicting the generation of the trained model used in our super resolution method shown in FIG. 1. With reference to FIG. 2, an image or series of images (video) is input (step 201). The resolution of the input image(s)/videos are reduced (step 202) and a set of high/low resolution patch pairs is generated (step 203). The Gaussian process regression model is trained (step 205) and one or more RBF bases (step 204) are selected to produce the trained model (step 206). This trained model is used in step 106 of FIG. 1.

EXPERIMENTS

In the experiment of image super resolution, we did two-by-two time super resolution. This algorithm improved the mean square error by 10% comparing to the cubic interpolation algorithm. The input and output images are shown in FIG. 3. In the experiment, we use Creative WebCam Notebook to take the real time video input (160×120). The output is four-by-four time resolution of the input. We use DELL dual dual-core 2.8 GHz CPU to performance the super resolution. We can achieve about 10 fps. The input and output images are shown in FIG. 4, FIG. 5 and FIG. 6. A patch of FIG. 5 is shown in FIG. 7.

CONCLUSION

The method of sparse approximate Gaussian process regression speeds up the image super resolution process by thousand of times comparing to the original KD-tree approach. The method also reduces the MSE of produced image.

Accordingly, the invention should be only limited by the scope of the claims attached hereto.

Claims

1. A computer-implemented method for generating a super-resolution image from a low resolution image, said method comprising the steps of: determining a set of non-smooth low resolution patches of the low resolution image;generating—using Gaussian process regression—a set of high-resolution patches for the low resolution image;combining the high-resolution patches; andgenerating a high-resolution image from the combined patches.
2. The computer-implemented method of claim 1 further comprising the step of transforming the low-resolution patches using selected bases of radial basis function kernels.
3. The computer-implemented method of claim 2 further comprising the step of: training a model on the low-resolution image thereby generating a trained model.
4. The computer-implemented method of claim 3 wherein said training step comprises the steps of: generating a lower resolution image from the low-resolution image.generating a set of high/low resolution patch pairs;selecting a set of RBF bases; andtraining a Gaussian process regression model thereby generating the trained model.
5. The computer-implemented method of claim 4 whereby the transforming step and the Gaussian process regression step(s) are performed simultaneously for a number of patches.

US Referenced Citations (42)

Number	Name	Date	Kind
4797942	Burt	Jan 1989	A
5325449	Burt et al.	Jun 1994	A
5488674	Burt et al.	Jan 1996	A
5649032	Burt et al.	Jul 1997	A
5657402	Bender et al.	Aug 1997	A
5696848	Patti et al.	Dec 1997	A
5920657	Bender et al.	Jul 1999	A
5991444	Burt et al.	Nov 1999	A
5999662	Burt et al.	Dec 1999	A
6058248	Atkins et al.	May 2000	A
6075926	Atkins et al.	Jun 2000	A
6285804	Crinon et al.	Sep 2001	B1
6404516	Edgar	Jun 2002	B1
6459808	Brand	Oct 2002	B1
6717622	Lan	Apr 2004	B2
7003177	Mendlovic et al.	Feb 2006	B1
7006712	Silver et al.	Feb 2006	B1
7106914	Tipping et al.	Sep 2006	B2
7162098	Dugan et al.	Jan 2007	B1
7218796	Bishop et al.	May 2007	B2
7379611	Sun et al.	May 2008	B2
7379612	Milanfar et al.	May 2008	B2
7447382	Nestares et al.	Nov 2008	B2
7471851	Strelow	Dec 2008	B2
7477802	Milanfar et al.	Jan 2009	B2
7523078	Peot et al.	Apr 2009	B2
7526123	Moon et al.	Apr 2009	B2
7613363	Platt et al.	Nov 2009	B2
7715658	Cho et al.	May 2010	B2
7742660	Keshet	Jun 2010	B2
7778439	Kondo et al.	Aug 2010	B2
7809155	Nestares et al.	Oct 2010	B2
7840095	Yamada	Nov 2010	B2
7881511	Ye et al.	Feb 2011	B2
7889950	Milanfar et al.	Feb 2011	B2
20040170340	Tipping et al.	Sep 2004	A1
20040218834	Bishop et al.	Nov 2004	A1
20060159367	Zeineh et al.	Jul 2006	A1
20060291751	Milanfar et al.	Dec 2006	A1
20070041663	Cho et al.	Feb 2007	A1
20070041664	Yamada	Feb 2007	A1
20070071362	Milanfar et al.	Mar 2007	A1

Related Publications (1)

	Number	Date	Country
	20090274385 A1	Nov 2009	US

Super resolution using gaussian regression

Information

Patent Number

Date Filed

Date Issued

Inventors

Original Assignees

Examiners

Agents

CPC

US Classifications

Field of Search

US

International Classifications

Term Extension

Abstract

Description

Claims

US Referenced Citations (42)

Related Publications (1)