The invention is made in the field of image processing. In particular, the invention is made in the field of image processing by regularization of total variation.
Total Variation (TV) is a widely used measure of the intensity continuity of images. It has been used in many applications, such as image restoration, deconvolution, decompression and inpainting.
For instance, T. Chan and C. Wong, “Total Variation Blind Deconvolution”, IEEE Transactions on Image Processing, 7(3), 370-375 (1998), describe its use for blind deconvolution, and F. Guichard and F. Malgouyres, “Total Variation based interpolation”, Proc. European Signal Processing Conf., 3, 1741-1744 (1998), use it for resolution enhancement. Another use case is decompression, described by F. Alter, S. Durand and J. Froment in “Adapted Total Variation for Artifact Free Decompression of JPEG Images”, J. Math. Imaging and Vision, 23(2), 199-211 (2005).
In particular, TV denoising is remarkably effective at simultaneously preserving edges while removing noise in flat regions, which is a significant advantage over the intuitive techniques such as linear smoothing or median filtering. The idea is based on the principle that signals with excessive and possibly spurious detail have high total variation, that is, the integral of the absolute gradient of the signal is high.
According to this principle, reducing the total variation of the signal, subject to it being a close match to the original signal, removes unwanted detail whilst preserving important details such as edges.
Typically, TV is calculated from the horizontal and vertical gradient images. Denoting an image by I, its horizontal and vertical gradient images ∇xI and ∇yI are defined as
∇xI=I(x+1,y)−I(x,y) and ∇yI=I(x,y+1)−I(x,y).
Then TV is calculated, wherein sqrt(•) calculates the square root of its argument, by:
TV(I)=Σi,j sqrt((∇xI(i,j))²+(∇yI(i,j))²) or (1)
TV(I)=Σi,j (|∇xI(i,j)|+|∇yI(i,j)|) (2)
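For illustration only, Eqs. (1) and (2) may be sketched in Python with NumPy as follows. The function name total_variation and the boundary handling (a gradient is set to zero where the neighbouring pixel lies outside the image) are assumptions of this sketch and are not specified above.

```python
import numpy as np

def total_variation(img, isotropic=True):
    """Total variation per Eq. (1) (isotropic) or Eq. (2) (anisotropic)."""
    img = np.asarray(img, dtype=float)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    # Arrays are indexed [y, x]; forward differences as defined above.
    gx[:, :-1] = img[:, 1:] - img[:, :-1]   # I(x+1, y) - I(x, y)
    gy[:-1, :] = img[1:, :] - img[:-1, :]   # I(x, y+1) - I(x, y)
    # Assumption: gradients at the last column/row stay zero.
    if isotropic:
        return float(np.sum(np.sqrt(gx**2 + gy**2)))
    return float(np.sum(np.abs(gx) + np.abs(gy)))
```

A constant image yields a total variation of zero under either definition, while any intensity step contributes its magnitude.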
Classical TV denoising tries to minimize the Rudin-Osher-Fatemi (ROF) denoising model:
min_f TV(f)+λ*(∥f−n∥₂)²/2 (3)
where n is the noisy image, TV(f) is the total variation of f, and λ is a parameter which controls the denoising intensity.
TV regularization is increasingly employed in compressive sensing. For instance, it has been proposed to recover images from a few samples based on the following equation, where Φ is a sampling matrix and y is the vector of obtained samples:
min_f TV(f)+λ*(∥y−Φf∥₂)²/2 (4)
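Eqs. (3) and (4) share the same structure: a TV term plus a weighted quadratic data term. As a minimal sketch, the objective value for a candidate image f may be evaluated as below. The names tv and tv_objective, the flattening of f into a vector before applying Φ, and the zero-gradient boundary handling are assumptions of this sketch; the minimization itself is not shown.

```python
import numpy as np

def tv(f):
    """Isotropic total variation of Eq. (1), zero gradient at the border."""
    f = np.asarray(f, dtype=float)
    gx = np.zeros_like(f)
    gy = np.zeros_like(f)
    gx[:, :-1] = f[:, 1:] - f[:, :-1]
    gy[:-1, :] = f[1:, :] - f[:-1, :]
    return float(np.sum(np.sqrt(gx**2 + gy**2)))

def tv_objective(f, y, lam, phi=None):
    """Value of Eq. (3) if phi is None (y is the noisy image), else of Eq. (4)."""
    f = np.asarray(f, dtype=float)
    # Assumption: phi is a matrix acting on the row-major flattened image.
    predicted = f.ravel() if phi is None else np.asarray(phi, dtype=float) @ f.ravel()
    residual = np.asarray(y, dtype=float).ravel() - predicted
    return tv(f) + lam * float(residual @ residual) / 2.0
```

A larger λ penalizes deviation from the observation more strongly, so the minimizer stays closer to the input and is smoothed less.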
As can be seen in Eq. (3), traditional TV regularization does not consider the content of images; it simply smooths the entire image with equal intensity in both the horizontal and the vertical direction. Therefore, edges, especially oblique edges, are smoothed to some extent by TV denoising. In conclusion, gradients along the horizontal and vertical directions alone are not robust enough for various images. X. Shu and N. Ahuja, “Hybrid Compressive Sampling via a New Total Variation TVL1”, Proc. ECCV'10, 393-404 (2010), propose a so-called TVL1 for compressive sampling. The TVL1 calculation is based on the horizontal and vertical gradients and, in addition, two diagonal partial gradients, ∇x∇yI(i,j) and ∇y∇xI(i,j), to enforce diagonal intensity continuity.
Depending on the type of image or video content, edges within images follow different distributions. Averaged over the different content types, edges are randomly oriented, and the inventors found that the four directions of X. Shu and N. Ahuja are still not enough for randomly oriented edges.
Thus, the invention addresses the problem that traditional image processing by regularization of Total Variation (TV) only enforces the horizontal and vertical intensity continuity and thus fails to reconstruct oblique edges well.
In an embodiment, a Directional Total Variation is defined which supports multiple gradient directions. The image is first pre-processed to determine the direction of edges and/or texture; Directional TV is then calculated based on the gradients along the determined direction and its orthogonal direction. By applying adaptive weights to the different directions in the regularization, Directional TV is capable of preserving edges well, independent of their orientation. Thus, image denoising, compression or super resolution based on Directional TV regularization tends to achieve better quality.
In further embodiments, the invention comprises the following features, alone, pair-wise combined or all together:
Computation of Directional Total Variation occurs by the gradient along the edge and its orthogonal direction.
Since direction is consistent in a small patch, pre-processing divides the image into small patches and checks a number of predefined directions for each patch. Then at least one predominant direction in the patch is determined, i.e. at least one direction is determined which is most likely to be along the edge.
There are various embodiments in which different techniques are used for choosing a predominant direction in the patch. For instance, there is an embodiment comprising calculating, for each direction, the ratio of the energy sum of gradients along that direction to that of its orthogonal direction, and choosing the direction with the largest ratio.
This embodiment can be refined by determining the direction as being the one most similar to its upper and left neighbour blocks in case of flat regions, determined, e.g., by determining that the energy sums of gradients along multiple (or all) directions are equivalent.
Alternatively or additionally, the weights of the two orthogonal directions can be determined adaptively. Generally, the direction along the edge is given large weight and its orthogonal direction is given small weight. The weights are determined in the pre-processing, e.g., based on the ratio of energy sums.
Since there are applications where it may occur that some of the gradients are unavailable for some patches, Directional TV is calculated by the mean value of the available gradients scaled adequately.
A device for processing of an image comprises means for pre-processing the image for determining at least one predominant direction of at least one of edges and texture, means for determining a total variation of the image using a weighted sum of variations along the at least one predominant direction and along a direction orthogonal to the at least one predominant direction, and means for processing the image using regularization of the total variation.
The features of further advantageous embodiments are specified in the dependent claims.
Exemplary embodiments of the invention are illustrated in the drawings and are explained in more detail in the following description. The exemplary embodiments are explained only for elucidating the invention, but not for limiting the invention's disclosure or scope defined in the claims.
The invention may be realized on any electronic device comprising a processing device that may be correspondingly adapted. For instance, the invention may be realized in a television set, a mobile phone, a personal computer, a digital still camera, a digital video camera, a navigation system or a car video system.
In an exemplary realization, the invention comprises direction determination and regularization of total variation with respect to the determined direction. In an embodiment, the direction is determined among eight predetermined direction candidates similar to those used for intra prediction in H.264 standard. The eight predetermined gradient directions are exemplarily depicted in
The eight predetermined directions of the exemplary realization are defined as follows:
∇aI=I(x,y)−I(x−1,y) (5a)
∇bI=I(x,y)−I(x−2,y−1) (5b)
∇cI=I(x,y)−I(x−1,y−1) (5c)
∇dI=I(x,y)−I(x−1,y−2) (5d)
∇eI=I(x,y)−I(x,y−1) (5e)
∇fI=I(x,y)−I(x+1,y−2) (5f)
∇gI=I(x,y)−I(x+1,y−1) (5g)
∇hI=I(x,y)−I(x+2,y−1) (5h)
Or, when taking distances of pixels into consideration:
∇aI=(I(x,y)−I(x−1,y)) (6a)
∇bI=(I(x,y)−I(x−2,y−1))/√5 (6b)
∇cI=(I(x,y)−I(x−1,y−1))/√2 (6c)
∇dI=(I(x,y)−I(x−1,y−2))/√5 (6d)
∇eI=(I(x,y)−I(x,y−1)) (6e)
∇fI=(I(x,y)−I(x+1,y−2))/√5 (6f)
∇gI=(I(x,y)−I(x+1,y−1))/√2 (6g)
∇hI=(I(x,y)−I(x+2,y−1))/√5 (6h)
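The eight directional gradients of Eqs. (5a)-(5h), and their distance-normalized variants of Eqs. (6a)-(6h), may be sketched as follows. The names OFFSETS and directional_gradient are hypothetical, and setting the gradient to zero where the neighbouring pixel falls outside the image is an assumption of this sketch.

```python
import numpy as np

# Neighbour offsets (dx, dy) taken from Eqs. (5a)-(5h).
OFFSETS = {"a": (-1, 0), "b": (-2, -1), "c": (-1, -1), "d": (-1, -2),
           "e": (0, -1), "f": (1, -2), "g": (1, -1), "h": (2, -1)}

def directional_gradient(img, direction, normalize=False):
    """I(x, y) - I(x + dx, y + dy), optionally divided by the pixel distance."""
    img = np.asarray(img, dtype=float)
    dx, dy = OFFSETS[direction]
    h, w = img.shape
    grad = np.zeros_like(img)   # assumption: zero where the neighbour is missing
    # Region of pixels whose neighbour lies inside the image (arrays are [y, x]).
    x0, x1 = max(0, -dx), min(w, w - dx)
    y0, y1 = max(0, -dy), min(h, h - dy)
    grad[y0:y1, x0:x1] = (img[y0:y1, x0:x1]
                          - img[y0 + dy:y1 + dy, x0 + dx:x1 + dx])
    if normalize:
        grad /= np.hypot(dx, dy)   # 1, sqrt(2) or sqrt(5), as in Eqs. (6a)-(6h)
    return grad
```

For example, directional_gradient(img, "c", normalize=True) corresponds to Eq. (6c).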
The energy function Ex=Σi,j|∇xI(i,j)| with x∈{a,b,c,d,e,f,g,h} can be used for direction determination, e.g. by selecting x such that Ex is maximized.
In an exemplary embodiment, the following edge significant indicators are used:
Ra=Ea/Ee=1/Re (7a)
Rb=Eb/Ef=1/Rf (7b)
Rc=Ec/Eg=1/Rg (7c)
Rd=Ed/Eh=1/Rh (7d)
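As a minimal sketch, the energy function and the indicators of Eqs. (7a)-(7d) may be computed as follows. The names ORTHOGONAL, energy and edge_ratios are hypothetical, and the small constant eps, which guards against division by zero in perfectly flat blocks, is an assumption that is not part of the equations.

```python
import numpy as np

# Orthogonal partner of each direction, as paired in Eqs. (7a)-(7d).
ORTHOGONAL = {"a": "e", "b": "f", "c": "g", "d": "h",
              "e": "a", "f": "b", "g": "c", "h": "d"}

def energy(gradient):
    """E_x: sum of absolute gradient values over the block."""
    return float(np.sum(np.abs(gradient)))

def edge_ratios(energies, eps=1e-12):
    """R_x = E_x / E_orth(x) for all eight directions."""
    # Assumption: eps avoids a zero denominator; it is not in Eqs. (7a)-(7d).
    return {d: energies[d] / (energies[ORTHOGONAL[d]] + eps)
            for d in ORTHOGONAL}
```

Here energies is a mapping from each direction label to its energy, for instance built by applying energy to each of the eight gradient images of a block.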
Then, the direction is determined as follows:
(a) Pre-processing the image in units of n×n blocks for obtaining all candidate directional gradients, where n is the block size; and
(b) Calculating Rx for each directional gradient and selecting the direction with the largest Rx in case |Rx−Ry|≥thr for all y≠x, where thr is a predefined threshold.
In case multiple edge significant indicator candidates Rx, Ry are similar, i.e. |Rx−Ry|<thr, and there is a direction y which is more similar to the direction of the upper and/or left block, y is selected instead of x even if x is the direction with the largest Rx.
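The selection rule described above may be sketched as follows. The text does not define how similarity between two directions is measured; this sketch assumes the distance between their positions in the cyclic order a, b, ..., h, which is the angular order of the directions of Eqs. (5a)-(5h). The names select_direction, circular_distance and neighbour_dirs are hypothetical.

```python
DIRECTIONS = "abcdefgh"   # angular order of the directions in Eqs. (5a)-(5h)

def circular_distance(d1, d2):
    """Assumed similarity measure: steps between two directions on the cycle."""
    k = abs(DIRECTIONS.index(d1) - DIRECTIONS.index(d2))
    return min(k, len(DIRECTIONS) - k)

def select_direction(ratios, thr, neighbour_dirs=()):
    """Pick the predominant direction from the indicators R_x."""
    best = max(ratios, key=ratios.get)
    # Candidates whose indicator is within thr of the largest one.
    similar = [d for d in ratios
               if d == best or ratios[best] - ratios[d] < thr]
    if len(similar) == 1 or not neighbour_dirs:
        return best
    # Among similar candidates, prefer the one closest to the direction of
    # the upper and/or left block; ties go to the larger indicator.
    similar.sort(key=lambda d: -ratios[d])
    return min(similar, key=lambda d: min(circular_distance(d, nb)
                                          for nb in neighbour_dirs))
```

With a clear winner the neighbouring blocks are ignored; they only break ties between candidates of similar strength, which is the typical situation in flat regions.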
Then, total variation of the determined direction and the direction orthogonal thereto is determined as
TVDIR(I)=Σi,jsqrt(αi,j∇e
Where ∇e
In a first exemplary embodiment, TVDIR(I) is used in denoising by finding f which minimizes
TVDIR(f)+λ*(∥f−n∥₂)²/2 (9)
where n is the input noisy image. The edge directions are determined as described above.
In an exemplary embodiment targeting uniform-intensity denoising, the weights in TVDIR(f) can be normalized by Ci,j=sqrt(αi,j+βi,j):
TVDIR(I)=Σi,jsqrt(αi,j∇e
Then, the denoising intensity is merely dependent on the weighting parameter λ.
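A weighted directional TV with the normalization Ci,j=sqrt(αi,j+βi,j) may be sketched as follows. The formula is not fully legible in this text, so the sketch assumes the form Σi,j sqrt(αi,j·g1(i,j)²+βi,j·g2(i,j)²)/Ci,j, where g1 is the gradient along the determined direction and g2 the gradient along its orthogonal direction. The function name, the argument names and this assumed form are illustrative only.

```python
import numpy as np

def directional_tv(g_edge, g_orth, alpha, beta, normalize=True):
    """Assumed form: sum of sqrt(alpha*g_edge^2 + beta*g_orth^2) / C."""
    g_edge = np.asarray(g_edge, dtype=float)
    g_orth = np.asarray(g_orth, dtype=float)
    # Weights may be scalars (one per block) or per-pixel arrays.
    alpha = np.broadcast_to(np.asarray(alpha, dtype=float), g_edge.shape)
    beta = np.broadcast_to(np.asarray(beta, dtype=float), g_edge.shape)
    terms = np.sqrt(alpha * g_edge**2 + beta * g_orth**2)
    if normalize:
        # C_ij = sqrt(alpha_ij + beta_ij); assumes alpha_ij + beta_ij > 0.
        terms = terms / np.sqrt(alpha + beta)
    return float(np.sum(terms))
```

Under this assumed form, scaling both weights by a common factor leaves the normalized value unchanged, which is consistent with the denoising intensity depending only on λ.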
TV regularization based reconstruction makes it possible for a video codec to recover pictures from incomplete DCT coefficients. TV regularization is performed in units of blocks instead of on the whole frame. The reconstruction is based on Eq. (11), where bp is the prediction of block b and Φ is the DCT and quantization process.
min_b TV(b)+λ*(∥y−Φ(b−bp)∥₂)²/2 (11)
In an exemplary embodiment where directions and weights are consistent per block, TVDIR(f) can be simplified using C=sqrt(α+β) by:
TVDIR(I)=Σi,jsqrt(αi,j∇e
First, the available but incomplete DCT coefficients are used to construct an initial block reconstruction, binit=Φ⁻¹y+bp, to determine the edge direction, where Φ⁻¹ is the de-quantization and inverse DCT process.
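The initial reconstruction binit=Φ⁻¹y+bp may be sketched as follows. The sketch assumes square blocks, an orthonormal two-dimensional DCT-II and uniform scalar quantization with a single step size qstep; an actual codec such as H.264 uses its own integer transform and quantizer. All function names and the boolean mask known, which marks the available coefficients, are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix D, so that D @ D.T is the identity."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)
    return d

def forward(residual, qstep):
    """Phi: 2-D DCT followed by (assumed) uniform quantization."""
    d = dct_matrix(residual.shape[0])
    return np.round(d @ residual @ d.T / qstep)

def inverse(levels, qstep):
    """Phi^-1: de-quantization followed by the inverse 2-D DCT."""
    d = dct_matrix(levels.shape[0])
    return d.T @ (levels * qstep) @ d

def initial_reconstruction(levels, prediction, qstep, known=None):
    """b_init = Phi^-1 y + b_p, with unavailable coefficients set to zero."""
    if known is not None:
        levels = np.where(known, levels, 0.0)
    return inverse(levels, qstep) + prediction
```

If only the DC coefficient is available, the reconstructed residual is constant over the block, so binit equals the prediction shifted by a constant.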
Since the pixels on the bottom and right are unavailable for the current block, the gradients of outer pixels may not be computed. For the example of a block of size four in
Therefore, a parameter cijk is defined to denote the availability of the gradient of the pixel (i,j) along the direction k, with cijk=1 if ∇kI(i,j) is available, cijk=0 if ∇kI(i,j) is unavailable.
In an exemplary embodiment where availability of pixels required for gradient calculation is considered, the Directional TV is calculated by the mean value of the available gradients scaled by the total number of pixels, i.e. TVDIR(f) can be simplified using Ck=n²*sqrt(α+β)/Σi,j cijk by:
TVDIR(I)=Σi,jsqrt(αi,j∇e
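The availability parameter cijk and the scaling factor Ck=n²*sqrt(α+β)/Σi,j cijk may be sketched as follows. The sketch assumes that neighbours to the left of and above the current block come from previously reconstructed blocks and are available, whereas any neighbour to the right of or below the block is unavailable; the text does not spell out this rule in full, and the function names are hypothetical.

```python
import numpy as np

def availability_mask(n, dx, dy):
    """c[y, x] = 1 where the neighbour (x + dx, y + dy) is assumed available."""
    ys, xs = np.mgrid[0:n, 0:n]
    # Assumption: only pixels right of or below the n x n block are missing.
    return ((xs + dx < n) & (ys + dy < n)).astype(int)

def scale_factor(mask, alpha, beta):
    """C_k = n^2 * sqrt(alpha + beta) / (number of available gradients)."""
    available = int(mask.sum())
    if available == 0:
        raise ValueError("no gradient available along this direction")
    n = mask.shape[0]
    return n**2 * np.sqrt(alpha + beta) / available
```

For a block of size four and the direction of Eq. (5h), whose neighbour offset is (+2, −1), half of the sixteen gradients are available under this assumption, so Ck is twice sqrt(α+β).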
Embodiments of the proposed invention can be used for image processing applications like denoising or deblurring.
In one embodiment, the pre-processing step 31 comprises dividing 34 the image into patches and determining a predominant direction for each patch.
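Dividing the image into patches may be sketched as follows. The function name split_into_patches and the handling of image borders (patches at the right and bottom borders may be smaller than n×n) are assumptions of this sketch.

```python
import numpy as np

def split_into_patches(img, n):
    """Map (patch_row, patch_col) to the corresponding n x n view of img."""
    img = np.asarray(img)
    h, w = img.shape
    patches = {}
    for y in range(0, h, n):
        for x in range(0, w, n):
            # Assumption: border patches are simply truncated.
            patches[(y // n, x // n)] = img[y:y + n, x:x + n]
    return patches
```

Keying the patches by row and column index makes it straightforward to look up the upper and left neighbours of a patch when a predominant direction is selected.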
In one embodiment, the pre-processing step 31 comprises calculating 35a gradients for each patch, as in eq. (5a-h) or eq. (6a-h), calculating 35b an energy function Ex, x=a, ..., h, for each gradient, calculating 35c a ratio Rx, x=a, ..., h, between energy functions of orthogonal gradients, as in eq. (7a-d), and determining the predominant direction according to the energy function ratio.
In one embodiment, the predominant direction is selected as the one that has a maximum energy function ratio.
In one embodiment, the predominant direction is only selected if its energy function ratio exceeds, by at least a predetermined positive threshold, each energy function ratio of gradients in different directions.
In one embodiment, selecting the predominant direction comprises detecting 36a that a current patch is in a flat region, detecting 36b that a predominant direction has already been determined for an upper and/or left neighbour patch of the current patch, and selecting 36c among available possible predominant directions the most similar direction for the current patch. This is particularly advantageous if e.g. multiple similar candidate directions of similar strength are available for a current patch.
Flat regions (i.e. regions with very low or no predominance of a direction) can be detected in various ways. In one embodiment, a flat region is detected by determining that the energy sums of gradients along multiple directions are equal, or at least substantially equal.
In one embodiment, for determining 32 a total variation of the image, the variation along the at least one predominant direction is given higher weight than the variation along a direction orthogonal to the at least one predominant direction.
In one embodiment, the method further comprises (e.g. in the total variation determining step 32, as exemplarily shown in
In one embodiment, the Directional Total Variation is calculated by scaling and averaging available gradients for patches if not all of the required gradients are available. This comprises a step of determining that not all of the required gradients are available.
In one embodiment, the pre-processing unit 41 comprises dividing means 44 for dividing the image into patches. Then, a predominant direction is determined for each patch.
In one embodiment, the pre-processing unit 41 comprises a calculating and determination unit 45 for calculating gradients for each patch, as in eq. (5a-h) or eq. (6a-h), calculating an energy function Ex, x=a, ..., h, for each gradient, calculating a ratio Rx, x=a, ..., h, between energy functions of orthogonal gradients, as in eq. (7a-d), and determining the predominant direction as the one that has a maximum energy function ratio.
In one embodiment, for selecting the predominant direction, the pre-processing unit 41 comprises a detection and selection unit 46 for detecting that a current patch is in a flat region, detecting that a predominant direction has already been determined for an upper and/or left neighbour patch of the current patch, and selecting among available possible predominant directions the most similar direction for the current patch.
In one embodiment, the TV determining unit 42 comprises calculation means 47 for calculating the Directional TV based on the gradients along the determined at least one predominant direction and its orthogonal direction.
In one embodiment, the TV determining unit 42 comprises a calculation unit 47 for calculating adaptive weights α,β for the weighted sum of variations along the at least one predominant direction and along a direction orthogonal to the at least one predominant direction.
The invention can advantageously also be used as a fundamental component of an image/video compression scheme, like a compressive sensing based compression approach.
Number | Date | Country | Kind |
---|---|---|---|
PCT/CN2011/076283 | Jun 2011 | CN | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/CN2012/077138 | 6/19/2012 | WO | 00 | 12/22/2013 |