The disclosure generally relates to digital image processing.
Modern digital photography equipment and related software editing applications have simplified capturing, viewing, storing and editing digital images. For example, a user may capture a blurred image with a digital camera and may desire to sharpen the blurred digital image. Some digital image editing applications allow a user to manually sharpen blurred images by manually adjusting blur radius and noise values for a blurred image. Such manual adjustment can be tedious and time consuming. Moreover, the process of manually sharpening digital images may introduce artifacts, such as edge ringing, into the sharpened image.
Automatic image sharpening techniques are disclosed that automatically bring a blurred image into focus. Techniques for reducing edge ringing in sharpened images are also disclosed. According to implementations, a computer-implemented method includes determining a normalized entropy of a first image, calculating a correlation target based on the normalized entropy, automatically determining a blur radius of a de-convolution kernel that causes a cosine of a first radial power spectrum of the kernel and a second radial power spectrum of a reconstruction of the first image to approximate the correlation target, and generating a second image based on the blur radius.
According to an implementation, automatically determining a blur radius includes automatically adjusting a blur radius of the de-convolution kernel. The method may include extending one or more dimensions of an original image to define a space around the original image and reflecting edge portions of the original image into the space to generate the first image. The original image may be extended and padded such that the asymptotic exterior value of the first image in each color channel is the mean value of each color channel of the original image.
According to implementations, the computer-implemented method may also include calculating the correlation target (C) according to the following equation: $C = a + (b + c\,\operatorname{sgn}(E_{n,o}))\,E_{n,o}$, where a, b, and c are coefficients and $E_{n,o}$ is the normalized entropy of the first image. The normalized entropy may be restricted to an interval by hard clipping.
Particular implementations provide one or more of the following advantages: 1) an image can be automatically sharpened without the use of any human input, and 2) edge ringing in the sharpened image is minimized. Automatic image sharpening can be used in a variety of applications including but not limited to monocular 3D sensing and batch-focusing applications.
A computer program product and a system for automatic image sharpening are also disclosed. Details of one or more implementations are set forth in the accompanying drawings and the description below. Other features, aspects, and potential advantages will be apparent from the description and drawings, and from the claims.
Like reference symbols in the various drawings indicate like elements.
Automatic image sharpening techniques are disclosed that automatically bring a blurred image into focus. Techniques for reducing edge ringing in sharpened images are also disclosed.
Wiener de-convolution is a process whereby an approximate de-convolution kernel is created in spectral space. Denote an image intensity by $p(\vec{r}) = p(x, y)$, and denote its 2-dimensional Fourier transform by:
$$P(\vec{\omega}) = \int p(\vec{r})\, e^{-i\vec{\omega}\cdot\vec{r}}\, d^2\vec{r}.$$ [1]
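Where the integration range is not specified, the transformation is a discrete fast Fourier result, namely:
$$P(\vec{\omega}) = \sum_{x=0}^{W-1} \sum_{y=0}^{H-1} p(x, y)\, e^{-2\pi i\,(\omega_x x / W + \omega_y y / H)},$$ [2]
for image dimensions W×H (width×height). A blurred image is taken to be a convolution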
$$q(\vec{r}) = p \times b := \int p(\vec{r} - \vec{s})\, b(\vec{s})\, d^2\vec{s},$$ [3]
where b is a blur kernel and p denotes an original (presumed sharp) image. The blurred image q is modeled as a finite, discrete convolution, rather than an integral. The classical convolution theorem indicates that the blurred image q has Fourier transform given by
$$Q(\vec{\omega}) = P(\vec{\omega}) \cdot B(\vec{\omega}),$$ [4]
where B is the Fourier transform of the blur kernel. To de-convolve (sharpen) the blurred image, an inversion relation such as
$$P(\vec{\omega}) = \frac{Q(\vec{\omega})}{B(\vec{\omega})}$$ [5]
can be inferred to transform q back to the original image p. Equivalently, an “inverse kernel” $e(\vec{r})$ may exist such that
$$e \times b \sim \delta,$$ [6]
where δ is the unit 2-dimensional impulse. In spectral terms, this is
$$E(\vec{\omega})\, B(\vec{\omega}) \sim 1.$$ [7]
The essential de-convolution operation attempts recovery of the original image p from the blurred image q via the convolution $p \sim q \times e$. The rest of the present treatment focuses upon the problem of calculating a suitable, approximate inverse kernel e.
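The convolution theorem that underlies this treatment can be checked numerically. The following is a minimal sketch on a 1-dimensional toy signal, assuming circular (wrap-around) convolution and an unnormalized discrete Fourier transform; the signal `p`, the kernel `b`, and the helper names are hypothetical and chosen only for illustration.

```python
import cmath

def dft(x):
    """Unnormalized discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * f * k / n) for k in range(n))
            for f in range(n)]

def circular_convolve(p, b):
    """Finite, discrete (circular) convolution q[k] = sum_s p[k - s] * b[s]."""
    n = len(p)
    return [sum(p[(k - s) % n] * b[s] for s in range(n)) for k in range(n)]

# Toy "sharp" signal and a small symmetric blur kernel that sums to 1.
p = [0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 2.0, 0.0]
b = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]
q = circular_convolve(p, b)

P, B, Q = dft(p), dft(b), dft(q)

# Largest deviation between Q and the product P * B over all frequencies.
max_error = max(abs(Q[f] - P[f] * B[f]) for f in range(len(p)))

# Smallest blur gain |B|; a value near zero makes the plain reciprocal 1/B unusable.
smallest_gain = min(abs(value) for value in B)
```

For this particular kernel the transform B vanishes at the highest frequency, which is the kind of small |B| value that makes a plain reciprocal kernel unstable.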
If an inverse kernel's Fast Fourier Transform (FFT) E(w) is well approximated, it may be used to approximate the original image. The problem is that taking just the reciprocal of the blur kernel's FFT, namely setting E:=1/B, induces singularities, or at least specious behavior (excessive ringing, for example) due to possibly small values of |B|. There are at least three variants of the Wiener prescription, by which variants the theoretical inverse-kernel transform E is replaced by a non-pathological approximation E′. The models are defined as follows:
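(1) The constant noise model:
$$E' := \frac{B^*}{|B|^2 + \eta},$$ [8]
where η is a constant preventing zeros in the denominator.
(2) The floor model:
$$E' := \frac{B^*}{\max(|B|^2,\ \eta)},$$ [9]
where η prevents denominator zeros.
(3) The full noise model:
$$E' := \frac{B^*}{|B|^2 + N/S},$$ [10]
where N/S is a frequency-dependent noise-to-signal ratio.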
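The three variants can be sketched per frequency sample as follows. This is a hedged illustration that assumes the standard Wiener forms conj(B)/(|B|² + η), conj(B)/max(|B|², η), and conj(B)/(|B|² + N/S); the function names and the sample values are hypothetical.

```python
def inverse_constant_noise(B, eta):
    """Constant noise model (assumed form): conj(B) / (|B|^2 + eta)."""
    return B.conjugate() / (abs(B) ** 2 + eta)

def inverse_floor(B, eta):
    """Floor model (assumed form): conj(B) / max(|B|^2, eta)."""
    return B.conjugate() / max(abs(B) ** 2, eta)

def inverse_full_noise(B, noise_to_signal):
    """Full noise model (assumed form): conj(B) / (|B|^2 + N/S) at this frequency."""
    return B.conjugate() / (abs(B) ** 2 + noise_to_signal)

# One well-conditioned and one nearly-zero sample of the blur transform B.
strong = complex(0.9, 0.0)
weak = complex(1e-6, 0.0)
eta = 0.01

naive_weak = 1.0 / weak                      # plain reciprocal: gain of about 1e6
floor_weak = inverse_floor(weak, eta)        # bounded gain of about 1e-4
floor_strong = inverse_floor(strong, eta)    # equals 1/B where |B|^2 exceeds eta
```

With the floor model, samples where |B|² exceeds η are inverted exactly, while samples below the floor are attenuated instead of amplified.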
An automatic means of estimating N/S for variant (3) is now described. Variant (2), which provides a certain stability that accrues with the floor method, is described further below.
Given an approximate Wiener kernel E′, with e′ denoting its spatial-domain counterpart, a reconstructed image
$$p' := q \times e'$$ [11]
presumably has the desirable property of being, for proper Wiener parameters, close to the sharp original, namely $p' \sim p$ in some visually appropriate sense. Based on this presumption, blurred image de-convolution is described below.
According to implementations, automatic image sharpening is performed on a rectangular selection (“original image”) that has been padded according to the extension-and-padding rules disclosed above. Automatic image sharpening performed according to implementations disclosed herein produces an explicit r, η (blur-radius, noise) pair used to focus the original image.
At step 304, the extended and padded blurred image is analyzed to automatically estimate the amount of noise in the image. For example, automatic noise adjustment and estimation may start with a radial-FFT histogram. For the two indices (u, v) of the standard, 2-dimensional FFT on the extended and padded image, we zero histogram-bin values h(1) . . . h(L), where $L := \lceil N/\sqrt{2} \rceil$ for FFT side N (square, N×N FFTs are used). Then, if p(u, v) denotes the local spectral power, namely $p(u, v) := |X_{u,v}|^2$, an L×L square is passed over the image, and values are added into histogram bins according to
$$h\!\left(\left\lceil \sqrt{u^2 + v^2} + \tfrac{1}{2} \right\rceil\right) \mathrel{+}= p(u, v),$$ [12]
whenever the argument of h here is in [1, L]. This way of forming a radial histogram weights the higher frequencies because of their larger arcs in the FFT square. For white noise that is transformed by an FFT to give a statistically uniform FFT power-square, the histogram element h(w) should rise linearly in integer w. For this reason, the radial power spectrum is denoted by
$$R(w) := \frac{h(w)}{w}.$$ [13]
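The radial histogram accumulation can be sketched as below. This is an illustrative sketch only: it assumes the FFT indices are folded to signed frequencies (so the largest radius is about N/√2), and it assumes the radial power spectrum is the histogram divided by the radius, which flattens the linear rise expected for white noise. The function names are hypothetical.

```python
import math

def radial_histogram(power):
    """Accumulate an N x N grid of spectral power values into radial bins 1..L."""
    n = len(power)
    num_bins = math.ceil(n / math.sqrt(2))
    h = [0.0] * (num_bins + 1)          # h[0] is unused; bins are indexed 1..L
    for u in range(n):
        fu = u if u < n / 2 else u - n  # assumption: fold index to signed frequency
        for v in range(n):
            fv = v if v < n / 2 else v - n
            k = math.ceil(math.sqrt(fu * fu + fv * fv) + 0.5)
            if 1 <= k <= num_bins:      # values outside [1, L] are skipped
                h[k] += power[u][v]
    return h

def radial_power_spectrum(h):
    """Assumed normalization R(w) = h(w) / w for w = 1..L."""
    return [h[w] / w for w in range(1, len(h))]

# A statistically flat ("white") power grid: every sample has unit power.
flat = [[1.0] * 8 for _ in range(8)]
h_flat = radial_histogram(flat)
r_flat = radial_power_spectrum(h_flat)
```

For the flat 8×8 grid, the low bins grow with radius because larger rings cover more FFT samples, which is the weighting effect described in the text.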
Assuming that the original 2-dimensional signal has Laplacian autocorrelation, the radial power spectrum can be inferred to be approximately
$$R_{th}(w) \sim \int e^{-i\vec{\omega}\cdot\vec{r}}\, e^{-Kr}\, d^2\vec{r},$$ [14]
where $\vec{\omega}$ is any fixed vector of length w, and K is the Laplacian decay constant. This 2-dimensional integral admits of exact evaluation, and the radial-FFT histogram approximation may be expressed as
for constants A, B. The ensuing algorithmic development involves stable estimation of the A, B parameters.
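As a hedged worked step, the 2-dimensional transform of the radial decay $e^{-Kr}$ can be evaluated in polar coordinates using the standard Bessel-function identity for the angular integral. The parametrization in the last line, with constants A and B, is an assumption made for illustration and may differ from the form used in a given implementation.

```latex
\begin{aligned}
\int e^{-i\vec{\omega}\cdot\vec{r}}\, e^{-Kr}\, d^2\vec{r}
  &= \int_0^\infty r\, e^{-Kr} \left( \int_0^{2\pi} e^{-i w r \cos\theta}\, d\theta \right) dr \\
  &= 2\pi \int_0^\infty r\, e^{-Kr}\, J_0(w r)\, dr \\
  &= \frac{2\pi K}{\left(K^2 + w^2\right)^{3/2}}, \\[4pt]
\text{so that, as one possible parametrization,}\quad
R_{th}(w) &\approx \frac{A}{\left(B + w^2\right)^{3/2}},
  \qquad A \propto 2\pi K,\quad B = K^2 .
\end{aligned}
```

Under this form the spectrum is nearly flat for $w \ll K$ and falls off as $w^{-3}$ for $w \gg K$.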
According to implementations, the noise/signal Wiener kernel may be approximated by
and S/N is the signal-to-noise ratio (SNR). To effect an auto-noise algorithm in this way, the following nomenclature is established. Assign
The noise-function and signal-function forms may be approximated by
respectively, and the above parameter assignments may be used to estimate A, B. The Wiener kernel, thought of as depending on a low-frequency “signal” and a “noise” tail for the guess $R_{th}$, works out to be
in this kernel formula. The constant η is a tuning constant.
At step 306, the normalized entropy of the image is calculated, according to implementations. For example, normalized entropy, denoted $E_{n,o}$, may be calculated by first applying a predictor/corrector (P/C) to the luminance (Y) channel, in the form a+d−b−c, for the 4-pixel arrangement:
a b
c d
Then, the entropy of the resulting P/C data is calculated. The entropy calculation is done with 128 bins from [−62, 63], which are used to generate a histogram, with overflow and underflow clamped to maximum and minimum values. Once the bins are filled, the “standard” entropy $E_s$ is calculated as the usual weighted-logarithmic sum over bins. Then, the normalized entropy $E_{n,o}$ may be calculated by
$$E_{n,o} = E_s - \log_2(M/16),$$ [29]
where M denotes the mean of the Y-channel. This adjustment is intended to nullify the effects of overall image brightness, based on the idea that a dim version of an image should have the same entropy as a brighter version of the image.
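The entropy steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the luminance channel is a 2-dimensional list of integers, the four pixels a, b, c, d form a 2×2 block with a and d on one diagonal, and the 128 integer bins are taken to span [−64, 63] (the stated range [−62, 63] covers only 126 integers, so the bin limits are exposed as parameters). The function name is hypothetical.

```python
import math

def normalized_entropy(luma, lo=-64, hi=63):
    """Entropy of the a + d - b - c residuals of a luminance grid, minus log2(mean / 16)."""
    rows, cols = len(luma), len(luma[0])
    counts = [0] * (hi - lo + 1)
    for y in range(rows - 1):
        for x in range(cols - 1):
            a, b = luma[y][x], luma[y][x + 1]
            c, d = luma[y + 1][x], luma[y + 1][x + 1]
            residual = a + d - b - c
            residual = max(lo, min(hi, residual))   # clamp overflow and underflow
            counts[residual - lo] += 1
    total = sum(counts)
    # "Standard" entropy: weighted-logarithmic sum over the non-empty bins.
    standard = -sum((n / total) * math.log2(n / total) for n in counts if n)
    mean = sum(map(sum, luma)) / (rows * cols)
    if mean <= 0:
        raise ValueError("mean luminance must be positive")
    return standard - math.log2(mean / 16)

flat_16 = [[16] * 4 for _ in range(4)]
flat_32 = [[32] * 4 for _ in range(4)]
entropy_16 = normalized_entropy(flat_16)
entropy_32 = normalized_entropy(flat_32)
```

A featureless image has zero residual entropy, so in this sketch only the brightness term remains: a mean of 16 gives 0 and a mean of 32 gives −1.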
To calculate high-dimensional dot products in spectral space, consider the radial power spectrum R(w) depending on a scalar parameter $w = |\vec{\omega}|$, with $\vec{\omega}$ deemed 2-dimensional. For any of the computed 2-dimensional FFTs, one may employ bin averaging to estimate such an R. Denote by $R_w$ the radial power spectrum for the Wiener kernel $E'(\vec{\omega})$, and by $R_r$ the power spectrum for the reconstruction of a source image; this can be the power spectrum of $Q(\vec{\omega})\,E'(\vec{\omega})$. The cosine between the “signals” $R_w$ and $R_r$ is defined as
$$C_{w,r} := \frac{R_w \cdot R_r}{|R_w|\,|R_r|}.$$
The dot-product is obtained by summing over all radii, such that
$$|R| := \sqrt{R \cdot R},$$ [32]
so that $C_{w,r}$ is a number that is the cosine of the high-dimensional angle between two radial power spectra.
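The cosine between two radial power spectra can be sketched as a normalized dot product over radius bins. The function name and the sample spectra below are hypothetical, and the sketch assumes both spectra are sampled on the same radii and are not identically zero.

```python
import math

def spectral_cosine(r_w, r_r):
    """Cosine of the angle between two radial power spectra sampled on the same radii."""
    dot = sum(x * y for x, y in zip(r_w, r_r))
    norm_w = math.sqrt(sum(x * x for x in r_w))
    norm_r = math.sqrt(sum(y * y for y in r_r))
    return dot / (norm_w * norm_r)

# Spectra that differ only by a scale factor are perfectly aligned.
aligned = spectral_cosine([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])

# Spectra with no overlapping energy are orthogonal.
orthogonal = spectral_cosine([1.0, 0.0], [0.0, 1.0])
```

Because the measure is scale-invariant, overall gain differences between the kernel spectrum and the reconstruction spectrum do not affect the result.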
At step 308, a theoretical correlation target is calculated, according to implementations. For example, starting with a default pair (r,η), for example (2.0, 0.45), and applying the floor noise model, as described above, a theoretical correlation target C (i.e., the vertical coordinate of
$$C = a + (b + c\,\operatorname{sgn}(E_{n,o}))\,E_{n,o},$$ [33]
where a, b and c are coefficients determined based on empirical data and where $E_{n,o}$ has been restricted to the interval [−2, 4] by hard clipping. For example, image analysis may indicate that a, b and c should vary based on the type of images being processed. According to at least one implementation, a=0.032, b=0.020 and c=0.004. Other values for these coefficients may be used to automatically sharpen images. According to implementations, the target C is a piecewise straight line, as illustrated by
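The correlation target can be sketched directly from the formula and the example coefficients given in the text. The function name is hypothetical; the clipping interval [−2, 4] and the default coefficients are the ones stated above.

```python
def correlation_target(entropy, a=0.032, b=0.020, c=0.004):
    """Piecewise-linear target C = a + (b + c * sgn(E)) * E with E hard-clipped to [-2, 4]."""
    e = max(-2.0, min(4.0, entropy))    # hard clipping of the normalized entropy
    sign = (e > 0) - (e < 0)            # sgn(E): -1, 0, or +1
    return a + (b + c * sign) * e

target_low = correlation_target(-2.0)    # slope b - c on the negative side
target_zero = correlation_target(0.0)    # the intercept a
target_high = correlation_target(4.0)    # slope b + c on the positive side
target_clipped = correlation_target(10.0)
```

With these coefficients the slope is 0.016 for negative entropy and 0.024 for positive entropy, so the target runs from about 0 at E = −2 to about 0.128 at E = 4, with a kink at E = 0.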
At step 310, the blur radius is automatically adjusted to produce a sharpened image, according to implementations. For example, system 300 may automatically search for a Wiener de-convolution kernel blur radius $r \in [0, \infty)$ that forces $C_{w,r} \approx C$. According to implementations, $C_{w,r}$ is the (raw) cosine between the radial power spectrum of the Wiener kernel and the radial power spectrum of the reconstruction of a source image, as defined above. Upon finding the optimal blur radius, the blur radius and noise values (r, η) may be used to automatically sharpen the original blurred image.
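The text does not specify how the search over the blur radius is carried out. One simple possibility, shown here purely as an assumption, is bisection over a bracketing interval, which works when the cosine varies monotonically with the radius on that interval. The callable `cosine_at_radius`, the bracket limits, and the toy response function are all hypothetical stand-ins for the real kernel-and-reconstruction computation.

```python
def find_blur_radius(cosine_at_radius, target, r_lo=0.0, r_hi=20.0,
                     tol=1e-3, max_iter=60):
    """Bisection search for a radius r with cosine_at_radius(r) close to target."""
    f_lo = cosine_at_radius(r_lo) - target
    f_hi = cosine_at_radius(r_hi) - target
    if f_lo * f_hi > 0:
        raise ValueError("target is not bracketed on [r_lo, r_hi]")
    for _ in range(max_iter):
        r_mid = 0.5 * (r_lo + r_hi)
        f_mid = cosine_at_radius(r_mid) - target
        if f_mid == 0 or (r_hi - r_lo) < tol:
            return r_mid
        if f_lo * f_mid < 0:
            r_hi = r_mid                 # root lies in the lower half
        else:
            r_lo, f_lo = r_mid, f_mid    # root lies in the upper half
    return 0.5 * (r_lo + r_hi)

# Toy monotone response standing in for the measured cosine (hypothetical).
def toy_cosine(r):
    return 1.0 / (1.0 + r)

radius = find_blur_radius(toy_cosine, target=0.25)
```

In a real pipeline each evaluation of the callable would rebuild the Wiener kernel for the trial radius and recompute both radial power spectra, so a search that needs few evaluations is preferable.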
At step 404, a normalized entropy is calculated, according to implementations. For example, the normalized entropy of the image may be calculated in the manner disclosed in step 306 of
At step 406, a theoretical correlation target is calculated, according to implementations. For example, the theoretical correlation target may be calculated in the manner disclosed in step 308 of
At step 408, the blur radius is automatically adjusted according to implementations. For example, once the theoretical correlation target is calculated, an optimal blur radius may be determined, as detailed above with respect to step 310 of
Display device 506 can be any known display technology, including but not limited to display devices using Liquid Crystal Display (LCD) or Light Emitting Diode (LED) technology. Processor(s) 502 can use any known processor technology, including but not limited to graphics processors and multi-core processors. Input device 504 can be any known input device technology, including but not limited to a keyboard (including a virtual keyboard), mouse, track ball, and touch-sensitive pad or display. Bus 512 can be any known internal or external bus technology, including but not limited to ISA, EISA, PCI, PCI Express, NuBus, USB, Serial ATA or FireWire. Computer-readable medium 510 can be any medium that participates in providing instructions to processor(s) 502 for execution, including without limitation, non-volatile storage media (e.g., optical disks, magnetic disks, flash drives, etc.) or volatile media (e.g., SDRAM, ROM, etc.).
Computer-readable medium 510 can include various instructions 514 for implementing an operating system (e.g., Mac OS®, Windows®, Linux). The operating system can be multi-user, multiprocessing, multitasking, multithreading, real-time and the like. The operating system performs basic tasks, including but not limited to: recognizing input from input device 504; sending output to display device 506; keeping track of files and directories on computer-readable medium 510; controlling peripheral devices (e.g., disk drives, printers, etc.) which can be controlled directly or through an I/O controller; and managing traffic on bus 512. Network communications instructions 516 can establish and maintain network connections (e.g., software for implementing communication protocols, such as TCP/IP, HTTP, Ethernet, etc.).
A graphics processing system 518 can include instructions that provide graphics and image processing capabilities. For example, the graphics processing system 518 can implement the processes, as described with reference to
Application(s) 520 can be an image processing application or any other application that uses the processes described in reference to
The described features can be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. A computer program is a set of instructions that can be used, directly or indirectly, in a computer to perform a certain activity or bring about a certain result. A computer program can be written in any form of programming language (e.g., Objective-C, Java), including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
Suitable processors for the execution of a program of instructions include, by way of example, both general and special purpose microprocessors, and the sole processor or one of multiple processors or cores, of any kind of computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for executing instructions and one or more memories for storing instructions and data. Generally, a computer will also include, or be operatively coupled to communicate with, one or more mass storage devices for storing data files; such devices include magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and optical disks. Storage devices suitable for tangibly embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits).
To provide for interaction with a user, the features can be implemented on a computer having a display device such as a CRT (cathode ray tube) or LCD (liquid crystal display) monitor for displaying information to the user and a keyboard and a pointing device such as a mouse or a trackball by which the user can provide input to the computer.
The features can be implemented in a computer system that includes a back-end component, such as a data server, or that includes a middleware component, such as an application server or an Internet server, or that includes a front-end component, such as a client computer having a graphical user interface or an Internet browser, or any combination of them. The components of the system can be connected by any form or medium of digital data communication such as a communication network. Examples of communication networks include, e.g., a LAN, a WAN, and the computers and networks forming the Internet.
The computer system can include clients and servers. A client and server are generally remote from each other and typically interact through a network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
One or more features or steps of the disclosed embodiments can be implemented using an API. An API can define one or more parameters that are passed between a calling application and other software code (e.g., an operating system, library routine, function) that provides a service, that provides data, or that performs an operation or a computation.
The API can be implemented as one or more calls in program code that send or receive one or more parameters through a parameter list or other structure based on a call convention defined in an API specification document. A parameter can be a constant, a key, a data structure, an object, an object class, a variable, a data type, a pointer, an array, a list, or another call. API calls and parameters can be implemented in any programming language. The programming language can define the vocabulary and calling convention that a programmer will employ to access functions supporting the API.
In some implementations, an API call can report to an application the capabilities of a device running the application, such as input capability, output capability, processing capability, power capability, communications capability, etc.
One example application of automatic image sharpening is automatic depth-sensing using one image. For example, as each patch of a blurred image is processed, the blur radius from step 310 of
Another application is batch focus. For example, a large set of street-map images can be run through the automatic image sharpening processes described above to focus each individual image. Focusing a large set of images can be impractical if the process requires manual human input. Other image processing applications can also benefit from the automatic image sharpening described herein.
A number of implementations have been described. Nevertheless, it will be understood that various modifications may be made. For example, other steps may be provided, or steps may be eliminated, from the described flows, and other components may be added to, or removed from, the described systems. Accordingly, other implementations are within the scope of the following claims.
Number | Date | Country | |
---|---|---|---|
20130084019 A1 | Apr 2013 | US |