The present application claims the benefit of Australian Patent Application No. 2007-240236, filed on Dec. 11, 2007, the contents of which are hereby incorporated by reference as if fully stated herein.
The present invention relates to the field of image registration and, in particular, to determining image patches with optimal locations and sizes for fine image alignment to a predefined level of accuracy.
Image registration is the process of determining correspondences between pixel elements in a pair of images that have common subject matter. It is an important technique in fields such as image matching, imaging device characterisation, and super-resolution. In image matching, two images are compared for common subject matter under the assumption that some geometrical transformation relates substantial portions of the two images.
Two images f1(x, y) and f2(x, y) can be related by a combination of Rotation, Scaling, and Translation (RST) transformations, such that:
f2(x,y)=f1(s(x cos θ+y sin θ)+Δx,s(−x sin θ+y cos θ)+Δy) (1)
wherein s is a scale factor, θ is a rotation angle, and (Δx, Δy) are translations in the x and y directions. The unknown rotation θ and scale s parameters may be determined from translation-invariant representations Tf:
Tf=|F|+iA∇²φ (2)
where A is a scaling constant set to:
A=max(|F|)/π (3)
to ensure that the recombined Fourier magnitude and phase information are roughly of equal magnitude.
After the rotation θ and scale s parameters are determined, the image f2(x, y) can be transformed to correct for the rotation and scaling. The translation parameters (Δx, Δy) can then be estimated from the image f1(x, y) and the transformed image f′2 (x, y) by finding a distinct peak in a cross-correlation image:
C=ℑ⁻¹(ℑ(f1)ℑ(f′2)*) (4)
where ℑ(f) is the discrete Fourier transform of an image f(x, y), ℑ⁻¹ is the inverse discrete Fourier transform, and ℑ(f′2)* denotes the complex conjugate of the discrete Fourier transform ℑ(f′2).
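As an illustration of equation (4), the following minimal sketch estimates an integer translation from the peak of the cross-correlation image. It assumes small images with circular (wrap-around) boundaries and that the second image has already been corrected for rotation and scale; the helper names `dft2` and `estimate_translation` are hypothetical, and the naive DFT is used only to keep the sketch self-contained.

```python
import cmath
import random


def dft2(img, inverse=False):
    """Naive separable 2-D DFT of a list-of-lists image (small images only)."""
    sign = 1 if inverse else -1

    def dft1(vec):
        n = len(vec)
        out = [sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                   for t, v in enumerate(vec)) for k in range(n)]
        return [o / n for o in out] if inverse else out

    rows = [dft1(row) for row in img]               # transform along x
    cols = [dft1(list(col)) for col in zip(*rows)]  # transform along y
    return [list(r) for r in zip(*cols)]            # back to row-major order


def estimate_translation(f1, f2):
    """Return (dx, dy) at the peak of C = IDFT(DFT(f1) * conj(DFT(f2)))."""
    F1, F2 = dft2(f1), dft2(f2)
    product = [[a * b.conjugate() for a, b in zip(r1, r2)]
               for r1, r2 in zip(F1, F2)]
    C = dft2(product, inverse=True)
    h, w = len(C), len(C[0])
    _, py, px = max((C[y][x].real, y, x) for y in range(h) for x in range(w))
    # Peaks past the half-way point correspond to negative shifts (wrap-around).
    dx = px - w if px > w // 2 else px
    dy = py - h if py > h // 2 else py
    return dx, dy


rng = random.Random(0)
f1 = [[rng.random() for _ in range(8)] for _ in range(8)]
# f2(x, y) = f1(x + 2, y + 3), with circular wrap-around at the borders.
f2 = [[f1[(y + 3) % 8][(x + 2) % 8] for x in range(8)] for y in range(8)]
shift = estimate_translation(f1, f2)
```

In practice a fast Fourier transform would replace the naive DFT, and sub-pixel accuracy would require interpolating around the peak, which this sketch does not attempt.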
While a simple parametric transformation such as RST is suitable for registration of flat, rigid objects under frontal views, the transformation between two images in most applications is usually more complicated. In the general case, the transformation can be expressed as a free-form motion that warps one image onto a substantial part of the other image. This transformation is defined by a full warp map, which specifies a two-dimensional motion vector for each pixel of either image. This full warp map can be obtained from dense optical flow estimation. The full warp map can also be interpolated from a set of sparse point correspondences between the two images. Although there are many algorithms that detect and match salient points/regions from images, none of them is able to specify the level of registration accuracy that the algorithm returns.
Cramer-Rao Bound for Image Registration
The mean squared error of any estimate of a deterministic parameter in the presence of noise has a lower bound known as the Cramer-Rao Bound (CRB). Specifically, if a parameter vector m=[m0, m1, . . . mn]T is estimated from a given set of measurements, the CRB provides a lower bound on the error covariance matrix:
$$E[\varepsilon\varepsilon^T] \ge \left(I+\frac{\partial b}{\partial m}\right)F^{-1}(m)\left(I+\frac{\partial b}{\partial m}\right)^T + b\,b^T \tag{5}$$
where ε=m̂−m is the estimation error (the hat denotes an estimate of the variable underneath); b=E[m̂]−m is the bias of the estimator (E[.] is the expectation of the enclosed expression); I is the identity matrix; F(m) is the Fisher Information Matrix (FIM) that characterizes how well the unknown parameter vector m can be estimated from the observed data; and F⁻¹ is the inverse of F. The ≧ sign in (5) means that the difference between the left matrix and the right matrix is non-negative definite. As a result, the inequality holds for all diagonal terms.
If the estimator is unbiased (i.e., b=0), the expected variance of the parameters can be found directly from the main diagonal entries of the inverse matrix F⁻¹:
$$E[(\hat{m}_i-m_i)^2] \ge [F^{-1}(m)]_{ii} \tag{6}$$
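The Fisher information matrix is derived from the maximum likelihood principle. Let Pr(r|m) be the probability density function of the observed noisy data r(m). The Fisher information matrix is a measure of the steepness of the likelihood function around its peak:
$$F(m)=E\left[\left(\frac{\partial}{\partial m}\log \Pr(r|m)\right)\left(\frac{\partial}{\partial m}\log \Pr(r|m)\right)^T\right] \tag{7}$$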
Since the peak of a steep likelihood function is less sensitive to noise than that of a smooth one, the FIM characterizes how precisely m can be estimated from the observed data.
Fisher Information Matrix for Image Registration
A direct image registration method searches for a parametric transformation between the coordinate systems of two images based on their intensity correlation. Assuming that both images I1 and I2 are noise corrupted versions of a noiseless scene I by two instances of zero-mean Gaussian noise with variance σn²,
I*1(x,y)=I1(x,y)+n1(x,y)=I(x,y)+n1(x,y)
I*2(x,y)=I2(x,y)+n2(x,y)=I(x′,y′)+n2(x,y) (8)
where x′=f(x, y, m) and y′=g(x, y, m) are the coordinate transformations, and m=[m1, m2, . . . mn]T is the unknown registration parameter (e.g., under translation x′=x−tx, y′=y−ty and m=[tx, ty]T). Since the noise realizations n1 and n2 are normally distributed over the registration region S, the total probability of the unknown scene I given an estimate of m is:
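$$\Pr(I|m)=\prod_{S}\frac{1}{2\pi\sigma_n^2}\exp\left(-\frac{(I_1-I)^2+(I_2-I')^2}{2\sigma_n^2}\right) \tag{9}$$
where the implicit coordinates for I1, I2 and I are (x, y), except for I′=I(x′, y′). The log-likelihood function therefore is:
$$\log\Pr(I|m)=-\frac{1}{2\sigma_n^2}\sum_{S}\left[(I_1-I)^2+(I_2-I')^2\right]+\text{const} \tag{10}$$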
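From (7), the Fisher information matrix for an n-parameter vector m is thus an n×n matrix F with its entries computed as:
$$F_{ij}=\frac{1}{\sigma_n^2}\sum_{S}\frac{\partial I'}{\partial m_i}\frac{\partial I'}{\partial m_j} \tag{11}$$
where the derivative of the noiseless image I′=I(x′, y′) with respect to each unknown parameter mi can be computed from its spatial derivatives and the registration model {x′, y′}={f(x, y, m), g(x, y, m)}:
$$\frac{\partial I'}{\partial m_i}=\frac{\partial I'}{\partial x'}\frac{\partial x'}{\partial m_i}+\frac{\partial I'}{\partial y'}\frac{\partial y'}{\partial m_i} \tag{12}$$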
Cramer-Rao Bound for 2D Shift Estimation
Using the general derivation of the Fisher information matrix in the previous subsection, the Cramer-Rao bound for any unbiased shift estimator can be derived. Two-dimensional (2D) shift estimation looks for a translational vector t=[tx, ty]T between the coordinate systems of the two images: x′=x−tx and y′=y−ty. The Fisher information matrix can be computed from (11) and (12):
$$F(t)=\frac{1}{\sigma_n^2}\begin{bmatrix}\sum_S I_x^2 & \sum_S I_xI_y\\ \sum_S I_xI_y & \sum_S I_y^2\end{bmatrix}=\frac{1}{\sigma_n^2}\,T \tag{13}$$
where Ix=∂I′/∂x′=∂I′/∂x and Iy=∂I′/∂y′=∂I′/∂y are spatial derivatives of the uncorrupted image I′. As can be seen in (13), the FIM for 2D shift estimation is proportional to a Gradient Structure Tensor T (GST) integrated over the region S.
Substitution of (13) into (6) yields the Cramer-Rao bounds of the variances of the registration parameters:
$$\mathrm{var}(t_x)\ge\frac{\sigma_n^2\sum_S I_y^2}{\det(T)},\qquad \mathrm{var}(t_y)\ge\frac{\sigma_n^2\sum_S I_x^2}{\det(T)} \tag{14}$$
where
$$\det(T)=\sum_S I_x^2\sum_S I_y^2-\left(\sum_S I_xI_y\right)^2$$
is the determinant of T. Ignoring the second term of det(T), the Cramer-Rao bounds (14) are simplified to looser bounds:
$$\mathrm{var}(t_x)\ge\frac{\sigma_n^2}{\sum_S I_x^2},\qquad \mathrm{var}(t_y)\ge\frac{\sigma_n^2}{\sum_S I_y^2} \tag{15}$$
which clearly shows that the shift variance is linearly proportional to the input noise variance σn² and inversely proportional to the total gradient energy in the shift direction. As a result, scenes with strong textures and little noise are likely to result in accurate shift estimation. However, equality in the loose bound (15) is rarely achievable, since $\sum_S I_xI_y$ only vanishes when the orientation of maximum gradient energy is aligned with one of the grid axes.
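The relationship between gradient energy, noise variance and the shift variance bounds can be made concrete with a small numerical sketch. It assumes central differences as the gradient operator, interior pixels as the region S, and a known noise variance; the function names are hypothetical and not part of the disclosure.

```python
def gradient_structure_sums(img):
    """Sums of Ix^2, Ix*Iy and Iy^2 over interior pixels (central differences)."""
    h, w = len(img), len(img[0])
    sxx = sxy = syy = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
    return sxx, sxy, syy


def shift_variance_bounds(img, noise_variance):
    """Return ((tight_x, tight_y), (loose_x, loose_y)) variance lower bounds."""
    sxx, sxy, syy = gradient_structure_sums(img)
    det = sxx * syy - sxy * sxy  # determinant of the gradient structure tensor
    if det <= 0.0:
        raise ValueError("degenerate gradients: shift not fully observable")
    tight = (noise_variance * syy / det, noise_variance * sxx / det)
    loose = (noise_variance / sxx, noise_variance / syy)
    return tight, loose


# Synthetic 6x6 image I(x, y) = x*y, so that Ix = y and Iy = x exactly.
image = [[float(x * y) for x in range(6)] for y in range(6)]
tight, loose = shift_variance_bounds(image, noise_variance=1.0)
```

For this synthetic image the cross term is large, so the tight bound is noticeably higher than the loose one, which illustrates why the loose bound is rarely attained.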
Note that the CRB characterizes the shift variances based on an uncorrupted signal I, which is not available in practice. Fortunately, the total gradient energies of I can be approximated from those of the corrupted image I*1 and a noise instance n given a normal distribution N(0, σn):
where I*x=∂I*1/∂x, I*y=∂I*1/∂y, nx=∂n/∂x, and ny=∂n/∂y.
Cramer-Rao Bound for 2D Projective Registration
The Cramer-Rao bound is not only applicable to shift estimation, but also to more general motion models such as 2D affine and projective transformation. A 2D projective transformation, for example, is the motion of a static scene captured by a stationary camera or the motion of a moving planar surface. It is the most general planar motion model which includes translation, Euclidean, similarity, and affine transformations. Similarly to 2D translation, the Cramer-Rao bounds for the eight projective parameters are computed from an 8×8 Fisher information matrix.
Planar projective registration seeks an 8-parameter vector m=[m1, m2, . . . m8]T that transforms one coordinate system (x, y) into another (x′, y′):
Substituting
into (12) yields:
The 8×8 Fisher information matrix is rewritten from (11) and (19) as:
Because it requires a complex 8×8 matrix inversion, the exact formula for the Cramer-Rao bounds of 2D projective registration is not given here. The bounds can be computed from the diagonal entries of the inverse Fisher information matrix F⁻¹(m). Similarly to the shift estimation case, the lower variance bounds of the 2D projective parameters are proportional to the input noise variance and inversely proportional to the total gradient energy.
The Fisher information matrix, or alternatively the gradient structure tensor and hence the Cramer-Rao Lower Bound, quantifies the amount of information in an image for the determination of the n-parameter vector m. For shift estimation, the correlation information determines how precisely the shift can be estimated for a given area, given a certain amount of noise corrupting the image.
It is an object of the present invention to overcome substantially, or at least ameliorate, one or more disadvantages of existing arrangements.
According to a first aspect of the present disclosure, there is provided a computer-implementable method of estimating a geometrical relationship between a first image and a second image, wherein the second image includes a noise component. The method determines a location and size of each one of a plurality of image patches, based on the noise component included in the second image and correlation information derived from the first image. The method then identifies a plurality of first image areas in the first image and a corresponding plurality of second image areas in the second image, based on the location and size of each one of the plurality of image patches. Each first image area of the first image corresponds to a related second image area of the second image. The method then determines a geometrical relationship between the first and second images by comparing, for each one of the first image areas, information located within the first image area with information located within the corresponding related second image area.
According to a second aspect of the present disclosure, there is provided a computer program product including a computer readable storage medium having recorded thereon a computer program for directing a processor to execute a method of estimating a geometrical relationship between a first image and a second image. The second image includes a noise component. The computer program product comprises: (a) code for determining a location and size of each one of a plurality of image patches, based on the noise component included in the second image and correlation information derived from the first image; (b) code for identifying a plurality of first image areas in the first image and a corresponding plurality of second image areas in the second image, based on the location and size of each one of the plurality of image patches, wherein each first image area of the first image corresponds to a related second image area of the second image; and (c) code for determining a geometrical relationship between the first and second images by comparing, for each one of the first image areas, information located within the first image area with information located within the corresponding related second image area.
According to another aspect of the present disclosure, there is provided an apparatus for implementing the aforementioned method.
According to yet another aspect of the present disclosure, there is provided a computer program product including a computer readable medium having recorded thereon a computer program for implementing the method described above.
Other aspects of the invention are also disclosed.
One or more embodiments of the invention will now be described with reference to the following drawings, in which:
Where reference is made in any one or more of the accompanying drawings to steps and/or features that have the same reference numerals, those steps and/or features have for the purposes of this description the same function(s) or operation(s), unless the contrary intention appears.
As described above, current image registration algorithms are unable to provide an estimate of the geometrical relationship between two images to a specified level of accuracy. An embodiment of the present disclosure employs a method of estimating the precision of motion estimation under the presence of noise. Based on this precision analysis, a set of image patches with optimal locations and sizes are selected to sparsely align the input images up to a user-specified accuracy.
Disclosed herein is a computer-implementable method of determining a geometrical relationship between a first image and a second image, wherein the second image includes a noise component. The method utilises the noise component included in the second image and correlation information derived from the first image to determine a location and size of each one of a plurality of image patches. Where a size of a first image patch is smaller than a size of a second image patch, there is more information in a region of the first image corresponding to the location of the first image patch than a region of the first image corresponding to the location of the second image patch. The method includes a step of identifying a plurality of first image areas in the first image and a corresponding plurality of second image areas in the second image, based on the location and size of each one of the plurality of image patches. Each first image area of the first image corresponds to a related second image area of the second image, and thus the first image area and corresponding second image area define related portions of the first and second images. The method then compares, for each one of the first image areas, information located within the first image area with information located within the corresponding related second image area to determine a geometrical relationship between the first and second images.
Also disclosed herein is a computer program product including a computer readable storage medium having recorded thereon a computer program for directing a processor to execute a method of estimating a geometrical relationship between a first image and a second image, said second image including a noise component. The computer program product includes code for performing the above-mentioned method.
Further disclosed herein is a method of determining a full warp map that relates a corrupted image to a reference image. The method coarsely aligns a corrupted image against a reference image by applying a rotation-scaling-translation transformation. A plurality of correlatable image patches are then determined from the reference image. Locations and sizes of the respective image patches are determined using a theoretical variance bound. A full warp map is then determined using the image patches.
In one example, the images (i.e., the reference image and a corrupted version of the reference image) are obtained from a group of images that includes, for example, but is not limited to: (a) digital images of a document; (b) a video sequence that contains independently moving objects; (c) digital images of a scene with overlapping content; and (d) a digital image of a test chart.
Before providing a description of a preferred embodiment through a set of drawings, the theory of using the Cramer-Rao Lower Bound (CRLB) of image registration to select correlatable image patches will be described. In this disclosure, the term “correlatable patch” refers to an image patch of smallest possible size that is likely to give a desired alignment precision in the presence of known noise variance.
Equation (14) described above in the background section specifies the CRLB of shift estimation variance along the sampling directions. However, the CRLB also applies to the variance of shift estimation in any other directions. For example, if the Fisher Information Matrix F (FIM) for 2D shift estimation is decomposed into its two principal directions:
where u and v are the eigenvectors of F with the corresponding eigenvalues λu≧λv, the CRLBs for the estimated shift variance along those two directions, tu=t·u and tv=t·v, are the projections of the inverse FIM onto those principal directions:
Since
the variance of the shift magnitude |t| is bounded by the CRLB of shift variance along the direction of minimum gradient change:
The CRLB of shift variance in the direction v therefore defines how precisely the shift can be estimated from a given image patch and its noise-corrupted version. This quantifies the amount of correlation information in the patch or, equivalently, its correlatability. For example, image patches composed of linear structures along only one orientation are not suitable for fine alignment because the displacement along this orientation cannot be measured accurately:
To solve for the eigen-values of the FIM in (13):
both λu and λv must satisfy the following equation: Av=λv, where v is the corresponding eigenvector. This leads to:
where Δ=(a+c)²−4(ac−b²)=(a−c)²+4b². From (24) and (23), it is clear that λv is proportional to the total gradient energy within the patch and it is inversely proportional to the noise variance. As a result, given a known noise variance, the CRLB of the variance of shift estimation can be reduced by increasing the image patch size. This leads to a strategy to select the smallest image patches that still encapsulate enough gradient energy to satisfy the alignment precision specified by the CRLB.
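The closed-form eigenvalues of a symmetric 2×2 matrix can be sketched as follows. The sketch assumes that a, b and c denote the entries of the symmetric matrix [[a, b], [b, c]], so that the eigenvalues are the roots of λ²−(a+c)λ+(ac−b²)=0 with discriminant Δ as given above; the function name is hypothetical.

```python
import math


def symmetric_2x2_eigenvalues(a, b, c):
    """Return (larger, smaller) eigenvalues of the matrix [[a, b], [b, c]]."""
    delta = (a - c) ** 2 + 4.0 * b * b   # discriminant; never negative
    root = math.sqrt(delta)
    return (a + c + root) / 2.0, (a + c - root) / 2.0


lam_u, lam_v = symmetric_2x2_eigenvalues(2.0, 1.0, 2.0)
# The reciprocal of the smaller eigenvalue is the variance bound along the
# direction of minimum gradient change.
bound_along_v = 1.0 / lam_v
```

Because Δ is a sum of squares, the square root is always real, so no eigen-decomposition routine is needed per pixel.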
To minimize computation, we are interested in the smallest patches whose CRLB falls just below the desired alignment accuracy. For example, if the desired alignment accuracy is σt=0.01 of a pixel, the bound 1/λv
should be less than or equal to σt²=10⁻⁴. Before locating these patches, we estimate the desired patch size at each pixel that gives rise to λv=T=1/σt². This produces a Minimum Correlatable Patch Size (MCPS) image. The best patch locations are selected as local minima of this MCPS image. Apart from the minimum bound criterion, we also want the selected patches to be a certain distance apart and evenly distributed across the image. Finally, image patches with periodic content are not good candidates for alignment, because the self-matching characteristics of these periodic patterns may result in ambiguous displacements later in the alignment process.
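One possible way to pick patch locations from an MCPS image is sketched below. The greedy selection with a minimum separation is an assumption made for illustration only; the disclosure states the criteria (local minima, spacing, even distribution) but this particular procedure, the function name and its parameters are hypothetical. Periodicity rejection is not covered.

```python
def select_patch_centres(mcps, min_distance, max_patches):
    """Greedily pick strict local minima of an MCPS image, smallest size first.

    Returns a list of (x, y, patch_size) with centres at least
    min_distance pixels apart.
    """
    h, w = len(mcps), len(mcps[0])
    candidates = []
    for y in range(h):
        for x in range(w):
            neighbours = [mcps[j][i]
                          for j in range(max(0, y - 1), min(h, y + 2))
                          for i in range(max(0, x - 1), min(w, x + 2))
                          if (i, j) != (x, y)]
            if all(mcps[y][x] < v for v in neighbours):
                candidates.append((mcps[y][x], x, y))
    candidates.sort()  # smallest correlatable patch size first
    chosen = []
    for size, x, y in candidates:
        if len(chosen) == max_patches:
            break
        if all((x - cx) ** 2 + (y - cy) ** 2 >= min_distance ** 2
               for cx, cy, _ in chosen):
            chosen.append((x, y, size))
    return chosen


mcps_image = [[20] * 5 for _ in range(5)]
mcps_image[1][1] = 8    # strict local minimum
mcps_image[1][2] = 9    # adjacent to a smaller value, so not a local minimum
mcps_image[3][3] = 10   # strict local minimum
patches = select_patch_centres(mcps_image, min_distance=2, max_patches=10)
```

Sorting by patch size favours locations that need the least computation to reach the target accuracy, while the distance test keeps the selected patches spread apart.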
An application of the proposed alignment system using the CRLB during patch selection is now described with reference to
The process of registering the images 101 and 102 performed by the image registration system 103 is now described in more detail with reference to
The process of selecting correlatable patches from an image in step 201 is now described in more detail with reference to
The process of constructing a λv scale-space of increasing patch sizes in step 301 is now described in more detail with reference to
$$w_t=\frac{(\lambda_{v,i+1}-T)\,w_i+(T-\lambda_{v,i})\,w_{i+1}}{\lambda_{v,i+1}-\lambda_{v,i}}$$
In equation (23), the image integration over a square box can be implemented very efficiently in a separable fashion using two one-dimensional (1-D) unit filters in the x- and y-dimensions. Each of these 1-D unit filters produces the sum of w consecutive pixels within a sliding window. Since this window is moved across the image only one pixel at a time, the sum of the new window may be computed from the sum of the previous window by subtracting the intensity value of the pixel no longer included in the window, and adding the intensity value of the newly included pixel. In this way, a 2-D unit filter of arbitrary filter size w×w only costs four additions per pixel.
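The running-sum box filter described above can be sketched as follows. The sketch assumes that only window positions lying fully inside the image are evaluated, so the output is smaller than the input; the function names are hypothetical.

```python
def sliding_sum_1d(values, w):
    """Sum of every run of w consecutive values, updated incrementally."""
    if w > len(values):
        return []
    running = sum(values[:w])
    sums = [running]
    for i in range(w, len(values)):
        # One addition for the entering sample, one subtraction for the leaving one.
        running += values[i] - values[i - w]
        sums.append(running)
    return sums


def box_sum_2d(img, w):
    """Separable w-by-w box sum: a 1-D pass along x followed by one along y."""
    row_sums = [sliding_sum_1d(row, w) for row in img]
    col_sums = [sliding_sum_1d(list(col), w) for col in zip(*row_sums)]
    return [list(r) for r in zip(*col_sums)]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
sums = box_sum_2d(image, 2)
```

Each 1-D pass costs two additions per output sample regardless of w, which gives the four additions per pixel noted above for the 2-D filter.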
The process of determining a minimum correlatable patch size at every pixel to satisfy the alignment accuracy requirement λv=1/σt² in step 302 is now described in more detail with reference to
equals the target alignment variance σt². This optimal patch size can be estimated using linear interpolation from the adjacent patch size samples whose corresponding λv intensities enclose the threshold T (e.g., w2=8 and w3=16 as shown in
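The linear interpolation of the patch size at the threshold T can be sketched for a single pixel as follows. The sketch assumes that λv has been sampled at a list of increasing patch sizes and that it increases with patch size; the interpolation weights are normalised by the difference between the two enclosing λv samples. The function name and the sample values are hypothetical.

```python
def min_correlatable_patch_size(sizes, lambdas, threshold):
    """Interpolate the patch size at which lambda_v reaches the threshold.

    sizes   -- increasing patch sizes, e.g. [4, 8, 16, 32]
    lambdas -- lambda_v measured at each patch size for one pixel
    Returns None if the threshold is not reached at the largest size.
    """
    if lambdas[0] >= threshold:
        return float(sizes[0])
    for i in range(len(sizes) - 1):
        low, high = lambdas[i], lambdas[i + 1]
        if low < threshold <= high:
            return ((high - threshold) * sizes[i]
                    + (threshold - low) * sizes[i + 1]) / (high - low)
    return None


# Target accuracy of 0.01 pixel corresponds to T = 1 / 0.01**2 = 10000.
T = 10000.0
w_t = min_correlatable_patch_size([4, 8, 16, 32],
                                  [1000.0, 6000.0, 14000.0, 30000.0], T)
```

Here the threshold falls halfway between the λv samples at sizes 8 and 16, so the interpolated size is 12.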
The process of detecting periodic content of an image patch in step 304 is now described in more detail with reference to
In the foregoing description, the analysis of correlatability was applied to a single channel of the image. For instance, if the two images being aligned were RGB colour images, the alignment could be done against any of the R, G, or B channels separately, or against some combination of the colour channels, such as luminance:
Y=0.3R+0.59G+0.11B
This approach does not take into account all of the information in the images that can be used for alignment. For instance, if alignment on the luminance channel is done, then regions of the images that are iso-luminant but polychromatic will be judged by the foregoing process to have low correlatability. This is because the correlatability analysis of the image does not take into account the colour content of the image.
A variation of the embodiment described above rectifies this problem by including colour information in the correlatability analysis. The simplest way to do this is to consider correlations of the colour vectors themselves. In this case, equation (8) may be written
$$\mathbf{I}^*_1(x,y)=\mathbf{I}(x,y)+\mathbf{n}_1(x,y)$$
$$\mathbf{I}^*_2(x,y)=\mathbf{I}(x',y')+\mathbf{n}_2(x,y) \tag{25}$$
where bold letters denote vector quantities. In equation (25), we are assuming that the noise in the different colour channels is normally distributed, with a cross-correlation matrix given by
where ρrg etc. are the correlation coefficients for the noise in the different channels.
Under this noise model, the log probability of the observed images is
and the Fisher information matrix may be shown to be
For completeness, the inverse of the cross-correlation matrix may be written
For the case of shift estimation, the FIM is still a 2×2 matrix. If the noise in each of the colour channels is uncorrelated, then this reduces to the sum of the Fisher information matrices of the individual colour channels. This FIM may be used instead of the FIM identified in equation (13) in all correlatability processing to determine the best correlatable patches. If this is done, then shift estimation must be done using all of the colour channels. This involves determining the parameters m that maximise the log probability given above.
Equivalently, one can maximise the correlation between the images, given by
Similarly to the case for the FIM analysis, for uncorrelated noise in the different colour channels, this reduces to maximising the sum of the correlations in each colour channel, weighted by the inverse of the noise variance in each colour channel:
where the individual colour channels are labelled r, g, and b. In this case, each of the colour channels must be correlated, meaning that 3 correlations must be performed instead of just 1 in the case of a monochromatic or luminance image. If the noises are correlated, then all nine correlations between the colour channels must be calculated. This is a large increase in the computational cost of the algorithm.
An alternative approach that does not increase the computational cost so significantly is to determine a projection operator from the colour space onto one or two dimensions that give the highest correlatability for a given image region. When two image regions are to be aligned, then the regions are projected into this optimal space, and the correlation is done between the projections. This reduces the computational cost of the algorithm and does not significantly reduce its accuracy, as colours in images are often embedded in lower dimensional manifolds within the 3 dimensional colour space.
To formulate this problem, consider a direction in colour space that we will denote by v=(νr,νg,νb) and which has unit magnitude, |v|=1. For a given image region, we want to determine the direction of this vector that minimises the variance of the registration parameters. To do this we must determine the FIM for this problem, which in turn is dependent on the log likelihood of the projected images. If we assume that the noise in the images is a multivariate Gaussian distribution with zero mean, then the noise of any projection of the image data onto a single dimension will be a Gaussian distribution with zero mean. Its variance will be given by
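$$\sigma_\nu^2=\nu_r^2\sigma_r^2+\nu_g^2\sigma_g^2+\nu_b^2\sigma_b^2+2\nu_r\nu_g\rho_{rg}\sigma_r\sigma_g+2\nu_r\nu_b\rho_{rb}\sigma_r\sigma_b+2\nu_g\nu_b\rho_{gb}\sigma_g\sigma_b \tag{32}$$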
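Equation (32) can be evaluated directly, as in the following minimal sketch. The function name and argument layout are hypothetical; the channel standard deviations and correlation coefficients are assumed to be known.

```python
def projected_noise_variance(v, sigma, rho):
    """Noise variance after projecting RGB noise onto the direction v.

    v     -- (v_r, v_g, v_b), a unit vector in colour space
    sigma -- (sigma_r, sigma_g, sigma_b), noise standard deviations
    rho   -- (rho_rg, rho_rb, rho_gb), noise correlation coefficients
    """
    vr, vg, vb = v
    sr, sg, sb = sigma
    rho_rg, rho_rb, rho_gb = rho
    return (vr * vr * sr * sr + vg * vg * sg * sg + vb * vb * sb * sb
            + 2.0 * vr * vg * rho_rg * sr * sg
            + 2.0 * vr * vb * rho_rb * sr * sb
            + 2.0 * vg * vb * rho_gb * sg * sb)


s = 0.5 ** 0.5
# Equal mix of red and green with positively correlated noise in those channels.
variance = projected_noise_variance((s, s, 0.0), (1.0, 1.0, 1.0), (0.5, 0.0, 0.0))
```

With positively correlated red and green noise, the projected variance (1.5 here) exceeds that of either channel alone, which is why the choice of projection direction affects the attainable registration accuracy.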
Thus, the log-likelihood of the images is given by
and the Fisher information matrix is given by
In shift estimation, the FIM may be written
To minimise the variance of the estimated shift parameters, we want to maximise the smallest eigenvalue of F subject to |v|=1. Alternatively, we may wish to maximise the product of the two eigenvalues, which is the same as maximising the determinant of the Fisher information matrix.
For shift estimation, the FIM can be expanded in terms of the image components. If we write
then the FIM may be written
with
If we are maximising the minimum eigenvalue, then for a given patch, the direction in colour space for correlatability analysis is given by the values of (νr,νg,νb) which maximises
$$L=\alpha_{00}+\alpha_{11}-\sqrt{(\alpha_{00}-\alpha_{11})^2+4\alpha_{01}^2} \tag{39}$$
subject to the constraint that
$$\nu_r^2+\nu_g^2+\nu_b^2=1 \tag{40}$$
This is a constrained maximisation problem that may be solved using standard numerical techniques, such as using the method of Lagrange multipliers to convert it to an unconstrained minimisation which may be solved using the Levenberg-Marquardt method. The value that L takes at the maximum is the correlatability for the patch.
In terms of the processing pipeline performed above, in step 301, when the lambda scale space is constructed, this operation is performed on all three channels. In step 302, when the minimum correlatable patch size is calculated, then the box filters of all the colour image derivative pairs,
etc., must be calculated and substituted into the expression for L given above. Optimisation of the correct values of (νr,νg,νb) proceeds and the resulting correlatability is stored for later use.
When storing the correlatable patches, the values of (νr,νg,νb) should also be stored, so that the target patch to be aligned can also be transformed to the projected one dimensional colour channel before correlation is performed.
There is also another even simpler method of determining a direction in colour space in which to perform correlation, but it is not as optimal in terms of the image information content as the approach described above. If a patch is to have its correlatability assessed, then a principal component analysis of the colours in the patch may be performed to determine which direction in colour space has the greatest variation. The image data is then projected onto this direction, and the standard correlatability introduced in equation (24) is used. This can be extended to a two-dimensional analysis by taking the first two eigenvectors and projecting onto a two dimensional subspace. A complex number valued image can then be formed by using the projection onto the first eigenvector as the real part of the image and the projection onto the second eigenvector as the imaginary part of the image. These complex images can then be correlated using standard techniques.
This approach of performing Principal Component Analysis (PCA) on the colour data in a patch gives different results to the approach based on the Fisher Information Matrix, but practically speaking, this method is much faster. Also, it does not allow for correlated noise and noise of different amplitudes in the different colour channels, which are effects that happen in practice in imaging.
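The PCA-based projection can be sketched with the standard library alone. The sketch assumes that power iteration on the 3×3 colour covariance matrix is an acceptable way to find the first principal direction; the function names are hypothetical, and a full eigen-decomposition would be needed for the two-dimensional variant.

```python
import math


def principal_colour_direction(pixels, iterations=100):
    """Return (mean colour, unit vector of greatest colour variation)."""
    n = len(pixels)
    mean = [sum(p[c] for p in pixels) / n for c in range(3)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in pixels) / n
            for j in range(3)] for i in range(3)]
    # Arbitrary start vector; it must not be orthogonal to the principal axis.
    v = [1.0, 0.7, 0.4]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # all pixels have the same colour; no preferred direction
        v = [x / norm for x in w]
    return mean, v


def project_colour(pixel, mean, direction):
    """Project one RGB pixel onto the principal direction (scalar result)."""
    return sum((pixel[c] - mean[c]) * direction[c] for c in range(3))


# Patch whose colours vary only along the red + green diagonal.
patch_pixels = [(float(t), float(t), 5.0) for t in range(10)]
mean, direction = principal_colour_direction(patch_pixels)
projected = project_colour(patch_pixels[-1], mean, direction)
```

The resulting single-channel values can then be passed to the same correlatability analysis as a greyscale patch, with the stored mean and direction reused for the target patch.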
Applications of the fine image registration technique disclosed herein are numerous. Due to its adjustable computational footprint, the technique can be used by low-resourced systems when speed is more important than sub-pixel accuracy. Document alignment in MFPs (Multi-Function Peripherals) and panoramic image stitching on digital cameras are typical examples of this low-accuracy-end application. Conversely, the fine image registration technique can be tuned for the highest accuracy at the expense of heavier computation. Off-line test chart alignment for imaging device calibration and super-resolution from video sequences using optical flow are examples at the high-accuracy end of the spectrum.
The method of determining a geometrical relationship between a first image and a second image may be implemented using a computer system 700, such as that shown in
As seen in
The computer module 701 typically includes at least one processor unit 705, and a memory unit 706 for example formed from semiconductor random access memory (RAM) and read only memory (ROM). The module 701 also includes a number of input/output (I/O) interfaces including an audio-video interface 707 that couples to the video display 714 and loudspeakers 717, an I/O interface 713 for the keyboard 702 and mouse 703 and optionally a joystick (not illustrated), and an interface 708 for the external modem 716 and printer 715. In some implementations, the modem 716 may be incorporated within the computer module 701, for example within the interface 708. The computer module 701 also has a local network interface 711 which, via a connection 723, permits coupling of the computer system 700 to a local computer network 722, known as a Local Area Network (LAN). As also illustrated, the local network 722 may also couple to the wide network 720 via a connection 724, which would typically include a so-called “firewall” device or similar functionality. The interface 711 may be formed by an Ethernet™ circuit card, a wireless Bluetooth™ or an IEEE 802.11 wireless arrangement.
The interfaces 708 and 713 may afford both serial and parallel connectivity, the former typically being implemented according to the Universal Serial Bus (USB) standards and having corresponding USB connectors (not illustrated). Storage devices 709 are provided and typically include a hard disk drive (HDD) 710. Other devices such as a floppy disk drive and a magnetic tape drive (not illustrated) may also be used. An optical disk drive 712 is typically provided to act as a non-volatile source of data. Portable memory devices, such as optical disks (e.g., CD-ROM, DVD), USB-RAM, and floppy disks, may then be used as appropriate sources of data to the system 700.
The components 705 to 713 of the computer module 701 typically communicate via an interconnected bus 704 and in a manner which results in a conventional mode of operation of the computer system 700 known to those in the relevant art. Examples of computers on which the described arrangements can be practised include IBM-PCs and compatibles, Sun Sparcstations, Apple Mac™ or like computer systems evolved therefrom.
Typically, the application programs discussed above are resident on the hard disk drive 710 and read and controlled in execution by the processor 705. Intermediate storage of such programs and any data fetched from the networks 720 and 722 may be accomplished using the semiconductor memory 706, possibly in concert with the hard disk drive 710. In some instances, the application programs may be supplied to the user encoded on one or more CD-ROM and read via the corresponding drive 712, or alternatively may be read by the user from the networks 720 or 722. Still further, the software can also be loaded into the computer system 700 from other computer readable media. Computer readable storage media refers to any storage medium that participates in providing instructions and/or data to the computer system 700 for execution and/or processing. Examples of such media include floppy disks, magnetic tape, CD-ROM, a hard disk drive, a ROM or integrated circuit, a magneto-optical disk, or a computer readable card such as a PCMCIA card and the like, whether or not such devices are internal or external of the computer module 701. Examples of computer readable transmission media that may also participate in the provision of instructions and/or data include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the Internet or Intranets including e-mail transmissions and information recorded on Websites and the like.
The second part of the application programs and the corresponding code modules mentioned above may be executed to implement one or more graphical user interfaces (GUIs) to be rendered or otherwise represented upon the display 714. Through manipulation of the keyboard 702 and the mouse 703, a user of the computer system 700 and the application may manipulate the interface to provide controlling commands and/or input to the applications associated with the GUI(s).
The method of determining a geometrical relationship between a first image and a second image may alternatively be implemented in dedicated hardware such as one or more integrated circuits performing the functions or sub-functions of coarsely aligning images, determining locations of correlatable image patches, determining a warp map, determining image areas, and determining image relationships. Such dedicated hardware may include graphic processors, digital signal processors, or one or more microprocessors and associated memories.
Disclosed herein is a method of determining a full warp map that relates a corrupted image to a reference image. The method includes the steps of: coarsely aligning said corrupted image against said reference image by a rotation-scaling-translation transformation; using a theoretical variance bound to determine locations and sizes of a plurality of correlatable image patches from said reference image; and determining said warp map based on said patches.
In one embodiment, said theoretical variance bound is a Cramer-Rao lower bound of two-dimensional shift estimation.
In another embodiment, the variance bound is computed from a linear combination channel from the reference image.
In a further embodiment, the images are obtained from the group of images consisting of: (a) digital images of a document; (b) a video sequence that contains independently moving objects; (c) digital images of a scene with overlapping content; and (d) a digital image of a test chart.
It is apparent from the above that the arrangements described are applicable to the computer, data processing, and image processing industries.
The foregoing describes only some embodiments of the present invention, and modifications and/or changes can be made thereto without departing from the scope and spirit of the invention, the embodiments being illustrative and not restrictive.
Number | Date | Country | Kind |
---|---|---|---|
2007240236 | Dec 2007 | AU | national |
Number | Date | Country | |
---|---|---|---|
20090148051 A1 | Jun 2009 | US |