1. Field of Invention
Embodiments of the invention relate generally to image transformations. More specifically, at least one embodiment relates to methods for determining a transformation that aligns two images.
2. Discussion of Related Art
Image alignment or registration is employed in a wide variety of fields, including cartography, medical imaging systems, and machine vision systems, to name a few. In general, proper alignment is achieved by identifying a transformation of an image relative to the image with which it is being aligned. Image alignment or registration may be used in various applications. For example, two images of the same subject may be "overlaid" on one another once they are aligned. In addition, adjacent images of a common subject may be aligned to generate a single large image from a plurality of adjacent images. Further, image alignment may be employed to determine a change in position of the subject of the images or of the camera that is generating the images.
Image registration may be used to evaluate a first image when compared with a second image of the same subject, or to generate a "fused" image that combines features from a plurality of images. In cartography, multiple images that are properly aligned may be combined into a fused image; for example, a satellite image may be overlaid with an image of a topographic map. In medicine, information from a first imaging system (e.g., an MRI system) may be combined with information from a second imaging system (e.g., a computed tomography system) to assist in better monitoring a patient's condition.
In a further medical imaging example, processes may also employ image registration to align two images that are acquired as follows: a first image acquired before injection of a colored liquid, and a second image acquired after injection of the liquid. A subtraction operation is then performed on the images to highlight the colored features (e.g., veins). Image registration is employed prior to the subtraction operation to account for movement of the subject between the two acquisitions.
Image registration may also be used with an "image stitching" or "image mosaicing" operation in which multiple images of a scene are taken from various positions and/or angles. The images are registered and then combined to form a single larger and/or higher-resolution image. For example, these processes may be employed in cartography, to combine aerial photographs of land; in photography, to create a panoramic image; in astronomy, to create a large image of the sky; or, generally, to combine multiple smaller adjacent images to construct a large image.
Further uses of image registration processes include: camera calibration, in which camera distortion parameters (e.g., barrel or pincushion distortion) are estimated so that they may be compensated for; "super-resolution," in which a high-resolution image of a scene is generated from multiple precisely aligned lower-resolution images of the same subject; and determining the position of a camera with respect to an object having a known position. The preceding are just some of the uses for an image registration process that aligns two or more images. The utility of image registration continues to expand with the continued increase in the capabilities of both image capture systems and systems used for image processing/analysis.
Image alignment can be performed on a first and a second image using a transformation process whereby a first image is "moved" through a variety of positions and/or changes in orientation relative to a second image. That is, at least one of the images undergoes a geometrical transformation, e.g., warping. The transformation process may vary the parameters ("transformation parameters") employed to move the first image through a variety of positions/orientations until a close (e.g., identical) or otherwise optimal alignment is achieved between the first image and the second image. A transformation can be represented using various parameters. For example, a transformation can be represented using parameters for each of: a translation in the x direction; a translation in the y direction; a rotation; a scale change in the x direction; a scale change in the y direction; a perspective change in the x direction; a perspective change in the y direction; and a shearing.
Generally, a transformation is performed as a mathematical operation in which the coordinates in the first image are transformed into corresponding coordinates in the second image or vice versa. As explained below, transformation processes may operate on image coordinates represented in a vector format.
In the case of two or more images that are not aligned, the images will have different domains and, consequently, different sets of coordinates.
Various prior art approaches have been proposed to find a transformation that aligns two images. Some of these approaches employ a process that includes a minimization algorithm to minimize the difference between the images to find the best alignment, that is, to determine the set of transformation parameters that most closely aligns the images. However, these prior art approaches employ the intensity-level difference on a pixel-by-pixel basis across the entire image to determine how well the first and second images are aligned, that is, how closely the intensity levels of the images are matched. Because these approaches are not effective unless the user has a very good initial estimate of the transformation parameters, they generally reduce the level of detail in the images (i.e., smooth the images) before the images are processed by the algorithm, to allow the algorithm to converge to the correct solution. As a result, these approaches may eliminate details that would have been useful in properly aligning the images.
Other approaches evaluate the registration of images by evaluating regions within the images. These approaches attempt to minimize a distance between matched regions; as a result, proper alignment is determined based on the separation distance alone. However, these approaches fail to take into account characteristics of the match between regions. For example, they address neither the quality of a region match nor its strength as a first region is shifted in any one of a plurality of directions relative to a second region with which it is being matched.
Current approaches also fail to provide a method to reduce or minimize the quantity of regions that should be matched to provide an optimal alignment between images.
Embodiments of the invention provide a method for image processing that may optimize a transformation that aligns images by selecting a region in a first image and a region that corresponds to the selected region in a second image. In addition, embodiments of the invention provide a method of selecting regions for alignment according to mathematical characteristics of the score function calculated from a set of previously selected regions. Further embodiments may determine an alignment based on a score function calculated for a few matched pairs of corresponding regions to increase the accuracy of the alignment and eliminate the need to smooth the images.
In one aspect, the invention provides a method of determining a transformation between a first image and a second image. In one embodiment, the method includes acts of, for at least one region in the first image, determining a corresponding region in the second image as a function of a transformation parameter and determining a similarity function as a function of the transformation parameter between the region in the first image and the corresponding region in the second image. In one embodiment, the method further includes acts of determining a similarity function between the first image and the second image using the similarity function for the at least one region, and determining a value of the transformation parameter, wherein the value substantially optimizes the similarity function between the first image and the second image.
In a further aspect, the invention provides, in a computer system having a display, a method of determining a transformation between a first image and a second image where the method includes acts of, for at least one region in the first image, determining a corresponding region in the second image as a function of a transformation parameter and determining a similarity function as a function of the transformation parameter between the region in the first image and the corresponding region in the second image. In one embodiment, the method further includes acts of determining a similarity function between the first image and the second image using the similarity function for the at least one region, determining a value of the transformation parameter, wherein the value substantially optimizes the similarity function between the first image and the second image, and displaying to a user a result of a transformation of at least one of the first image and the second image according to the value of the transformation parameter.
In another aspect of the invention, a computer readable medium is encoded with a program for execution on a processor, the program, when executed on the processor, performing a method of determining a transformation between a first image and a second image, the method comprising acts of, for at least one region in the first image, determining a corresponding region in the second image as a function of a transformation parameter and determining a similarity function as a function of the transformation parameter between the region in the first image and the corresponding region in the second image. In one embodiment, the method further includes acts of determining a similarity function between the first image and the second image using the similarity function for the at least one region, and determining a value of the transformation parameter, wherein the value substantially optimizes the similarity function between the first image and the second image.
The accompanying drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures is represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing.
This invention is not limited in its application to the details of construction and the arrangement of components set forth in the following description or illustrated in the drawings. The invention is capable of other embodiments and of being practiced or of being carried out in various ways. Also, the phraseology and terminology used herein are for the purpose of description and should not be regarded as limiting. The use of "including," "comprising," "having," "containing," "involving," and variations thereof herein is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.
Vectors referred to herein are represented in bold text.
As employed herein, in various embodiments, an "image" can be defined as a set of positional coordinates in a coordinate system in which each positional coordinate has one or more associated attributes. For example, an image can be a binary, monochrome, or color image, such as one acquired by a camera (e.g., sensing visible, X-ray, or infrared light), acquired by a radar, or constructed synthetically. As described previously, in the case of a grayscale image, a positional coordinate has an associated grayscale intensity value I(x). In the case of a color image, a positional coordinate has several attributes representing, for example, the intensities of each of the color bands (e.g., Red, Green, Blue) or the intensities of the luminance and chrominance components (e.g., YC). In this case, a positional coordinate has a vector-valued intensity I(x), e.g., I(x)=[R(x), G(x), B(x)].
Furthermore, an image can also be the result of applying any number of image processing steps to a prior image. For example, the image can be the result of an edge detection process. In this example, the “edge image” can include a set of positional coordinates corresponding to edge points, called “edge elements” or “edgels”. The edge elements can include one or more attributes relating to the gradient of the edge element (e.g., intensity, angle and polarity), one or more attributes expressing a relationship between this edge element and adjacent edge points, etc.
According to one embodiment, the image IAonB can be obtained using a mapping function xA = fBA(xB). The function fBA takes a position xB in the coordinate system BB of image IB as a variable and returns a position xA in the coordinate system BA of image IA. The mapping function can be written in non-vector form as:
$x_A = f_x^{BA}(x_B, y_B)$ (Eq. 1)
$y_A = f_y^{BA}(x_B, y_B)$ (Eq. 2)
In a further embodiment, each position xB belonging to the domain DB of image IB is selected, in turn, to obtain the warped image IAonB. The selected position xB is mapped to a position xA in the coordinate system BA of image IA using the mapping function xA = fBA(xB).
$I_{AonB}(\mathbf{x}_B) = I_A(\mathbf{x}_A) = I_A(f^{BA}(\mathbf{x}_B))$ (Eq. 3)
If the mapped position xA falls outside the domain DA of image IA, i.e., where IA(xA) is undefined for xA, then the value of the image IAonB at position xB is undefined.
According to one embodiment, the geometric transformation (e.g., the warp operation) applied to the image IA to generate IAonB(xB) can be written as a function of a vector of transformation parameters, $\boldsymbol{\alpha} = (\alpha_0\ \alpha_1\ \ldots\ \alpha_{N-1})^T$. For example, a translation can be written as a function of a transformation vector $\boldsymbol{\alpha} = (\alpha_x\ \alpha_y)^T$, where αx and αy are the displacements in the x and y directions, respectively. In a further embodiment, a third parameter αθ can be employed to represent a rotation. According to one embodiment, a "full perspective transformation" employs a total of N=8 parameters. As mentioned above, these parameters may include: translation in the x direction; translation in the y direction; rotation; scale change in the x direction; scale change in the y direction; perspective change in the x direction; perspective change in the y direction; and shearing. In various embodiments, the transformation vector α need not include all of these parameters and may include any one alone, any combination of two or more of the preceding parameters, or any combination of the preceding parameters and other parameters. Thus, the preceding list of parameters is not restrictive, and in other embodiments the transformation parameters can include one or more parameters other than those listed here. The parameters that are employed may, in one embodiment, be determined based on the application with which the transformation process is employed.
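By way of a non-limiting illustration, the following Python sketch shows one way in which such a transformation vector might be assembled into a mapping function. The parameter ordering and the composition of the translation, rotation, scale/shear, and perspective factors below are assumptions made for this example only; this description does not prescribe a particular arrangement.

```python
import numpy as np

def f_BA(x_B, alpha):
    """Map positions x_B (array of shape (M, 2), columns x and y) in image B
    to positions x_A in image A.

    Assumed parameter layout (illustrative only):
    alpha = (tx, ty, theta, sx, sy, px, py, sh).
    """
    tx, ty, theta, sx, sy, px, py, sh = alpha
    c, s = np.cos(theta), np.sin(theta)
    S = np.array([[sx, sh, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]])   # scale and shear
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])       # rotation
    T = np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])  # translation
    P = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [px, py, 1.0]])  # perspective
    H = T @ R @ S @ P                        # full 8-parameter perspective transform
    p = np.hstack([x_B, np.ones((len(x_B), 1))])   # homogeneous coordinates
    w = p @ H.T
    return w[:, :2] / w[:, 2:3]              # back to inhomogeneous coordinates x_A
```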
According to one embodiment, the mapping function xA = fBA(xB) and the image IAonB(xB) are functions of the transformation vector α, where:
$\mathbf{x}_A = f^{BA}(\mathbf{x}_B, \boldsymbol{\alpha})$ (Eq. 4)
$I_{AonB}(\mathbf{x}_B, \boldsymbol{\alpha}) = I_A(f^{BA}(\mathbf{x}_B, \boldsymbol{\alpha}))$ (Eq. 5)
According to this embodiment, IAonB(α) represents the image obtained by transforming the image IA onto the domain DB of the image IB according to a transformation vector α.
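As a non-limiting sketch of Eq. 5, the following Python function warps IA onto the domain DB by evaluating IA at the mapped positions. It reuses the hypothetical f_BA sketch above, uses nearest-neighbour sampling for simplicity, and marks positions that map outside DA as undefined (NaN), as described in the text.

```python
import numpy as np

def warp_A_on_B(I_A, shape_B, alpha):
    """Compute I_AonB(x_B, alpha) = I_A(f_BA(x_B, alpha)) over domain D_B (Eq. 5).

    Returns the warped image and a mask of positions where it is defined.
    """
    ys, xs = np.mgrid[0:shape_B[0], 0:shape_B[1]]
    x_B = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
    x_A = f_BA(x_B, alpha)                           # map each x_B into image A
    col = np.rint(x_A[:, 0]).astype(int)
    row = np.rint(x_A[:, 1]).astype(int)
    inside = (row >= 0) & (row < I_A.shape[0]) & (col >= 0) & (col < I_A.shape[1])
    out = np.full(x_B.shape[0], np.nan)              # undefined outside D_A
    out[inside] = I_A[row[inside], col[inside]]
    return out.reshape(shape_B), inside.reshape(shape_B)
```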
In accordance with another embodiment, a similarity measure G may be employed to determine the similarity between two images defined over a common domain. According to one embodiment, a similarity measure is used to determine how close in appearance two images are to one another. Embodiments of the invention may employ, for example, similarity measures that include a Normalized Grayscale Correlation (NGC) or other approaches that are known by those of skill in the art.
In other words, in one embodiment, the similarity measure G(IA, IB, α) measures the similarity between the image portions IAonB(α) and IB that are defined over their common domain.
As described above, in one embodiment, the similarity measure G(IA, IB, α) determines the similarity as a function of the transformation vector α. Accordingly, a change in one or more of the parameters included in the transformation vector α may result in a change of the value of the similarity measure G(IA, IB, α). In accordance with a further embodiment, a process determines a value αMAX of the transformation vector α = (α0 α1 . . . αN-1)T that maximizes the similarity measure G(IA, IB, α) measured between the image IB and the image IAonB(α). In a version of this embodiment, the image IAonB(α) is obtained by warping an image IA onto the domain DB of the image IB according to the transformation vector α. Accordingly, for given images IA and IB, the similarity measure G(IA, IB, α) changes according to changes in one or more of the parameters α0, α1, . . . αN-1 in α. That is, the similarity measure varies when there is, for example, a change in the translation in the x direction, a change in the translation in the y direction, a change in rotation, or any other change that is described by the transformation vector α.
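As one concrete possibility, a Normalized Grayscale Correlation between two image portions defined over a common domain may be computed as in the following sketch. The zero-mean normalized formula below is one common definition and is given here only as an assumption; this description does not fix a particular formula.

```python
import numpy as np

def ngc(patch_a, patch_b):
    """Normalized grayscale correlation between two equally sized image portions.

    Returns a value in [-1, 1]; larger values indicate a closer match.
    """
    a = patch_a - patch_a.mean()
    b = patch_b - patch_b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0
```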
At act 100, the transformation vector α is initialized to an estimated value. According to one embodiment, the estimated value provides a starting point that allows the process to more rapidly converge to the value αMAX. The estimated value can be acquired in a number of ways; for example, it can be provided by the user, it can be an output of a registration process that is less accurate than that provided by the process 60 (e.g., it can be the result of a "rough" registration process), or it can be determined based on known positions of the cameras that acquired the images, etc. That is, according to one embodiment where a camera takes two photos of the subject, the estimated value may be determined based on a known amount of panning of the camera between the acquisition of the first image and the acquisition of the second image. The panning may result in the two images being offset by a certain translation, rotation, etc.
At act 200, the image IAonB(α) obtained by mapping the image IA onto the domain DB of the image IB according to the current value of the transformation vector α is determined. Thus, for the first iteration (and possibly the only iteration), the image IAonB(α) is determined using the estimated value. According to one embodiment, the image IAonB(α) is determined by the previously described mapping function. That is, IAonB(xB, α) is determined for each position xB in the domain DB, as shown in Equation 5, above.
As mentioned above, the transformation vector α can include a plurality of transformation parameters, i.e., α = (α0 α1 . . . αN-1)T. These parameters may be represented as the parameters αk, where a change in a parameter is represented as Δαk. In various embodiments, the transformation vector should be considered a variable, since it is changed to reach a better value αMAX. When it is considered a variable, it is denoted α′ = (α′0 α′1 . . . α′N-1)T. The change between the variable α′ and the current value of α is denoted Δα = α′ − α = (Δα0 Δα1 . . . ΔαN-1)T.
According to one embodiment, at act 300, a similarity function, G(IA, IB, α′), is first determined. This function approximates, as a function of the transformation vector α′, the similarity between the image IB and the warped image IAonB(α′), where the image IAonB(α′) is obtained by warping the image IA according to the changed transformation vector α′. In a further embodiment, the value α′MAX of the transformation vector α′ that maximizes the similarity function G(IA, IB, α′) is determined. In some embodiments, act 300 may be replaced by a plurality of separate acts; for example, the act of determining the value α′MAX may be included as a separate act.
As mentioned above, in some embodiments, the process 60 may include one or more iterations where each iteration generates the image IAonB(α) for the current value of the transformation vector α. The determination of whether to repeat the acts 200 and 300 may be made by a user familiar with the overall process with which the process 60 is employed, e.g., the user may determine that the estimated value of the transformation vector α is suitable to allow the similarity function G to converge without iterating. Alternatively, an act 400 may be included in some embodiments to automatically determine whether further adjustment to the transformation vector α and corresponding repetition of acts 200 and 300 is desirable.
At act 400, it is determined whether acts 200 and 300 should be recalculated with a new value of the transformation vector α. According to one embodiment, at act 400, it is determined whether the transformation vector α has converged. One test of convergence is to check whether the modified transformation vector α′MAX is closer to the current transformation vector α than a predefined threshold value, ΔαT. In one embodiment, this threshold value can be determined by evaluating the changes in the transformation parameters that cause a displacement of the content of image IAonB(α) smaller than a predefined distance expressed in pixels, for example, less than one pixel. A user familiar with the overall process with which the process 60 is employed may determine the level of precision that is required when calculating the position of one image relative to the other.
According to one embodiment, if acts 200 and 300 are not to be repeated, the process moves to act 500 where the value αMAX sought by the process 60 is set to the value of the transformation vector α′MAX determined at act 300. In accordance with one embodiment, act 500 completes the process 60, e.g., the value of the transformation vector α that best aligns the subject images is known.
According to one embodiment, if acts 200 and 300 are to be repeated, at act 600, the transformation vector α is replaced by the transformation vector α′MAX determined at act 300, and the process 60 returns to act 200.
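The overall loop of the process 60 (acts 100 through 600) might be sketched as follows. The helper maximize_similarity is a hypothetical stand-in for act 300 (the process 70, described below) and is not defined in this excerpt; the convergence test follows act 400.

```python
import numpy as np

def process_60(I_A, I_B, alpha0, tol, max_iter=20):
    """Sketch of acts 100-600 of the process 60 (assumptions noted above)."""
    alpha = np.asarray(alpha0, dtype=float)                    # act 100: initial estimate
    for _ in range(max_iter):
        I_AonB, defined = warp_A_on_B(I_A, I_B.shape, alpha)   # act 200: warp I_A onto D_B
        alpha_max = maximize_similarity(I_AonB, I_B, alpha)    # act 300: hypothetical helper
        if np.linalg.norm(alpha_max - alpha) < tol:            # act 400: converged?
            return alpha_max                                   # act 500: done
        alpha = alpha_max                                      # act 600: iterate
    return alpha
```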
Various embodiments of the invention employ the transformation vector to determine a similarity measure on a region-by-region basis to determine the match between the image IB and the image IAonB(α), that is, as the image IA is transformed onto the domain DB with the transformation vector α.
In accordance with one embodiment, for each region Ri, a similarity measure Gi(IAonBi(α), IBi) (e.g., a "local" similarity measure) is determined between the image portions IAonBi(α) and IBi. Although at least three similarity measures (i.e., G1, G2, G3) may be determined in an example having three regions, any quantity of regions may be processed in this manner.
In a further embodiment, the similarity measure G(IA, IB, α) (e.g., a "global" similarity measure) is determined by summing the similarity measures Gi(IAonBi(α), IBi) obtained for each of the regions Ri:
$G(I_A, I_B, \boldsymbol{\alpha}) = \sum_{i=1}^{N_R} G_i(I_{AonB_i}(\boldsymbol{\alpha}), I_{B_i})$ (Eq. 6)
where NR is the total number of regions Ri.
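A direct, non-limiting sketch of Eq. 6 follows. The rectangle format describing the regions Ri is a hypothetical input convention, and ngc (from the sketch above) stands in for the local similarity measure Gi.

```python
def global_similarity(I_AonB, I_B, regions):
    """Eq. 6: the global measure is the sum of the per-region local measures.

    `regions` is assumed to be a list of ((row, col), (height, width)) rectangles.
    """
    total = 0.0
    for (r0, c0), (h, w) in regions:
        total += ngc(I_AonB[r0:r0 + h, c0:c0 + w], I_B[r0:r0 + h, c0:c0 + w])
    return total
```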
According to one embodiment, at act 310, a quantity NR of regions Ri is established. According to one embodiment, a fixed quantity, for example, 20 regions, is established. In another embodiment, the quantity of regions is determined by calculating the quantity of regions that would cover a fixed fraction of the image domain. In still another embodiment, a fixed maximum quantity of regions is set, for example, 20, but this total can be lowered in act 321, described further below, if the calculation of the value of α′MAX has less than a predefined amount of uncertainty.
At act 320, in one embodiment, a rectangular region Ri in the domain DB of image IB is selected, for example, the region IB3.
According to one embodiment, for each region Ri, the effect of the change Δα of the transformation vector α′ is approximated by a translation represented by a translation parameter Δxi. That is, although the transformation vector α may include transformation parameters concerning rotation, scale, etc., the resulting "movement" of individual regions IAonBi(α) for different values of the transformation vector α′ may, in one embodiment, be approximated by purely translational movement. This approach may be referred to as "approximation by translation."
In accordance with one embodiment, at act 330, a similarity function Gi(Δxi), expressed in terms of a translation vector Δxi, is determined for region Ri. This "local" or "region-based" similarity function Gi(Δxi) approximates the behavior of the similarity between the image IB and the image IAonB(α) inside region Ri as the image IAonB(α) is translated by the translation vector Δxi.
In one embodiment, at act 340, the translation parameter Δxi is expressed in terms of the transformation change vector Δα. As a result, in this embodiment, a similarity function Gi(α′), where α′ = α + Δα (e.g., a local similarity function), is determined for region Ri as a function of the transformation vector α′. The similarity function Gi(α′) approximates the behavior of the similarity between the image IB and the image IAonB(α′) inside region Ri.
According to one embodiment, at act 345, the similarity function Gi(α′) determined at act 340 is added to a sum of the similarity functions Gi(α′) determined for each of the processed regions Ri. That is, according to one embodiment, the similarity function Gi(α′) is determined in turn for a plurality of regions R1, R2, and R3, processed in numerical order. A first similarity function G1(α′) is determined for the first region; at this stage, the sum at act 345 equals G1(α′). A second similarity function G2(α′) is then determined for the second region and added to the previous sum, which becomes G1(α′) + G2(α′). Finally, a third similarity function G3(α′) is determined for the third region and added, so that the sum at act 345 becomes G1(α′) + G2(α′) + G3(α′).
According to one embodiment, at act 350, if all NR regions Ri have been processed, the process proceeds to act 360. Alternatively, where a similarity function Gi(α′) has not been determined for all the regions selected for processing the process returns to act 320.
In accordance with one embodiment, at act 360, the similarity function G(IA, IB, α′) (e.g., the global similarity function) is established as the final value of the sum:
$G(I_A, I_B, \boldsymbol{\alpha}') = \sum_{i=1}^{N_R} G_i(\boldsymbol{\alpha}')$ (Eq. 7)
That is, once all the desired regions have been processed, the similarity function G(IA, IB, α′) is known for a particular value of the transformation vector. According to one embodiment, the value of the transformation vector α is known, so the "global" similarity function is strictly a function of the transformation vector α′ and, therefore, denoted G(α′). At act 360, the value α′MAX of the transformation vector α′ that maximizes the generated similarity function G(α′) is determined.
Thus, at act 360, in one embodiment, a similarity function for a first value of the transformation vector is known. Further, a value of the transformation vector that maximizes the similarity function for that first value of the transformation vector is determined at act 360 in accordance with one embodiment.
In accordance with one embodiment, each succeeding value of the transformation vector is established by replacing the value of the immediately preceding transformation vector α by the transformation vector α′MAX that maximizes the similarity function for the preceding value of the transformation vector. The acts described for the process 70 may be repeated for each succeeding value of the transformation vector, if necessary.
In accordance with one embodiment, further details concerning the act 330 are provided by the process 80, described below.
As is explained below, embodiments of the process 80 determine the similarity function Gi(Δxi) by applying a plurality of translation vectors to the region Ri. Each translation vector applied to the region Ri is denoted Δxij. That is, for a first region R1, the translation vectors Δx1j are applied to determine the similarity function G1(Δx1j). According to one embodiment, a total of M translations Δxij is established; therefore, for the first region R1, the translation vectors Δx11 to Δx1M are applied to determine the similarity function G1(Δx1j). In a further embodiment, each translation can be measured by the quantity of pixels by which the region Ri is shifted in either or both of the x direction and the y direction. In accordance with one embodiment, each translation Δxij, where j = 1 to M, provides a shift in at least one of the x direction and the y direction by an integer number of pixels. In a further embodiment, the translation vectors are chosen to include all the vectors that contain integer values for the shifts in the x and y directions between a minimum (possibly negative) and a maximum value.
According to the above-described embodiment, the region Ri moves relative to the image IAonB(α) according to the change in position resulting from each translation. As a result, in one embodiment, the length of each translation determines the distance by which the region Ri moves relative to the image IAonB(α).
According to one embodiment, the maximum length of the translations Δxij is chosen based on an estimate of the uncertainty (e.g., provided by the user) associated with the estimated value of the transformation vector α which is initially employed. For example, if the camera that acquired the images has moved by at most a known distance between the acquisitions of the two images, this maximum distance can be transformed into a maximum translation between the corresponding positions in the two images.
In accordance with one embodiment, at act 334, the similarity measure Gi(Δxij) (e.g., the local similarity measure) between the image IAonBi(α, Δxij) and IBi is determined. In one embodiment, a Normalized Grayscale Correlation (NGC) is used to determine the similarity measure Gi(Δxij) measured between the two image portions. Other approaches may be employed, however, to determine the similarity measure Gi(Δxij) measured between two image portions. As one example, the similarity measure Gi(Δxij) can be determined using other similarity measures that can be calculated locally, e.g., the sum of absolute differences. Preferably, the type of similarity measure is chosen based on the type of the images (e.g., grayscale image, edge image) and/or characteristics of the images.
For example, for two grayscale images acquired by different types of sensors (e.g., radar and visible), a similarity measure known as "Mutual Information" is a suitable similarity measure.
According to one embodiment, at act 336, a determination is made whether each of the total of M translations has been processed. According to this embodiment, the process 80 proceeds to act 338 if all M translations Δxij have been processed. Alternatively, the process 80 returns to act 332, at which another translation Δxij is selected. As mentioned above, in one embodiment, the acts 332 and 334 are repeated for all of the total of M translations.
According to one embodiment, a similarity measure vector Gi(Δxij) can be defined for the region Ri once all the translations Δxij have been processed. In a version of this embodiment, the similarity measure vector Gi(Δxij) is determined following act 336. In a further embodiment, the similarity measures Gi(Δxij) are determined for each of the total of M translations Δxij calculated for the region Ri:
$\mathbf{G}_i = \left[\,G_i(\Delta\mathbf{x}_{i1}),\ G_i(\Delta\mathbf{x}_{i2}),\ \ldots,\ G_i(\Delta\mathbf{x}_{iM})\,\right]$ (Eq. 8)
One of skill in the art will recognize that Equation 8 describes a discrete, non-parameterized version of the similarity function Gi(Δxi) (e.g., the local similarity function for the region Ri determined by the process 80 in accordance with one embodiment).
In one embodiment, a maximum (or peak) in the similarity measure vector Gi(Δxij) for the region Ri is determined at act 338, e.g., a "local" maximum or peak is determined. This maximum can be determined in a number of ways. For example, the absolute maximum (or highest peak) can be determined. In this embodiment, the translation vector Δxij that generates the maximum is identified as $\Delta\mathbf{x}_{0i} = (\Delta x_{0i}\ \Delta y_{0i})^T$. The similarity measure Gi(Δx0i) that corresponds to the maximum can be identified as G0i.
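Acts 331 through 338 might be sketched as follows: the local similarity measure is evaluated for every integer translation within a chosen range, and the peak is selected. The helper ngc from the earlier sketch stands in for the local measure, and the region is assumed to lie inside the defined domain of IAonB(α) for all tested shifts.

```python
import numpy as np

def local_similarity_surface(I_AonB, I_B, top_left, size, max_shift):
    """Evaluate G_i over all integer shifts in [-max_shift, max_shift]^2
    and return the surface together with the peak shift Dx_0i (act 338)."""
    r0, c0 = top_left
    h, w = size
    ref = I_B[r0:r0 + h, c0:c0 + w]                  # region R_i in image B
    shifts = list(range(-max_shift, max_shift + 1))
    G = np.full((len(shifts), len(shifts)), -np.inf)
    for a, dy in enumerate(shifts):
        for b, dx in enumerate(shifts):
            rr, cc = r0 + dy, c0 + dx
            if rr >= 0 and rr + h <= I_AonB.shape[0] and cc >= 0 and cc + w <= I_AonB.shape[1]:
                G[a, b] = ngc(I_AonB[rr:rr + h, cc:cc + w], ref)
    a0, b0 = np.unravel_index(np.argmax(G), G.shape)
    dx0 = np.array([shifts[b0], shifts[a0]])         # peak shift (dx, dy)
    return G, dx0
```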
In accordance with a further embodiment, at act 339, the similarity measure vector Gi(Δxi) is approximated by a paraboloid centered at the selected peak. According to one embodiment, the similarity function is approximated using a paraboloid form around the chosen peak as follows:
$G_i(\Delta\mathbf{x}_i) = G_{0i} + a(\Delta x_i - \Delta x_{0i})^2 + 2b(\Delta x_i - \Delta x_{0i})(\Delta y_i - \Delta y_{0i}) + c(\Delta y_i - \Delta y_{0i})^2$ (Eq. 9)
Equation 9 can, with matrix notation, be expanded and described as:
$G_i(\Delta\mathbf{x}_i) = G'_{0i} + 2\mathbf{b}_i^T \Delta\mathbf{x}_i + \Delta\mathbf{x}_i^T A_i\,\Delta\mathbf{x}_i$ (Eq. 10)
where
$A_i = \begin{pmatrix} a & b \\ b & c \end{pmatrix}, \qquad \mathbf{b}_i = -A_i\,\Delta\mathbf{x}_{0i}, \qquad G'_{0i} = G_{0i} + \Delta\mathbf{x}_{0i}^T A_i\,\Delta\mathbf{x}_{0i}$ (Eq. 11)
The resulting matrix Ai is a 2×2 symmetric matrix with negative or zero eigenvalues because the similarity function decreases as Δxi moves away from the peak value G0i at Δx0i.
In one embodiment, the determination of the parameters Ai and bi describing the paraboloid can be done by calculating the values of the similarity function Gi(Δxi) at a few translations around the chosen peak (for example, translations in 8 directions, each 1 pixel away) and by calculating the best-fitting paraboloid using a least-squares method. In a further embodiment, the eigenvalues of Ai should be limited to negative or zero values. In another embodiment, the paraboloid parameter Ai can be extracted from the similarity measure of the region Ri with a translated version of itself. In another embodiment, the paraboloid parameters Ai and bi can be modified to reduce the curvature of the paraboloid in the direction of the other (non-selected) peaks to represent the uncertainty of the position in this direction. One skilled in the art will appreciate that, although a paraboloid is used to approximate the similarity function Gi(Δxi), various other types of parametrized functions, for example a Gaussian function, can also be used.
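One of the options named above — sampling the eight one-pixel neighbours of the peak, fitting Eq. 9 by least squares, and clipping the eigenvalues of Ai to be non-positive — could be sketched as follows. The peak is assumed not to lie on the border of the sampled surface.

```python
import numpy as np

def fit_paraboloid(G, peak):
    """Least-squares fit of Eq. 9 around the peak of the similarity surface G.

    Returns G_0i and the 2x2 symmetric matrix A_i of Eqs. 10-11, with its
    eigenvalues clipped to be negative or zero as required above.
    """
    a0, b0 = peak                           # indices of the chosen peak in G
    G0 = G[a0, b0]
    rows, rhs = [], []
    for dv in (-1, 0, 1):                   # the 8 one-pixel neighbours
        for du in (-1, 0, 1):
            if du == 0 and dv == 0:
                continue
            rows.append([du * du, 2 * du * dv, dv * dv])
            rhs.append(G[a0 + dv, b0 + du] - G0)
    (a, b, c), *_ = np.linalg.lstsq(np.array(rows, float), np.array(rhs, float), rcond=None)
    A_i = np.array([[a, b], [b, c]])
    lam, V = np.linalg.eigh(A_i)            # clip eigenvalues to <= 0
    A_i = V @ np.diag(np.minimum(lam, 0.0)) @ V.T
    return G0, A_i
```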
In accordance with another embodiment, the act 340 includes an act of determining the translation vector Δxi, for region Ri, as a function of the transformation change vector Δα, as mentioned above.
The translation vector Δxi is the difference between the mapped position x′Bi and the original position xBi:
$\Delta\mathbf{x}_i = \mathbf{x}'_{Bi} - \mathbf{x}_{Bi}$ (Eq. 12)
As mentioned above, the translation resulting from the translation vector Δxi may be employed to approximate the effect of a change in one or more transformation parameters αk of the transformation vector α, regardless of whether the change includes any translation, that is, regardless of whether the change includes a change in translation in either the x or y direction.
According to one embodiment, the actual value of the translation vector Δxi (used to approximate the effect of the transformation change vector Δα on a particular region Ri) can be determined using the following equation:
$\Delta\mathbf{x}_i = \mathbf{x}'_{Bi} - \mathbf{x}_{Bi} = f^{AB}(f^{BA}(\mathbf{x}_{Bi}, \boldsymbol{\alpha}), \boldsymbol{\alpha} + \Delta\boldsymbol{\alpha}) - \mathbf{x}_{Bi}$ (Eq. 13)
In a further embodiment, the value of the translation vector Δxi can be determined by making another approximation. That is, according to one embodiment, the value of the translation vector Δxi is determined according to the following product:
$\Delta\mathbf{x}_i = J(\mathbf{x}_{Bi}, \boldsymbol{\alpha})\,\Delta\boldsymbol{\alpha} = J_i\,\Delta\boldsymbol{\alpha}$ (Eq. 14)
where J(xBi, α), or simply Ji, denotes the value of a Jacobian vector J(x, α) at position xBi of the image IAonB(α) for the current value of the transformation vector α.
The Jacobian vector J(x, α) is defined as shown in Equation 15:
$J(\mathbf{x}, \boldsymbol{\alpha}) = \begin{pmatrix} \frac{\partial f_x^{AB}(f^{BA}(\mathbf{x}, \boldsymbol{\alpha}), \boldsymbol{\alpha}')}{\partial \alpha'_0} & \cdots & \frac{\partial f_x^{AB}(f^{BA}(\mathbf{x}, \boldsymbol{\alpha}), \boldsymbol{\alpha}')}{\partial \alpha'_{N-1}} \\ \frac{\partial f_y^{AB}(f^{BA}(\mathbf{x}, \boldsymbol{\alpha}), \boldsymbol{\alpha}')}{\partial \alpha'_0} & \cdots & \frac{\partial f_y^{AB}(f^{BA}(\mathbf{x}, \boldsymbol{\alpha}), \boldsymbol{\alpha}')}{\partial \alpha'_{N-1}} \end{pmatrix}\Bigg|_{\boldsymbol{\alpha}'=\boldsymbol{\alpha}}$ (Eq. 15)
The first row of the Jacobian vector J(x, α) contains the partial derivatives, ∂fxAB(fBA(x, α), α′)/∂α′k, of the x-component of the function fAB(fBA(x, α), α′) with respect to each of the transformation parameters α′k of the transformation vector α′ = (α′0 α′1 . . . α′N-1)T. Each partial derivative approximates the displacement Δx caused by a small change in the respective parameter α′k. The second row contains the partial derivatives of the y-component.
The displacement of the region Ri relative to the image IAonB(α) is then the negative of this vector:
$-\Delta\mathbf{x}_i = -J(\mathbf{x}_{Bi}, \boldsymbol{\alpha})\,\Delta\boldsymbol{\alpha} = -J_i\,\Delta\boldsymbol{\alpha}$ (Eq. 16)
In one embodiment, the value J(xBi, α) (given by Equation 15) can be calculated according to the following steps:
1) Calculate the derivatives of the transformation functions fxBA and fyBA with respect to every transformation parameter α′k;
2) Calculate the position xAi in image IA by replacing the center of the region xBi and the current value of the transformation vector α in the transformation function: xAi = fBA(xBi, α);
3) Substitute the calculated position xAi in image IA and the current value of the transformation vector α into the calculated derivatives of the transformation functions fxBA and fyBA.
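In place of the analytic derivatives described in the steps above, one might also approximate J(xBi, α) numerically, per Eqs. 13 and 15, by central differences. The inverse mapping f_AB is assumed to be available (it is the B-direction counterpart of the f_BA sketch given earlier); at α′ = α the composition returns xBi itself, so only the second argument is varied.

```python
import numpy as np

def jacobian(x_Bi, alpha, eps=1e-6):
    """Central-difference approximation of the 2xN Jacobian J(x_Bi, alpha)
    of Eq. 15, using f_AB(f_BA(x_Bi, alpha), alpha') as in Eq. 13.

    `f_AB` (the inverse of f_BA) is assumed to exist in this sketch."""
    alpha = np.asarray(alpha, dtype=float)
    x_Ai = f_BA(np.atleast_2d(x_Bi), alpha)[0]      # fixed point in image A
    J = np.zeros((2, len(alpha)))
    for k in range(len(alpha)):
        d = np.zeros(len(alpha)); d[k] = eps
        x_plus = f_AB(np.atleast_2d(x_Ai), alpha + d)[0]
        x_minus = f_AB(np.atleast_2d(x_Ai), alpha - d)[0]
        J[:, k] = (x_plus - x_minus) / (2.0 * eps)  # one column per parameter
    return J
```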
Finally, substituting the translation parameter Δxi = JiΔα into the equation of the similarity function Gi(Δxi) determined at act 330 yields the similarity function Gi(α′) (e.g., the local similarity function) expressed in terms of the transformation change vector Δα:
$G_i(\boldsymbol{\alpha}') = G'_{0i} + 2\mathbf{b}_i^T J_i\,\Delta\boldsymbol{\alpha} + \Delta\boldsymbol{\alpha}^T J_i^T A_i J_i\,\Delta\boldsymbol{\alpha}$ (Eq. 17)
According to a further embodiment, all the local similarity measures are summed to determine the global similarity measure as follows:
$G(\boldsymbol{\alpha}') = \sum_{i=1}^{N_R} G_i(\boldsymbol{\alpha}') = G_0 + 2\mathbf{b}^T \Delta\boldsymbol{\alpha} + \Delta\boldsymbol{\alpha}^T \mathbf{A}\,\Delta\boldsymbol{\alpha}$ (Eq. 18)
where $G_0 = \sum_i G'_{0i}$, $\mathbf{b} = \sum_i J_i^T \mathbf{b}_i$, and $\mathbf{A} = \sum_i J_i^T A_i J_i$.
According to one embodiment, the value α′MAX that optimizes the similarity measure G(α′) is:
$\boldsymbol{\alpha}'_{MAX} = \boldsymbol{\alpha} - \mathbf{A}^{-1}\mathbf{b}$ (Eq. 19)
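Equations 17 through 19 amount to accumulating one quadratic model per region and solving a single linear system. A non-limiting sketch follows, assuming the per-region quantities Ji, Ai, and bi have been computed as in the earlier sketches.

```python
import numpy as np

def solve_update(alpha, region_models):
    """Combine the per-region quadratic models (Eqs. 17-18) and return
    alpha'_MAX = alpha - A^{-1} b (Eq. 19).

    `region_models` is assumed to be a list of (J_i, A_i, b_i) triples."""
    N = len(alpha)
    A = np.zeros((N, N))
    b = np.zeros(N)
    for J_i, A_i, b_i in region_models:
        A += J_i.T @ A_i @ J_i                  # quadratic term of Eq. 18
        b += J_i.T @ b_i                        # linear term of Eq. 18
    return np.asarray(alpha, float) - np.linalg.solve(A, b)   # Eq. 19
```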
According to one embodiment, the region is selected from among the set of candidate regions because it is the region that will reduce the uncertainty by the greatest amount (e.g., as measured by the indicators determined for each of the candidate regions). In various embodiments, other factors can be considered when selecting a region, for example, the contrast level in the candidate region and/or whether the candidate region contains objects. That is, it may be beneficial to select a region with a high contrast level and one that does not have a uniform appearance (e.g., does not consist solely of a solid or textured background).
$G_I(\boldsymbol{\alpha}') = \sum_{i=1}^{I} G_i(\boldsymbol{\alpha}')$ (Eq. 20)
where I is the number of regions Ri that have already been processed. The similarity function GI(α′) can be viewed as an "intermediate" version of the "final" similarity function G(α′) that is obtained once all the regions Ri have been processed. Accordingly, where a set of NR regions is selected and processed, the "final" similarity function G(α′) = GI(α′) results upon the processing of the NRth region, e.g., where I = NR. Further, according to this embodiment, it is the "final" similarity function G(α′) for which the vector α′MAX is determined at act 360.
In accordance with one embodiment, at act 321, the transformation parameter or combination of transformation parameters α′k having the greatest associated uncertainty is determined based on the similarity function GI(α′) determined at act 345 of the previous iteration in which the acts 320 to 345 are performed. For example, when the regions R1 and R2 have already been processed, act 320 proceeds to the selection of the third region and act 321 is performed based on the similarity function G2(α′). As is described in greater detail below, the identification of the transformation parameter(s) having the greatest associated uncertainty is achieved by determining the direction vmin in the space α′ in which the slope of the similarity function GI(α′) around its peak is the smallest.
According to one embodiment, at act 360 of the process 70, it may be desirable to provide a similarity function G(α′) for which the position α′MAX of the peak can be accurately located; that is, it may be desirable to provide a similarity function G(α′) whose value decreases sharply as the parameter α′ moves away from the value α′MAX. According to a further embodiment, it is desired to provide a similarity function G(α′) whose value decreases sharply as the parameter α′ moves away from the value α′MAX in all directions in the space α′.
Accordingly, in one embodiment, the process 90 identifies the transformation parameter(s) having the greatest associated uncertainty in the peak position, in the space of the transformation vector α′, of the similarity function G(α′). Returning to the similarity function GI(α′), it has a maximum value at position α′I,MAX in the space of α′. The direction in the space α′ for which the similarity function GI(α′) decreases the most slowly as the parameter α′ moves away from the value α′I,MAX (i.e., has the smallest slope) identifies the parameter or combination of parameters α′k having the greatest associated uncertainty. In other words, if processing the remaining regions does not sharpen the peak (i.e., increase the slope) in this direction of the space α′, there will be large uncertainty about the exact location of the peak in this direction in the final similarity function G(α′) (e.g., the final global similarity function).
In one embodiment, at act 322, a set of candidate regions Rk is acquired. In one embodiment, these candidate regions are extracted from a regular grid of positions on the image IB. In the illustrated embodiment, each of the candidate regions Rk is processed in the process 90.
According to a further embodiment, a candidate region Rk is selected at act 323.
In one embodiment, at act 329, an indicator is determined of how the processing of the selected candidate region Rk will reduce the uncertainty associated with the parameter or combination of parameters determined at act 321. That is, an indicator is determined concerning how the processing of the selected candidate region will increase the slope of the similarity function G(α′) around its peak in the direction vmin of the space α′.
According to a further embodiment, at act 327, if all candidate regions Rk have been processed, the process proceeds to act 328; otherwise, the process returns to act 323 where another of the candidate regions is selected.
In one embodiment, once all candidate regions Rk have been processed, at act 328, a region among the candidate regions is selected based on the indicators determined for each of the candidate regions. That is, if no other factors are considered when selecting a candidate region, the candidate region having the best indicator (e.g., the highest increase of slope) is selected because that region (among the set of candidate regions) will best assist in determining the location of the peak in the similarity function G(α′). According to one embodiment, the process 90 then proceeds to act 330.
Returning to act 321, as previously described, the similarity function GI(α′) is given by:
$G_I(\boldsymbol{\alpha}') = G_{0,I} + 2\mathbf{b}_I^T \Delta\boldsymbol{\alpha} + \Delta\boldsymbol{\alpha}^T \mathbf{A}_I\,\Delta\boldsymbol{\alpha}, \qquad \mathbf{A}_I = \sum_{i=1}^{I} J_i^T A_i J_i$ (Eq. 21)
Based on Equation 21, in one embodiment, the direction vmin in the space α′ in which the slope of the similarity function GI(α′) around its peak is the smallest can be determined in two primary steps. First, the matrix AI is diagonalized. In a version of this embodiment, the matrix AI is diagonalized according to the method described in William H. Press et al., Numerical Recipes in C: The Art of Scientific Computing, Second Edition, Cambridge University Press, 1992, pp. 456-495. In one embodiment, the result is a set of eigenvectors vh, each having a corresponding eigenvalue λh. Then, the eigenvector vmin whose eigenvalue λmin is the closest to zero is selected. The eigenvector vmin = (v0, v1, . . . vN-1)T identifies the direction in the space α′ of greatest uncertainty:
$v_0\,\alpha'_0 + v_1\,\alpha'_1 + \cdots + v_{N-1}\,\alpha'_{N-1} = 0$ (Eq. 22)
According to this embodiment, at act 325, the direction vmin in the space α′ for which the similarity function GI(α′) has the smallest slope is expressed as a direction Δxkmin in the space Δxk:
$\Delta\mathbf{x}_{k\,min} = J(\mathbf{x}_{Bk}, \boldsymbol{\alpha})\,\mathbf{v}_{min} = J_k\,\mathbf{v}_{min}$ (Eq. 25)
where J(xBk, α), or simply Jk, denotes the value of the Jacobian vector J(x, α) at position xBk of the warped image IAonB(α) for the current value of the transformation vector α.
At act 326, the slope of the estimated similarity function G′k(Δxk) in the direction Δxkmin is determined in accordance with this embodiment. The slope may be given by:
$\Delta G'_k = \sqrt{\mathbf{v}_{min}^T J_k^T A_k J_k\,\mathbf{v}_{min}}$ (Eq. 26)
The slope ΔG′k can then be used as the indicator for candidate region Rk concerning the effect of the region Rk on the similarity function. Although the slope is chosen as the indicator in this example, other embodiments may employ a different indicator of the usefulness of the estimated similarity function G′k(Δxk). For example, in one embodiment, the square of the slope in the direction Δxkmin is used as the indicator.
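Acts 321 through 328 might be sketched as follows: the direction vmin is the eigenvector of AI whose eigenvalue is closest to zero, and each candidate region Rk is scored by the slope indicator of Eq. 26. The (Jk, Ak) pairs for the candidates are assumed to have been computed as in the earlier sketches; the absolute value guards against the sign of the negative semi-definite Ak.

```python
import numpy as np

def select_candidate(A_I, candidates):
    """Act 321: find v_min, the direction of greatest uncertainty of G_I;
    acts 323-328: score each candidate region by Eq. 26 and pick the best.

    `candidates` is assumed to be a list of (J_k, A_k) pairs."""
    lam, V = np.linalg.eigh(A_I)            # eigenvalues in ascending order
    v_min = V[:, np.argmax(lam)]            # all eigenvalues are <= 0, so the
                                            # largest one is closest to zero
    best_k, best_slope = None, -np.inf
    for k, (J_k, A_k) in enumerate(candidates):
        q = v_min @ J_k.T @ A_k @ J_k @ v_min
        slope = np.sqrt(abs(q))             # Eq. 26 (magnitude)
        if slope > best_slope:
            best_k, best_slope = k, slope
    return best_k, v_min
```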
Although embodiments described herein include one or more of the processes 60, 70, 80, 90 and 92, in some embodiments, these processes may be replaced by variations of these processes or by processes that are substantially different than the processes 60, 70, 80, 90 and 92.
In various embodiments, the process 60 may include the selection of regions in both images IA and IB, rather than only the selection of regions in image IB. In these embodiments, each region Ri can be selected in either image IA or IB, and is used to generate a similarity function Gi with regard to the other image, respectively IB or IA. Further embodiments can combine all of these similarity functions Gi into a "global" similarity function G(α′) that takes into account the matches between the images in both directions of the transformation.
Further, various embodiments of the process 70 can select regions at act 320 that include shapes that differ from the rectangular region described above. For example, one embodiment can select a shape that will remain in the defined domain DB.
Various embodiments of the invention can accomplish the acts described in processes 60, 70, 80 and 90 in a different order than those described previously. For example, one embodiment can calculate the estimated similarity functions G′k (e.g. the auto-correlation) for all candidates regions Rk before starting the loop of process 60. The estimated similarity functions G′k can be reused in all iterations of the process 60. Also, in one embodiment, the Jacobian quantities J used at acts 340 and 329 can be calculated for all candidate regions Rk before starting the loop of process 60 and be approximated as constant over the various iterations of the process 60.
As used herein, the term "similarity function" refers to any function that provides an indication of the similarity between images or one or more regions of images. Thus, various embodiments of the invention can use a definition of the similarity function G whose value is smaller for a good match between images or regions of images, for example, the sum of absolute differences between pixels. In these embodiments, the process described will look for minimum peaks of the similarity functions Gi(α′) and will find a minimum of the similarity function G(α′) (e.g., the minimum of the global similarity function).
Other variations of the processes may be employed in various embodiments of the invention. For example, the act 200 of determining an image IAonB(α) for the current value of the transformation vector α may be skipped in one embodiment, in particular, where the initial value of the transformation vector α is the identity transformation. One embodiment can also measure the similarity between regions Ri in image IB and regions directly in the image IA by first transforming the shape of region Ri to take into account the transformation as defined in the current value of the transformation vector α.
Various embodiments may be employed with all types of image data and image acquisition systems. For example, the images may originate in a black and white format, a color format, as 2-D images, as 3-D images, as photographs or as video image data. Embodiments may be employed with image data in an analog format or digital format. Embodiments may be employed with various types of medical imaging systems, machine vision systems, security systems, video production systems, cartography systems, geological imaging systems, astronomical imaging systems and others that provide image data.
As described herein, various embodiments may provide an algorithm or set of algorithms which may be employed in an image processing system, e.g., which may be executed to align (e.g., register) two or more images or portions of images. According to one embodiment, the various algorithms described herein are included in an image processing system as one or more image processing software modules. For example, one or more of acts 100, 200, 300, 400, 500, and 600 of the process 60 may be included in one or more software modules that process image data. In a version of this embodiment, the act 300 may be provided in a plurality of software modules, for example, two or more software modules corresponding to one or more of acts 310, 320, 330, 340, 345, 350 and 360 of the process 70.
Similarly, various acts included in the processes 80, 90 and 92 may also be provided as one or more image processing software modules. In a further embodiment, these software modules are employed in an image processing system that receives a first image (e.g., the image A) and a second image (e.g., the image B) and determines a transformation between the two images. That is, the software modules may operate to process the image data associated with a plurality of images and determine a transformation (based on one or more transformation parameters) that optimally aligns the images.
Accordingly, various embodiments of the invention may provide the software modules as part of a computer readable medium such as a portable storage medium (e.g., a disk) or a “fixed” medium (e.g., a hard drive). According to a further embodiment, the computer readable medium is accessible via the Internet. In a version of this embodiment, one or more of the software modules may be downloaded to a remote user via an Internet connection.
In other embodiments, the processes described herein may be implemented fully or partially in hardware.
Any of the above-described embodiments may be included in a computer system. The computer system may be, for example, a general-purpose computer such as those based on an Intel PENTIUM®-type processor, a Motorola PowerPC® processor, a Sun UltraSPARC® processor, a Hewlett-Packard PA-RISC® processor, or any other type of processor. Such a computer system generally includes a processor connected to one or more memory devices, such as a disk drive memory, a RAM memory, or another device for storing data. The memory is typically used for storing programs and data during operation of the computer system. Software, including programming code that implements embodiments of the present invention, is generally stored on a computer readable and/or writeable nonvolatile recording medium and then copied into memory, wherein it is then executed by the processor. Such programming code may be written in any of a plurality of programming languages, for example, Java, Visual Basic, C, C#, C++, Fortran, Pascal, Eiffel, Basic, COBOL, or any of a variety of combinations thereof.
Having thus described several aspects of at least one embodiment of this invention, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those skilled in the art. Such alterations, modifications, and improvements are intended to be part of this disclosure, and are intended to be within the spirit and scope of the invention.
Accordingly, the foregoing description and drawings are by way of example only.