1. Field of the Invention
The present invention relates to image alignment. More particularly, the present invention relates to aligning partial images produced by swipe-style biometric sensing devices.
2. Related Art
In the field of biometric image analysis, traditional techniques sample an image, such as a fingerprint, as the image is moved or swiped across a sensing mechanism. This sensing mechanism, which could be a fingerprint sensor, captures partial images of the finger during a single swipe. This single swipe produces sets of data at different times and within different coordinate systems. Computer vision technology can then be used to reconstruct an image of the entire fingerprint by sampling these sets of data and combining the partial images to form a complete image of the fingerprint.
The process of transforming these different sets of data into one coordinate system is known to those of skill in the art as image registration. Registration is necessary in order to be able to compare, or integrate, the data obtained from different measurements.
Conventional image registration techniques fall into two classes: (i) area-based and (ii) feature-based. The original image is often referred to as the reference image, and the image to be mapped onto the reference image is referred to as the target image. Area-based image registration methods look at the structure of the image via correlation metrics, Fourier properties, and other means of structural analysis.
Most feature-based methods, however, fine-tune their mapping to the correlation of image features. These features include, for example, lines, curves, points, line intersections, and boundaries. Feature-based methods correlate these features rather than looking at the overall structure of an image.
Both of these conventional image registration techniques, however, suffer shortcomings. For example, conventional techniques are susceptible to background noise, non-uniform illumination, or other imaging artifacts.
What is needed, therefore, is a robust image registration technique that can be used for biometric image analysis that reduces the effects of background noise, non-uniform illumination, and other imaging artifacts noted above in conventional approaches.
The present invention is directed to a method for analyzing image slices. The method includes transforming a first slice and a second slice to the frequency domain and determining shift data between the first slice and the second slice from only the phase component of the transformed first and second slices.
The present invention provides a unique approach for finding the relative shift, in the spatial domain, in the x and y directions between two partial images, particularly biometric images such as fingerprints. More specifically, the present invention provides a means to determine precise x and y coordinates, with a level of noise immunity, without the need to perform correlations. Precisely determining the extent of the x and y shifts between two successive partial images is fundamental to the accurate and seamless construction of an entire fingerprint from all of the partial images.
The techniques of the present invention virtually ignore background illumination problems. For example, if a background image associated with a fingerprint is gray or dark, this gray or dark background image, which could be mistakenly represented by ridges surrounding the fingerprint, is ignored. This process aids in a more precise determination of the x and y shifts.
Further embodiments, features, and advantages of the present invention, as well as the structure and operation of the various embodiments of the present invention are described in detail below with reference to accompanying drawings.
The accompanying drawings illustrate the present invention and, together with the description, further serve to explain the principles of the invention and to enable one skilled in the pertinent art to make and use the invention.
The present invention will now be described with reference to the accompanying drawings. In the drawings, like reference numbers generally indicate identical, functionally similar, and/or structurally similar elements. The drawing in which an element first appears is indicated by the leftmost digit(s) in the reference number.
This specification discloses one or more embodiments that incorporate the features of this invention. The embodiment(s) described, and references in the specification to “one embodiment”, “an embodiment”, “an example embodiment”, etc., indicate that the embodiment(s) described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Furthermore, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
By way of background, the estimation of spatial shift between two image slices is mathematically equivalent to estimating a time delay between acoustic or radar signals received at two or more transducer locations. The accurate estimation of time delay of arrival (TDOA) between received signals plays a dominant role in numerous engineering applications of signal processing. Various TDOA estimation procedures have been proposed and implemented over the years, including cross-correlation functions, unit impulse response calculations, smoothed coherence transforms, maximum likelihood estimates, as well as many others.
A general discrete-time model used for TDOA estimation can be stated as follows:
$$u_0(n) = x(n) + s_0(n) \qquad (1)$$
$$u_1(n) = x(n - D) + s_1(n) \qquad (2)$$
where u0(n) and u1(n) are the two signals at the observation points (i.e. sensors), x(n) is the signal of interest that is referenced (zero time-delay) according to the first sensor and will have a delay of D by the time it arrives at the second sensor, and s0(n) and s1(n) are noise components of the first and second sensors, respectively.
The goal of TDOA estimation is to estimate D given a segment of data obtained from each sensor, without any prior knowledge regarding the source signal x(n) or the noises. This problem has been extensively explored in the past, and depending on the application at hand, different approaches have been proposed.
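The most commonly used TDOA estimation technique is cross-correlation. In cross-correlation, an estimate $\hat{D}$ of the actual TDOA $D$ is obtained by
$$\hat{D} = \arg\max_{D} \sum_{n} u_0(n)\,u_1(n + D) \qquad (3)$$
Cross-correlation can be performed in the frequency domain, leading to the formula
$$\hat{D} = \arg\max_{D} \frac{1}{2\pi}\int_{-\pi}^{\pi} U_0^{*}(e^{j\omega})\,U_1(e^{j\omega})\,e^{j\omega D}\,d\omega \qquad (4)$$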
where $U_0(e^{j\omega})$ and $U_1(e^{j\omega})$ are the discrete-time Fourier transforms of the signals $u_0(n)$ and $u_1(n)$, respectively.
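By way of illustration only, the cross-correlation estimate described above can be sketched in Python with NumPy. The function name `estimate_delay_xcorr`, the synthetic signals, and the noise level are assumptions made for this example and are not part of the specification. The sketch evaluates the correlation through the FFT and picks the lag with the largest value.

```python
import numpy as np

def estimate_delay_xcorr(u0, u1):
    """Return the integer lag D that maximizes sum_n u0[n] * u1[n + D]."""
    n = len(u0)
    # Zero-pad to 2n so the circular correlation computed via the FFT
    # matches the linear correlation for every lag in (-n, n).
    size = 2 * n
    U0 = np.fft.fft(u0, size)
    U1 = np.fft.fft(u1, size)
    # For real u0, the inverse FFT of conj(U0) * U1 is r[D] = sum_n u0[n] * u1[n + D].
    r = np.fft.ifft(np.conj(U0) * U1).real
    lag = int(np.argmax(r))
    # Indices above n represent negative lags.
    return lag - size if lag > n else lag

# Hypothetical test signals following the model u0 = x + s0, u1 = x(n - D) + s1.
rng = np.random.default_rng(0)
x = rng.standard_normal(256)
true_delay = 7
u0 = x + 0.05 * rng.standard_normal(256)
u1 = np.roll(x, true_delay) + 0.05 * rng.standard_normal(256)
xcorr_estimate = estimate_delay_xcorr(u0, u1)
```

Because this estimate weights each frequency by the product of the two amplitude spectra, it is sensitive to how signal energy is distributed across frequencies.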
In 1972, for example, an ad hoc technique called the PHAse Transform for TDOA estimation in sonar systems was developed at the Naval Underwater Systems Center in New London, Connecticut. For more information on the PHAse Transform, please see “The Generalized Correlation Method for Estimation of Time Delay,” by Charles H. Knapp and G. Clifford Carter, IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-24, No. 4, August 1976, and “Theory and Design of Multirate Sensor Arrays,” by Omid S. Jahromi and Parham Aarabi, IEEE Transactions on Signal Processing, Vol. 53, No. 5, May 2005, which are both incorporated herein in their entireties. The PHAse Transform approach completely ignores the amplitude of the Fourier transforms and uses the following integral to estimate the time delay
$$\hat{D} = \arg\max_{D} \int_{-\pi}^{\pi} \cos\!\big(\omega D - \left(\angle U_0(e^{j\omega}) - \angle U_1(e^{j\omega})\right)\big)\,d\omega \qquad (5)$$
The PHAse Transform can be interpreted as a form of “line fitting” in the frequency domain. Assume, for example, that noise is negligible and that the time delay $D$ is much less than the length of the observed signals $u_0(n)$ and $u_1(n)$. In this case, it can safely be assumed that $u_1(n)$ is very close to a circularly shifted version of $u_0(n)$. This means $U_1(e^{j\omega}) \cong U_0(e^{j\omega})\,e^{-j\omega D}$ or, equivalently, $\angle U_0(e^{j\omega}) - \angle U_1(e^{j\omega}) \cong \omega D$.
The PHAse Transform integral (5) essentially tries to find a $D$ for which the discrepancy between the line $\omega D$ and the measured phase difference $\angle U_0(e^{j\omega}) - \angle U_1(e^{j\omega})$ is minimum. There is, however, an important difference between the PHAse Transform approach and traditional methods (e.g., line fitting methods that use least-mean-square error to calculate the best fit). The PHAse Transform uses a cosine function to calculate the error between the measured Fourier phase difference $\angle U_0(e^{j\omega}) - \angle U_1(e^{j\omega})$ and the perfect line $\omega D$. This approach has the advantage that the $\pm 2\pi$ phase ambiguities, which occur while calculating the angle of complex Fourier transform coefficients, are automatically eliminated.
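The cosine-based line fit described above can be sketched as follows. This is a minimal illustration, assuming integer candidate delays and a discrete sum over FFT bins in place of the integral. The function name `estimate_delay_phat`, the candidate range, and the test signals are hypothetical.

```python
import numpy as np

def estimate_delay_phat(u0, u1, candidates):
    """Pick the candidate D that maximizes sum_w cos(w * D - phase difference)."""
    U0 = np.fft.rfft(u0)
    U1 = np.fft.rfft(u1)
    # Angular frequencies of the rfft bins, from 0 to pi.
    omega = 2.0 * np.pi * np.arange(len(U0)) / len(u0)
    # Only the phases are used; the amplitudes are ignored entirely.
    phase_diff = np.angle(U0) - np.angle(U1)
    # The cosine is 2*pi-periodic, so wrapped phase values need no unwrapping.
    scores = [np.sum(np.cos(omega * d - phase_diff)) for d in candidates]
    return int(candidates[int(np.argmax(scores))])

# Hypothetical signals: the second sensor sees a delayed copy with a different gain.
rng = np.random.default_rng(0)
x = rng.standard_normal(256)
true_delay = 5
u0 = x + 0.05 * rng.standard_normal(256)
u1 = 3.0 * np.roll(x, true_delay) + 0.05 * rng.standard_normal(256)
phat_estimate = estimate_delay_phat(u0, u1, np.arange(-20, 21))
```

The gain of 3.0 on the second signal changes only the Fourier amplitude, so it has no effect on the estimate in this sketch.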
The use of Fourier transform phase for determining the time delay of arrival, for example, is well known to those of skill in the art. Recently, the PHAse Transform has been generalized to multi-rate signals by the inventor of the present application. For more information, please see “Theory and Design of Multirate Sensor Arrays,” by Omid S. Jahromi and Parham Aarabi, IEEE Transactions On Signal Processing, Vol. 53, No. 5, May 2005.
From a theoretical point of view, it is straightforward to generalize the one-dimensional PHAse Transform described above to estimate the spatial shift between two overlapping images. However, there are practical issues with this approach that must be addressed. These issues are addressed in the present invention, which applies the PHAse Transform technique to biometric image analysis. More specifically, the present invention uses the PHAse Transform technique to align overlapping fingerprint image slices and combine those overlapping image slices to form a complete seamless fingerprint image.
To apply the PHAse Transform technique to biometric image analysis, one can first multiply each of the partial images 200 with a carefully designed windowing function to smooth out the edges of the partial images.
After being smoothed, the partial images are then embedded within a larger image for expansion. That is, each of the partial images is zero-padded so that its area is extended to almost twice its original size.
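The smoothing and expansion steps can be sketched as shown below. This is an illustration under stated assumptions: a separable Tukey taper is used as the windowing function, the slice is placed in the top-left corner of the blank image, and the padded size (double the number of rows) is chosen only for the example. The helper names `tukey_window` and `window_and_pad` are hypothetical.

```python
import numpy as np

def tukey_window(n, alpha=0.5):
    """Tukey (tapered cosine) window of length n, with 0 < alpha <= 1."""
    if n == 1:
        return np.ones(1)
    t = np.arange(n) / (n - 1)          # normalized position in [0, 1]
    edge = alpha / 2.0                  # width of each cosine taper
    w = np.ones(n)
    left = t < edge
    right = t > 1.0 - edge
    w[left] = 0.5 * (1.0 - np.cos(np.pi * t[left] / edge))
    w[right] = 0.5 * (1.0 - np.cos(np.pi * (1.0 - t[right]) / edge))
    return w

def window_and_pad(slice_img, padded_shape, alpha=0.5):
    """Taper the slice edges, then embed the slice in a larger blank image."""
    rows, cols = slice_img.shape
    window = np.outer(tukey_window(rows, alpha), tukey_window(cols, alpha))
    extended = np.zeros(padded_shape, dtype=float)
    # Assumption: the smoothed slice occupies the top-left corner.
    extended[:rows, :cols] = slice_img * window
    return extended

# Hypothetical 8 x 64 slice, extended to 16 x 64 (twice the original area).
rng = np.random.default_rng(0)
slice_img = rng.random((8, 64))
extended = window_and_pad(slice_img, padded_shape=(16, 64))
```

Tapering before padding avoids an abrupt step at the slice boundary, which would otherwise introduce spurious high-frequency content into the transform.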
In the present invention, while the phase image 410, associated with the slice 402, is important, the amplitude image 408 is not used and is therefore discarded. As can be observed in
It is important to note that the process discussed above with reference to the extended image slice 402, is repeated for the extended image slice 404. That is, phase and amplitude components associated with the extended image slice 404 are derived via application of an FFT, with the resulting amplitude component being discarded. The phase component of the extended image slice 404 (not shown) will also include wave-like patterns.
More specifically, a frequency of the wave-like patterns 412 from the phase image 410 and a corresponding phase image associated with the extended slice 404, represents a shift in the y (vertical) direction between these two successive images (i.e., the extended image slices 402 and 404). A tilt in the waves (with respect to a perfectly horizontal wavefront) represents a shift in the x (horizontal) direction between the image slices 402 and 404.
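The relationship between the wave-like phase patterns and the shift can be checked numerically. The sketch below assumes an ideal circular shift between two synthetic slices and compares the difference of their Fourier phases with the plane wave that the shift predicts. All names and values are chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
rows, cols = 16, 32
dx, dy = 2, 5                      # hypothetical horizontal and vertical shifts

slice0 = rng.standard_normal((rows, cols))
# slice1[r, c] = slice0[r - dy, c - dx], i.e. a circularly shifted copy.
slice1 = np.roll(slice0, shift=(dy, dx), axis=(0, 1))

phase0 = np.angle(np.fft.fft2(slice0))
phase1 = np.angle(np.fft.fft2(slice1))

# Angular frequency grids: w2 runs along the vertical axis, w1 along the horizontal.
w2 = 2.0 * np.pi * np.fft.fftfreq(rows)[:, None]
w1 = 2.0 * np.pi * np.fft.fftfreq(cols)[None, :]
predicted = w1 * dx + w2 * dy      # plane wave in the frequency domain

# Compare modulo 2*pi by mapping both phases onto the unit circle.
mismatch = np.abs(np.exp(1j * (phase0 - phase1)) - np.exp(1j * predicted)).max()
```

In this sketch, a larger `dy` produces more fringes along the vertical frequency axis, and a nonzero `dx` tilts those fringes, which is consistent with the description above.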
The exact values of the shifts in x and y directions between the extended image slices 402 and 404 can be determined by applying the PHAse Transform to their respective phase components. For purposes of illustration, the PHAse Transform can be expressed in the following exemplary manner:
In the expression (6) above, $\check{D}_1$ precisely represents the shift in the x direction between the successive extended image slices 402 and 404. $\check{D}_2$ precisely represents the shift in the y direction between these successive image slices. The present invention is not limited, however, to the particular expression (6) in that the PHAse Transform can be determined through numerous other methods. This process is then repeated, as described below, for all of the successive image slice pairs within the overlapping partial images 200. Precisely determining the shifts in the x and y directions is fundamental to an accurate and seamless construction of a complete fingerprint from partial images.
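One possible way to evaluate a two-dimensional PHAse Transform is sketched below. It assumes integer shifts and evaluates the criterion, the sum over all frequencies of cos(ω1·D1 + ω2·D2 − phase difference), for every candidate shift at once through an inverse FFT of the unit-magnitude phase term. This evaluation strategy, the function name `phat_shift_2d`, and the synthetic slices are assumptions made for illustration and are not asserted to be the expression used in the specification.

```python
import numpy as np

def phat_shift_2d(slice0, slice1):
    """Estimate (d1, d2) = (x shift, y shift) using only Fourier phase."""
    phase0 = np.angle(np.fft.fft2(slice0))
    phase1 = np.angle(np.fft.fft2(slice1))
    # Unit-magnitude term exp(-j * phase difference); amplitudes are discarded.
    phasor = np.exp(-1j * (phase0 - phase1))
    # The real part at index (d2, d1) is the mean over all frequencies of
    # cos(w1 * d1 + w2 * d2 - (phase0 - phase1)).
    score = np.fft.ifft2(phasor).real
    d2, d1 = np.unravel_index(np.argmax(score), score.shape)
    rows, cols = score.shape
    # Indices past the midpoint represent negative shifts.
    if d2 > rows // 2:
        d2 -= rows
    if d1 > cols // 2:
        d1 -= cols
    return int(d1), int(d2)

# Hypothetical slices: the second is shifted, brighter, and on a gray background.
rng = np.random.default_rng(2)
slice0 = rng.standard_normal((32, 64))
dx, dy = -2, 3
slice1 = 2.0 * np.roll(slice0, shift=(dy, dx), axis=(0, 1)) + 50.0
slice1 = slice1 + 0.05 * rng.standard_normal(slice1.shape)
shift = phat_shift_2d(slice0, slice1)
```

In this example the uniform background offset of 50.0 affects only the zero-frequency bin, and the gain of 2.0 affects only the amplitude, so neither moves the peak.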
A windowing function, such as the Tukey window 300, is applied to each of the images 502 and 504 to provide the smoothing aspect noted above. After application of an appropriate windowing function, the resulting smoothed slices are embedded into a larger blank image for expansion. This expanding process produces the extended image slices 402 and 404. The extended image slices 402 and 404 are then transformed to the frequency domain by applying an FFT, which inherently produces complex-valued results.
That is, in the frequency domain, each of the extended image slices 402 and 404 has a corresponding amplitude component and phase component. For example, the extended image slice 402 produces phase and amplitude components 410 and 408, respectively. Similarly, the extended image slice 404 produces phase and amplitude components 506 and 508, respectively. In accordance with the present invention, the amplitude components 408 and 508 are discarded.
Next, a PHAse Transform is applied to the phase components 410 and 506 to determine a shift in the horizontal and vertical directions between the successive extended image slices 402 and 404. In the example of
Aspects of the present invention can be implemented in software, hardware, or as a combination thereof. These aspects of the present invention may be implemented in the environment of a computer system or other processing system. An example of such a computer system 600 is shown in
In
The computer system 600 also includes a main memory 608, preferably random access memory (RAM), and may also include a secondary memory 610. The secondary memory 610 may include, for example, a hard disk drive 612 and/or a removable storage drive 614, representing a floppy disk drive, a magnetic tape drive, an optical disk drive, etc. The removable storage drive 614 reads from and/or writes to a removable storage unit 618 in a well-known manner. The removable storage unit 618 represents a floppy disk, magnetic tape, optical disk, etc., which is read by and written to by the removable storage drive 614. As will be appreciated, the removable storage unit 618 includes a computer usable storage medium having stored therein computer software and/or data.
In alternative implementations, the secondary memory 610 may include other similar means for allowing computer programs or other instructions to be loaded into the computer system 600. Such means may include, for example, a removable storage unit 622 and an interface 620. Examples of such means may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM or PROM) and associated socket, and other removable storage units 622 and interfaces 620 which allow software and data to be transferred from the removable storage unit 622 to the computer system 600.
The computer system 600 may also include a communications interface 624. The communications interface 624 allows software and data to be transferred between the computer system 600 and external devices. Examples of the communications interface 624 may include a modem, a network interface (such as an Ethernet card), a communications port, a PCMCIA slot and card, etc. Software and data transferred via the communications interface 624 are in the form of signals 628 which may be electronic, electromagnetic, optical or other signals capable of being received by the communications interface 624. These signals 628 are provided to the communications interface 624 via a communications path 626. The communications path 626 carries the signals 628 and may be implemented using wire or cable, fiber optics, a phone line, a cellular phone link, an RF link and other communications channels.
In the present application, the terms “computer readable medium” and “computer usable medium” are used to generally refer to media such as the removable storage drive 614, a hard disk installed in the hard disk drive 612, and the signals 628. These computer program products are means for providing software to the computer system 600.
Computer programs (also called computer control logic) are stored in the main memory 608 and/or the secondary memory 610. Computer programs may also be received via the communications interface 624. Such computer programs, when executed, enable the computer system 600 to implement the present invention as discussed herein.
In particular, the computer programs, when executed, enable the processor 604 to implement the processes of the present invention. Accordingly, such computer programs represent controllers of the computer system 600. By way of example, in the embodiments of the invention, the processes/methods performed by signal processing blocks of encoders and/or decoders can be performed by computer control logic. Where the invention is implemented using software, the software may be stored in a computer program product and loaded into the computer system 600 using the removable storage drive 614, the hard drive 612 or the communications interface 624.
Example embodiments of the methods, systems, and components of the present invention have been described herein. As noted elsewhere, these example embodiments have been described for illustrative purposes only, and are not limiting. Other embodiments are possible and are covered by the invention. Such other embodiments will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Thus, the breadth and scope of the present invention should not be limited by any of the above described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.
The foregoing description of the specific embodiments will so fully reveal the general nature of the invention that others can, by applying knowledge within the skill of the art, readily modify and/or adapt for various applications such specific embodiments, without undue experimentation, without departing from the general concept of the present invention. Therefore, such adaptations and modifications are intended to be within the meaning and range of equivalents of the disclosed embodiments, based on the teaching and guidance presented herein. It is to be understood that the phraseology or terminology herein is for the purpose of description and not of limitation, such that the terminology or phraseology of the present specification is to be interpreted by the skilled artisan in light of the teachings and guidance.