The system and method described below relate to the identification of a person or an animal with reference to external physical characteristics of the person or animal, and, more specifically, with reference to externally observable physical characteristics of one or more eyes of the person or animal.
Systems for identifying persons through intrinsic human traits have been developed. These systems operate by taking images of a physiological trait of a person and comparing information stored in the image to image data corresponding to the imaged trait for a particular person. When the information stored in the image has a high degree of correlation to the relevant data previously obtained for a particular person's trait, positive identification of the person may be obtained. These biometric systems obtain and compare data for physical features, such as fingerprints, voice, facial characteristics, iris patterns, hand geometry, retina patterns, and hand/palm vein structure. Different traits impose different constraints on the identification processes of these systems. For example, fingerprint recognition systems require the person being identified to make direct contact with a sensing surface from which the fingerprint data are obtained. Similarly, retina pattern identification systems require the person to allow an imaging system to scan the retina within the eye and capture an image of the identifying pattern. Facial feature recognition systems, however, do not require direct contact with a person, and these biometric systems are capable of capturing identification data without the cooperation of the person to be identified.
One trait especially suited for identification is the sclera pattern in a person's eye. The human sclera provides a unique trait that changes little over a person's lifetime. It also provides multi-layer information that can be used for liveness testing. It is therefore important to design a method that segments and matches the sclera pattern accurately and robustly.
A method has been developed that obtains an identification characteristic from the sclera of a subject's eye. The method includes acquiring an image of an eye of a subject, segmenting the eye image into different regions, extracting features in a sclera region segmented from the eye image, and generating data identifying at least one feature extracted from the sclera region of the eye image.
A system that implements the method also obtains an identification characteristic from the sclera of a subject's eye. The system includes a digital camera configured to acquire an image of an eye of a subject, a digital image processor configured to segment the eye image into different regions, to extract features in a sclera region segmented from the eye image, and to generate data identifying at least one feature extracted from the sclera region of the eye image, and a database for storage of identifying data.
The method and system discussed below use patterns in the sclera of an eye, especially a human eye, for identification. Therefore, the present invention provides an identification technique based on the recognition of the unique features of the sclera, referred to herein as “Sclera Recognition”. In general, the method of identification includes illuminating an eye, obtaining an image of the eye (sclera, iris, and pupil), segmenting the image, extracting features from the sclera region, registering those extracted features, and generating a template. This template may be stored and compared to templates obtained from eye images of other subjects to identify the subsequent subject as being the same subject from which the stored template was obtained.
An illustration of a human eye is shown in the accompanying figure.
A method for identifying a person from an image of a person's eye is shown in the accompanying flowchart.
The process begins with acquisition of an image of the subject's eye; the acquired image is downsampled for efficient processing and then converted between color spaces as described below.
The binary data representing pixels of the eye image are converted from an initial red, green, blue (RGB) color space to an intermediate luma, red-difference, blue-difference (YCrCb) color space, and then into a hue, saturation, value (HSV) color space (block 312) using the transformations listed below.
In Equation 1, the red (R), green (G), and blue (B) values of each pixel are multiplied by the weighting matrix of Equation 1. The numeric RGB values in Equation 1 lie in the range 0 to 255, although larger or smaller ranges with correspondingly modified matrix coefficients are possible for alternative binary image formats. The resulting values are then adjusted by adding 128 to each of the Cr and Cb values, yielding the YCrCb values. The YCrCb values are then transformed into HSV values using the transformations of Equation 2. These transformations are repeated for each pixel in the eye image, producing an image in the HSV color space. Both the original RGB values and the transformed HSV values are used in different steps of the segmentation process described below.
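Equations 1 and 2 are not reproduced in the text above, so the following Python sketch substitutes commonly used transforms: the ITU-R BT.601 luma/chroma weighting matrix with the 128 offset described above, and the standard RGB-to-HSV conversion (via matplotlib) in place of the YCrCb-to-HSV transform of Equation 2. Both substitutions are assumptions for illustration only.

```python
import numpy as np
from matplotlib.colors import rgb_to_hsv

def rgb_to_ycrcb(rgb):
    """Convert an H x W x 3 uint8 RGB image to YCrCb. The weighting matrix
    of Equation 1 is not reproduced in the text; the ITU-R BT.601
    coefficients below are a common choice used here only as an assumption."""
    m = np.array([[ 0.299,  0.587,  0.114],   # Y  (luma)
                  [ 0.500, -0.419, -0.081],   # Cr (red-difference)
                  [-0.169, -0.331,  0.500]])  # Cb (blue-difference)
    ycrcb = rgb.astype(np.float64) @ m.T
    ycrcb[..., 1:] += 128.0                   # add 128 to Cr and Cb, as described
    return ycrcb

def to_hsv(rgb):
    """HSV conversion. The text derives HSV from YCrCb via Equation 2, which
    is not reproduced; this stand-in uses the standard RGB-to-HSV transform."""
    return rgb_to_hsv(rgb.astype(np.float64) / 255.0)
```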
The process continues by estimating the sclera region with one of two color distribution models (CDMs), selected according to the illumination source.
Equation 3 describes an example CDM1 for a photograph taken using a natural source of illumination, such as sunlight. Equation 4 is an example CDM2 for a photograph taken using a flash illuminator. Each CDM is described in the red, green, blue (RGB) color space, with typical values of each RGB component ranging from 0 to 255. The CDM is applied to each RGB pixel, yielding a 0 or 1 depending upon the pixel's relation to the color threshold values. Depending upon the illumination source, either CDM1 or CDM2 is used to produce a binary sclera map S1 at each pixel position x, y of the original image, as shown in Equation 5. The two-dimensional binary sclera map contains a value of 1 for a pixel determined to be in the sclera, or 0 for a pixel determined to be outside the sclera.
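The threshold values of Equations 3 and 4 are likewise not reproduced above, so the sketch below uses illustrative conditions that mark bright, low-saturation (near-white) pixels as sclera; the specific numeric thresholds are assumptions, not the patent's CDM values.

```python
import numpy as np

def sclera_map_cdm(rgb, flash=False):
    """Apply a color distribution model (CDM) per pixel to produce a binary
    sclera map S1(x, y), as in Equation 5. The actual CDM1/CDM2 conditions
    of Equations 3 and 4 are not reproduced in the text; the thresholds
    below are illustrative assumptions only."""
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    if flash:   # stand-in for CDM2 (flash illumination)
        s = ((r > 140) & (g > 140) & (b > 140)
             & (np.abs(r - g) < 30) & (np.abs(g - b) < 30))
    else:       # stand-in for CDM1 (natural illumination)
        s = ((r > 100) & (g > 100) & (b > 100)
             & (np.abs(r - g) < 40) & (np.abs(g - b) < 40))
    return s.astype(np.uint8)
```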
Another method of producing a sclera map is rooted in the observation that the sclera is also known as the "white" portion of the eye. Using the HSV image values, each pixel in the image is assigned a 0 or 1 value according to the threshold equation below.
The threshold values thh, ths, and thv in Equation 6 are determined heuristically so that the hue threshold lies at approximately the lower third of the hue distribution, the saturation threshold at approximately the lower two-fifths of the saturation distribution, and the brightness threshold at approximately the upper two-thirds of the brightness distribution. These heuristic thresholds are calculated from the histogram distributions of each image using the equations listed below.
$$th_h = \arg\min_t \left| \sum_{x=1}^{t} p_h(x) - T_h \right|, \quad \text{(Equation 7)}$$

$$th_s = \arg\min_t \left| \sum_{x=1}^{t} p_s(x) - T_s \right|, \quad \text{(Equation 8)}$$

$$th_v = \arg\min_t \left| \sum_{x=1}^{t} p_v(x) - T_v \right| \quad \text{(Equation 9)}$$
In Equations 7-9, ph(x) is the normalized histogram of the hue image, ps(x) is the normalized histogram of the saturation image, and pv(x) is the normalized histogram of the value image. The fixed threshold values Th, Ts, and Tv are chosen to be ⅓, ⅖, and ⅔, respectively, matching the preferred thresholds discussed above. The result S2(x,y) is the binary sclera map produced with the HSV method.
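A minimal sketch of the HSV thresholding of Equations 6-9 follows; the inequality directions in Equation 6 are inferred from the lower/upper fractions described above, and the histogram binning is an illustrative choice.

```python
import numpy as np

def histogram_threshold(channel, T):
    """Equations 7-9: choose the bin t whose cumulative normalized
    histogram is closest to the fixed fraction T."""
    hist, edges = np.histogram(channel, bins=256, range=(0.0, 1.0))
    p = hist / hist.sum()                  # normalized histogram
    cum = np.cumsum(p)
    t = np.argmin(np.abs(cum - T))
    return edges[t + 1]                    # threshold value for this channel

def sclera_map_hsv(hsv):
    """Equation 6 (inequality directions inferred from the text): sclera
    pixels have low hue, low saturation, and high brightness value."""
    th_h = histogram_threshold(hsv[..., 0], 1 / 3)
    th_s = histogram_threshold(hsv[..., 1], 2 / 5)
    th_v = histogram_threshold(hsv[..., 2], 2 / 3)
    s2 = ((hsv[..., 0] <= th_h) & (hsv[..., 1] <= th_s)
          & (hsv[..., 2] >= th_v))
    return s2.astype(np.uint8)
```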
Morphological operations are applied to each binary sclera map S1 and S2 in order to eliminate stray pixels that do not match the surrounding pixels. The preferred result of the morphological operations is two contiguous regions in each binary sclera map, corresponding to the portions of the sclera 22 on the left and right sides of the iris 18, as depicted in the eye illustration.
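The text does not name the specific morphological operations, so the sketch below assumes a binary opening followed by a closing with a 3 x 3 structuring element, a common choice for removing stray pixels while preserving contiguous regions.

```python
import numpy as np
from scipy import ndimage

def clean_sclera_map(S):
    """Remove stray pixels from a binary sclera map. Opening followed by
    closing is one plausible choice; the text does not specify which
    morphological operations are used."""
    S = ndimage.binary_opening(S, structure=np.ones((3, 3)))
    S = ndimage.binary_closing(S, structure=np.ones((3, 3)))
    return S.astype(np.uint8)
```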
Continuing with the segmentation process, a convex hull operation is applied to each binary sclera map to smooth the boundaries of the candidate sclera regions.
The complete estimated sclera region is formed by selectively combining the binary maps S1 and S2 after the convex hull operation has been completed (block 324). The regions selected for the combined binary sclera map are chosen based on the homogeneity of the pixel colors in the original image that correspond to the sclera regions of each binary sclera map. The sclera map region corresponding to the most homogeneous color distribution in the original downsampled image is chosen to represent the sclera area. The homogeneity is determined using the equation below.
The preferred region r is selected from one of the two binary sclera maps based on the minimum standard deviation between the intensity I(x, y) of the individual pixels in the region and the mean intensity mi of all pixels in the region. The lower standard deviation from the mean indicates a more homogeneous region, and that region is chosen to represent the estimated sclera area.
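A short sketch of this homogeneity test, written from the verbal description above (the equation itself is not reproduced in the text), might look like the following.

```python
import numpy as np

def select_homogeneous_region(image_gray, mask1, mask2):
    """Pick whichever candidate sclera map covers the more homogeneous
    intensity region, measured as the standard deviation of pixel
    intensity about the region mean, per the description above."""
    def region_std(mask):
        vals = image_gray[mask.astype(bool)]
        return vals.std() if vals.size else np.inf
    return mask1 if region_std(mask1) <= region_std(mask2) else mask2
```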
An example of two binary sclera maps and a combined estimated sclera area map 400 is depicted in the accompanying figure.
Referring again to the segmentation process, the estimated sclera region is refined to remove any remaining misclassified pixels.
The refined estimated binary sclera region forms an image mask pattern (block 336). This mask is then upsampled back to the size of the original image (block 340), as shown in the accompanying example.
Returning to the identification process, vein features are extracted from the segmented sclera region using one or more Gabor filters, defined below.
G(x, y, θ, s) represents one or more two-dimensional Gabor filters oriented in different directions. In the Gabor filter of Equation 11, x and y represent the center frequency of the filter, θ is the angle of sinusoidal modulation, and s is the variance of a Gaussian function. I(x, y) is the pixel intensity at each point in the image, which is convolved with the Gabor filter. The convolution yields the filtered Gabor image IF(x, y, θ, s) of Equation 12 at a given orientation θ and Gaussian variance s. By varying θ, the Gabor filters may be oriented in multiple directions, producing a unique filtered image IF(x, y, θ, s) for each value of θ.
Two example Gabor filter sets are depicted in the accompanying figures.
The multiple Gabor filtered images are fused into a vein-boosted image F(x, y) using the following equation.
$$F(x,y) = \sqrt{\sum_{\theta \in \Theta} \sum_{s \in S} \bigl( I_F(x,y,\theta,s) \bigr)^2} \quad \text{(Equation 13)}$$
An example of a portion of the sclera image without Gabor filtering is depicted in the accompanying figure.
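Equation 11 is not reproduced above, so the sketch below uses a standard real Gabor kernel (a Gaussian envelope modulated by an oriented sinusoid) and fuses the multi-orientation, multi-scale responses per Equation 13; the orientation set, scales, and spatial frequency are illustrative assumptions.

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(theta, sigma, frequency=0.1, size=21):
    """A standard real Gabor kernel: Gaussian envelope (variance sigma**2)
    modulated by a sinusoid at angle theta. The exact form of Equation 11
    is not reproduced in the text, so this common formulation is assumed."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return envelope * np.cos(2.0 * np.pi * frequency * rot)

def fuse_gabor(image, thetas=(0, np.pi/4, np.pi/2, 3*np.pi/4),
               sigmas=(2.0, 4.0)):
    """Equations 12-13: convolve the sclera image with Gabor filters at
    several orientations and scales, then fuse the responses as the root
    of the sum of squares to boost the vein patterns."""
    acc = np.zeros_like(image, dtype=np.float64)
    for theta in thetas:
        for sigma in sigmas:
            response = convolve2d(image, gabor_kernel(theta, sigma),
                                  mode='same')
            acc += response**2
    return np.sqrt(acc)
```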
The Gabor filtered image is converted to a binary image using an adaptive threshold, which is determined based on the distribution of filtered pixel values via the following equations.
The binary enhanced image B(x, y) has a value of 1 where the fused Gabor-filtered image F(x, y) exceeds the threshold thb, and is 0 otherwise. The threshold thb is calculated using pedge, the normalized histogram of the non-zero elements of F(x, y). Tb is selected to be ⅓ in the example embodiment because the zero elements of the filtered image may outnumber the non-zero elements, and the vascular patterns often have a higher magnitude than the background. An example binary enhanced image of the segmented sclera region is depicted in the accompanying figure.
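A sketch of this adaptive binarization follows; the threshold selection mirrors the histogram rule of Equations 7-9, since the exact threshold equations are not reproduced above.

```python
import numpy as np

def binarize_fused(F, Tb=1 / 3):
    """Adaptive binarization described above: th_b is taken from p_edge,
    the normalized histogram of the non-zero fused values, at the bin
    where the cumulative histogram reaches Tb (by analogy with Equations
    7-9; the exact threshold equation is not reproduced in the text)."""
    nz = F[F > 0]
    hist, edges = np.histogram(nz, bins=256)
    cum = np.cumsum(hist / hist.sum())
    th_b = edges[np.argmin(np.abs(cum - Tb)) + 1]
    return (F > th_b).astype(np.uint8)
```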
Referring again to the example identification process, the extracted vein features are encoded using a line segment descriptor technique.
The line segment descriptor technique continues with a line parsing sequence. The line parsing sequence converts the curved lines representing sclera vein features into a series of linear elements that may be stored in a computer database. The original vein features are approximated by line segments, and the line segments are recursively split into smaller segments until each segment is substantially linear.
An example of a line segment in the parsing sequence is depicted in the accompanying figure.
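The recursive splitting described above can be sketched as follows; the traced-point input format and the linearity tolerance are illustrative assumptions.

```python
import numpy as np

def parse_line(points, tol=2.0):
    """Recursive line parsing as described above: approximate a traced vein
    (an ordered sequence of (x, y) points) by the segment between its
    endpoints, then split at the point of maximum deviation until every
    piece is substantially linear. The tolerance is an illustrative value,
    not one taken from the text."""
    pts = np.asarray(points, dtype=float)
    p0, p1 = pts[0], pts[-1]
    dx, dy = p1 - p0
    norm = np.hypot(dx, dy) or 1.0
    # Perpendicular distance of each point from the chord p0 -> p1.
    dists = np.abs(dx * (pts[:, 1] - p0[1])
                   - dy * (pts[:, 0] - p0[0])) / norm
    k = int(np.argmax(dists))
    if dists[k] <= tol or len(pts) <= 2:
        return [(tuple(p0), tuple(p1))]       # this piece is nearly linear
    return parse_line(pts[:k + 1], tol) + parse_line(pts[k:], tol)
```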
During eye image acquisition, the eye may move or the camera may be positioned at an angle with respect to the eye being imaged. These factors affect the locations of the sclera patterns within the image. To remove these effects, sclera template registration is performed (block 220). The locations of the limbic boundary, the pupil center, the limbic center, and the medial and/or lateral corners of the eye may be used to define the locations of features/patterns and are further used for registration of those features/patterns. Sclera region of interest (ROI) selection achieves global translation, rotation, and scaling invariance. In addition, because complex deformation can occur in the vein patterns, the registration scheme accounts for potential changes in the vein patterns while maintaining recognition accuracy with acceptable false-accept rates.
A technique for registering the sclera template features is based on random sample consensus (RANSAC), an iterative model-fitting method. The RANSAC method registers the coordinates of the center points 618 of each linear element 628 used to model the sclera veins. The center point coordinates and angles, but not the lengths, of the linear descriptors are registered in order to prevent false accepts due to over-fitting of the registered features.
When an eye has been registered via the registration process in block 220, an identification system may use the registered template to perform matching between the stored template and an image of a candidate eye. This process commonly occurs in biometric authentication systems in which a user first enrolls with the identification system and then authenticates with the system at a later time. The registration minimizes the distance between the recorded test template and the target templates acquired later for matching with the test template. Because different parameters are used for registration than for matching, the preferred registration and the preferred matching differ for templates that should not match, which reduces artificially introduced false accepts. The registration process randomly chooses two points, one Sxi from the test template and one Syj from the target template. The registration process also randomly chooses a scaling factor and a rotation value based on a priori knowledge of the template database. Using these values, a fitness value for the registration under these parameters is calculated for each segment in the image using the segment's polar coordinates r and θ and the line segment's angle φ. The test template parameter Sxi and target template parameter Syj are defined below.
An offset vector is calculated using the shift offset and the randomly determined scale and angular offset values s0 and θ0. The polar coordinates are transformed into Cartesian coordinates and combined with the scale s0 and angular offset θ0 to form the offset vector φ0 in Equations 17-19.
The fitness of two descriptors is the minimal summed pairwise distance between the two descriptors Sx and Sy given the offset vector φ0, where f(Sxi, φ0) is the function that applies the registration offset vector to a sclera line descriptor.
The minimum pairwise distance between the selected test template line segment Sxi and the entire set of target template line segments Sy is calculated using the equation listed below. The line segment in the target template closest to segment Sxi is assumed to be the nearest matching sclera vein segment. This calculation allows for matching when the sclera veins in the target image have moved after the test template was recorded.
$$\mathrm{minDist}(S_{x_i}, S_y) = \arg\min_j \left\{ d(S_{x_i}, S_{y_j}) \right\} \quad \text{(Equation 23)}$$
The distance between two points is calculated using the equation listed below.
$$d(S_{x_i}, S_{y_j}) = \sqrt{(x_o)^2 + (y_o)^2} \quad \text{(Equation 24)}$$
Here, Test is the set of descriptors in the stored test template, Target is the set of descriptors in the newly acquired target template, Sxi is the first descriptor used for registration, Syj is the second descriptor, φ0 is the set of offset parameter values, f(Sxi, φ0) is a function that modifies the descriptor with the given offset values, s0 is the scaling factor, and θ0 is the rotation value, both of which are determined by the sclera image resolution and the system application. The process performs a number of iterations, recording the values of φ0 that minimize D(Sx, Sy).
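A hedged sketch of this RANSAC-style registration loop follows. The descriptor layout (r, θ, φ), the scale and rotation sampling ranges, and the iteration count are assumptions chosen for illustration; the cost follows the summed minimum pairwise distance of Equations 20-24.

```python
import random
import numpy as np

def register_ransac(test, target, iters=500):
    """Sketch of the RANSAC-style registration described above. Each
    descriptor is a (r, theta, phi) tuple: polar center coordinates plus
    segment angle. Random pairings and random scale/rotation offsets are
    scored by the summed nearest-neighbor distance; parameter ranges are
    illustrative assumptions, not values from the text."""
    def to_xy(d, scale=1.0, rot=0.0):
        r, th, _phi = d
        return np.array([scale * r * np.cos(th + rot),
                         scale * r * np.sin(th + rot)])

    target_xy = [to_xy(d) for d in target]
    best_offset, best_cost = None, np.inf
    for _ in range(iters):
        sx = random.choice(test)               # Sxi from the test template
        sy = random.choice(target)             # Syj from the target template
        s0 = random.uniform(0.9, 1.1)          # assumed scale range
        th0 = random.uniform(-0.1, 0.1)        # assumed rotation range (rad)
        shift = to_xy(sy) - to_xy(sx, s0, th0) # offset aligning the pair
        # Summed minimum pairwise distance under this candidate offset.
        cost = sum(min(np.hypot(*(t - (to_xy(d, s0, th0) + shift)))
                       for t in target_xy)
                   for d in test)
        if cost < best_cost:
            best_offset, best_cost = (shift, s0, th0), cost
    return best_offset, best_cost
```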
A template for future comparisons is generated from the extracted features (block 220). The template may be, for example, a bit map of the sclera region that identifies the features, a list of features with positional information about each feature, or a set of descriptors for the extracted features. To generate the template, the locations of the registered extrema points or a set of descriptors for the features are saved. Descriptors refer to the model parameters for the detected feature edges obtained through known curve fitting techniques, wavelets, neural networks, filtering methods, and other pattern recognition methods. The parameters of the fitted curves are saved as descriptors for the extrema points. The template may be represented in a binary, integer, or floating point number format. The template may now be used to identify another image of an eye as corresponding or not corresponding to the template. The template generation process described above is merely illustrative of an appropriate process for modeling an image of the eye. Alternative processes include using a set of area, line, and/or point descriptors; a set of wavelet coefficients, magnitudes, phases, or a combination thereof; and a set of vectors and/or matrices.
An identification or matching process (block 224) may then be performed, as described below.
The matching process compares the line segments Si in the stored template with line segments Sj stored in the template generated from the second eye image. The matching process produces a match score m(Si, Sj) for each line segment in the stored template using the equation below.
In Equation 25, d(Si, Sj) is the Euclidean distance between the center points of Si and Sj, calculated as in Equation 24. Dmatch is a predetermined threshold value for the distance between line segments considered to match, and φmatch represents the maximum difference between the angles of line segments considered to match. In the example embodiment, both threshold values are predetermined for the application.
The matching scores m(Si, Sj) for individual line segments are summed to produce an overall matching score M using the equation below.
In the equation above, Matches is the set of all line segments that matched with non-zero values, Test is the stored template, and Target is the new template tested against it to determine whether the sclera images match. If the sum M exceeds a predetermined threshold, the new image is considered to match the stored template; otherwise, the two images are not considered to match. In an embodiment of a biometric identification system using this process, the threshold is selected to suit the security requirements of the application.
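A sketch of the matching of Equations 25-26 follows. The descriptor layout and the use of segment length as the per-pair match weight are assumptions, since the text does not reproduce the exact form of m(Si, Sj).

```python
import numpy as np

def match_templates(test, target, d_match=5.0, phi_match=np.radians(10)):
    """Sketch of Equations 25-26. Each descriptor is (x, y, phi, length):
    Cartesian center, segment angle, and segment length. A pair matches
    when the center distance and angle difference fall within D_match and
    phi_match; matched pairs are scored here by summed length, which is
    one plausible weighting rather than a formula from the text."""
    score = 0.0
    for (xi, yi, pi, li) in test:
        for (xj, yj, pj, lj) in target:
            if (np.hypot(xi - xj, yi - yj) <= d_match
                    and abs(pi - pj) <= phi_match):
                score += li + lj               # m(Si, Sj) for a matched pair
                break                          # each test segment matches once
    return score                               # compare against a threshold M
```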
A system that may be used to implement the image processing method described above is shown in the accompanying figure.
In another embodiment, a system and method may use structure within an iris as well as patterns within a sclera to improve recognition accuracy and increase the degrees of freedom in subject recognition. The use of iris and sclera features for subject identification is referred to herein as "iris and sclera multimodal recognition". A method for implementing iris and sclera multimodal recognition includes eye illumination, eye image acquisition (sclera, iris, and pupil), image segmentation, feature extraction, feature registration, and template generation. This method is similar to the one described above except that the image segmentation retains the iris in the acquired image, the feature extraction includes structure within the iris, and the template generation identifies sclera patterns and iris structure. The iris feature extraction may use a wavelet-based method, a descriptor-based method, and/or a spatial-domain method. As described above, a stored template may be compared to a template obtained from another acquired eye image to identify the subject in the second image as corresponding or not corresponding to the subject of the stored template.
The comparison of templates in the iris and sclera multimodal system may use feature level fusion, template level fusion, and/or score level fusion. For example, the sclera and iris regions may be processed separately and the templates generated for each separate region may then be stored for later comparison. Templates generated from a later acquired image for both the iris and sclera areas may be separately compared to one or more stored templates to generate a pair of matching scores. If both matching scores are higher than the matching thresholds, the subjects are deemed the same. If one of the scores does not meet or exceed the matching threshold, the context of the recognition scenario may be used to determine the criteria for a match. For example, in a highly secured situation, one low matching score may be sufficient to evaluate a subject as not corresponding to the subject for a stored template. In a less secured scenario, such as access to a home computer, one matching score exceeding one threshold by a predetermined percentage may be adequate to declare the subjects as corresponding to one another. In a similar manner, sclera recognition may be combined with face recognition, skin tissue recognition, or some other biometric characteristic recognition system to improve recognition accuracy for the system.
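A minimal sketch of the score-level fusion logic described above follows; the relaxed-mode margin is an illustrative value, not one taken from the text.

```python
def multimodal_decision(iris_score, sclera_score,
                        iris_thr, sclera_thr, strict=True):
    """Score-level fusion as described above. In the strict (high-security)
    setting both scores must clear their thresholds; in the relaxed setting
    one score exceeding its threshold by a predetermined percentage is
    enough. The 1.10 margin is an illustrative assumption."""
    if iris_score >= iris_thr and sclera_score >= sclera_thr:
        return True                            # both modalities agree
    if not strict:
        return (iris_score >= 1.10 * iris_thr
                or sclera_score >= 1.10 * sclera_thr)
    return False
```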
Those skilled in the art will recognize that numerous modifications can be made to the specific implementations described above. Therefore, the following claims are not to be limited to the specific embodiments illustrated and described above. The claims, as originally presented and as they may be amended, encompass variations, alternatives, modifications, improvements, equivalents, and substantial equivalents of the embodiments and teachings disclosed herein, including those that are presently unforeseen or unappreciated, and that, for example, may arise from applicants/patentees and others.
This application claims priority from pending U.S. Provisional Application No. 61/144,508 filed on Jan. 14, 2009 and from U.S. Provisional Application No. 61/260,451 which was filed on Nov. 12, 2009.
The subject matter of this application was made in part with U.S. government support under grant N00014-07-1-0788 awarded by the Office of Naval Research. The U.S. government has certain rights to this invention.
Filing Document | Filing Date | Country | Kind | 371(c) Date
---|---|---|---|---
PCT/US10/20991 | 1/14/2010 | WO | 00 | 7/14/2011
Number | Date | Country
---|---|---
61144508 | Jan 2009 | US
61260451 | Nov 2009 | US