This invention concerns face recognition, and in particular a computer method for performing face recognition. In further aspects the invention concerns software to perform the method and a computer system programmed with the software.
Face recognition is becoming increasingly important, particularly for security purposes such as automatically providing or denying access.
Most face recognition techniques only work well under quite constrained conditions. In particular, the illumination, facial expressions and head pose must be tightly controlled for good recognition performance. Among the nuisance variations, pose variation is the hardest to model.
An earlier invention by the same inventors is a method for facial feature processing described in International (PCT) application PCT/2007/001169. This method comprises the steps of:
This earlier invention was shown to improve recognition accuracy by up to about 60%.
The present invention is a method for face recognition, comprising the steps of:
Although the present invention is in some ways similar to the earlier invention, there are several important distinctions. First, and most importantly, the present invention does not end with the synthesis of a frontal view. Further, the subsequent processing is different in each case. This technique may deliver accuracy of up to about 70%.
The present invention may use Active Shape Models (ASM), a simpler variant of Active Appearance Models (AAM) that models shape but not texture.
The pose independent features may be represented as a vector made up of parameters.
The pattern recognition techniques may involve measuring the similarity between the pose independent features of the face and the pose independent features of the gallery images.
The present invention may make use of pattern recognition techniques, such as the Mahalanobis distance or cosine measure, for classification.
The step of determining the orientation of the face may comprise determining the vertical and horizontal orientation of the face. This forms the basis for the pose angle of the face.
The step of removing the orientation of the face may comprise use of regression techniques.
The gallery may be comprised of pose independent features that each represent one member of the gallery. There may be only one pose independent feature for each member of the gallery. It is an advantage of at least one embodiment of the invention that multiple images of each member of the gallery, with their face in different poses, are not required.
The step of receiving the image may comprise capturing the image.
The method may be performed in real time.
In further aspects the present invention may extend to software to perform the method.
In yet a further aspect the present invention provides a computer system (hardware) programmed with the software to perform the method described above. The computer system may comprise:
An example of the process of the invention will now be described with reference to the accompanying drawings, in which:
Referring first to
The next step involves a computer performing an Active Appearance Models (AAM) search and applying regression techniques 14 to first estimate angles representing the horizontal and vertical orientations of the face.
Further processing then involves applying a correlation model to remove any pose effect 16 so that the pose independent features of the face can be represented as a vector of parameters.
Finally, the processing applies pattern recognition techniques to compare the face with faces previously stored in a gallery 18, in order to see whether a match with a member of the gallery can be made. If a match is made the face is recognised.
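The three processing steps just described (pose estimation 14, pose removal 16 and gallery matching 18) can be sketched in code. This is a minimal illustration only, assuming NumPy; the function and variable names are illustrative and not taken from the patent, and the cosine similarity used for matching is one of the two measures discussed later.

```python
import numpy as np

def recognise_face(c, c0, cc, cs, gallery):
    """Sketch of the pipeline: estimate pose (step 14), remove the pose
    effect (step 16), then match against the gallery (step 18).
    c is the AAM parameter vector; c0, cc, cs are the correlation model
    vectors; gallery maps member names to pose-independent feature vectors."""
    # Step 14: estimate the pose angle from the AAM parameter vector c.
    R = np.linalg.pinv(np.column_stack([cc, cs]))   # left pseudo-inverse of (cc|cs)
    xa, ya = R @ (c - c0)
    theta = np.arctan2(ya, xa)
    # Step 16: remove the pose effect to obtain a pose-independent feature.
    c_feature = c - (c0 + cc * np.cos(theta) + cs * np.sin(theta))
    # Step 18: nearest gallery member, here by cosine similarity.
    def cosine(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)
    best = max(gallery, key=lambda name: cosine(c_feature, gallery[name]))
    return theta, c_feature, best
```

Note that only a single pose-independent vector per gallery member is needed, matching the stated advantage that multiple poses per member are not required.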
The method may be performed in real time on a computer having application software installed to cause the computer to operate in accordance with the method. Referring to
The image of the person's face 34 is captured on a camera 32 of the system, either a still or video camera. This is received as input to the computer 30, such as by direct connection to an input port, over a local computer network (not shown) or over the internet (not shown). This image is processed according to steps 14, 16 and 18 described above. The representation of the captured face as pose independent features may also be stored in the memory of the computer 30. The gallery of images is stored in a database on memory 36 external to the computer 30, again by either direct connection, over a computer network (not shown) or over the Internet. Each record in the database corresponds to a member and comprises an image of the member's face and personal details.
The result of step 18 may be displayed on the monitor of the computer 30 or printed on a printer 40. This may show the image as captured and the image of the member that matched the captured face, together with the corresponding personal details.
Each stage of the process will now be described in greater detail under the following subheadings:
Given a collection of training images for a certain object class in which the feature points have been manually marked, the shape and texture can be represented by applying Principal Component Analysis (PCA) to the sample shape and texture distributions as:

x = x̄ + Qs c

g = ḡ + Qg c

where x is the shape vector, g is the texture vector, x̄ and ḡ are the mean shape and texture, Qs and Qg are matrices describing the modes of variation learned from the training set, and c is a vector of model parameters controlling both shape and texture.
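As a concrete illustration of building such a PCA model, the following is a minimal sketch assuming NumPy and flattened training vectors; the function names are hypothetical and the use of an SVD of the centred data to obtain the modes is a standard implementation choice, not a detail stated in the patent.

```python
import numpy as np

def build_pca_model(samples, n_modes):
    """Learn a mean and modes Q by PCA, so each sample s ≈ mean + Q @ c.
    samples is an (N, d) array: N training shapes (or textures), flattened."""
    mean = samples.mean(axis=0)
    # Principal modes via SVD of the centred data (eigenvectors of the
    # sample covariance, ordered by variance explained).
    _, _, Vt = np.linalg.svd(samples - mean, full_matrices=False)
    Q = Vt[:n_modes].T                    # (d, n_modes) modes of variation
    return mean, Q

def to_params(s, mean, Q):
    """Project a sample onto the model to get its parameter vector c."""
    return Q.T @ (s - mean)

def from_params(c, mean, Q):
    """Reconstruct a sample from parameters: mean + Q c."""
    return mean + Q @ c
```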
The model parameter c is related to the viewing angle, θ, approximately by a correlation model:
c = c0 + cc cos(θ) + cs sin(θ)
where c0, cc and cs are vectors which are learned from the training data. This considers only head turning, but nodding can be dealt with in a similar way.
For each image in the training set labelled with pose θi, the process performs an Active Appearance Models (AAM) search to find the best fitting model parameters ci; then c0, cc and cs can be learned by regression from the vectors {ci} and the vectors {(1, cos θi, sin θi)′}.
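The regression step above amounts to an ordinary least-squares fit of the three vectors c0, cc and cs against the design rows (1, cos θi, sin θi). A minimal sketch, assuming NumPy (the function name is illustrative):

```python
import numpy as np

def fit_correlation_model(C, thetas):
    """Least-squares fit of c ≈ c0 + cc·cos(θ) + cs·sin(θ).
    C is an (N, d) array of AAM parameter vectors ci;
    thetas is an (N,) array of the corresponding poses in radians."""
    # Design matrix with rows (1, cos θi, sin θi), as in the text.
    A = np.column_stack([np.ones_like(thetas), np.cos(thetas), np.sin(thetas)])
    # Solve A @ [c0; cc; cs] ≈ C for all three d-dimensional vectors at once.
    coeffs, *_ = np.linalg.lstsq(A, C, rcond=None)
    c0, cc, cs = coeffs
    return c0, cc, cs
```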
Given a new face image with parameters c, the process can estimate the orientation as follows. The process first transforms c = c0 + cc cos(θ) + cs sin(θ) to:

(cc|cs)(cos(θ), sin(θ))′ = c − c0

Let Rc− be the left pseudo-inverse of the matrix (cc|cs); then this becomes:

(cos(θ), sin(θ))′ = Rc−(c − c0)

Let (xα, yα)′ = Rc−(c − c0); then the best estimate of the orientation is:

θ = tan−1(yα/xα)
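The pseudo-inverse computation above can be written directly in NumPy. A minimal sketch (the function name is illustrative); `arctan2` is used rather than a plain arctangent so the quadrant of θ is recovered correctly:

```python
import numpy as np

def estimate_pose(c, c0, cc, cs):
    """Recover θ from c ≈ c0 + cc·cos(θ) + cs·sin(θ) via the left
    pseudo-inverse Rc− of the matrix (cc|cs), then θ = tan−1(yα/xα)."""
    Rc = np.linalg.pinv(np.column_stack([cc, cs]))  # Rc−, shape (2, d)
    xa, ya = Rc @ (c - c0)                          # (xα, yα)′
    return np.arctan2(ya, xa)
```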
After the process acquires the angle θ, the correlation model is used to remove the pose effect. The expression c0 + cc cos(θ) + cs sin(θ) represents the standard parameter vector at pose θ; note that it is fixed for a specific angle θ and changes as the pose changes. Let cfeature be the feature vector generated by removing the pose effect predicted by the correlation model:

cfeature = c − (c0 + cc cos(θ) + cs sin(θ))
Given any face image, the process can use Active Appearance Model (AAM) to estimate face model parameters c and use the correlation model as described above to remove the pose effect. Each face image then can be characterized by cfeature, which is pose independent.
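The pose-removal step is a direct subtraction of the correlation model's prediction at the estimated pose. A minimal sketch, assuming NumPy (the function name is illustrative):

```python
import numpy as np

def pose_free_feature(c, c0, cc, cs, theta):
    """Subtract the standard parameter vector at pose θ (the correlation
    model's prediction) to leave a pose-independent residual cfeature."""
    return c - (c0 + cc * np.cos(theta) + cs * np.sin(theta))
```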
Both the gallery face images and the given unknown face image can be represented by the parameter vector cfeature. Recognising a given face image is then a problem of measuring the similarity between the parameter vector of the given face image and the vectors of the gallery images stored in the database. In experiments, two different pattern recognition techniques were used for classification: Mahalanobis distance and cosine measure; these are described in detail below.
Mahalanobis distance is a distance measure first introduced by P. C. Mahalanobis in 1936. It is a useful tool for measuring the similarity between an unknown sample and a known one. It differs from Euclidean distance in that it takes into account the variability of the data set. Mahalanobis distance can be defined as:
d(x, y) = √((x − y)′Σ−1(x − y))

where x and y are two vectors from the same distribution, and Σ is the covariance matrix of that distribution.
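A minimal sketch of this distance, assuming NumPy (the function name is illustrative; in practice Σ would be estimated from the gallery's feature vectors):

```python
import numpy as np

def mahalanobis(x, y, cov):
    """d(x, y) = sqrt((x − y)′ Σ⁻¹ (x − y)) for covariance matrix Σ."""
    diff = x - y
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))
```

With Σ equal to the identity matrix this reduces to the ordinary Euclidean distance; a larger variance along a given axis shrinks that axis's contribution, which is exactly how the measure accounts for the variability of the data set.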
Cosine measure is a technique that measures the angle between vectors of different classes with respect to the origin. It can be described by the equation:

cos(X, Z) = X′Z / (‖X‖ ‖Z‖)

where X and Z are two vectors. A larger angle between two vectors represents a greater separation between the two classes.
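A minimal sketch of the cosine measure, assuming NumPy (the function name is illustrative):

```python
import numpy as np

def cosine_measure(x, z):
    """cos(x, z) = x′z / (‖x‖‖z‖): 1 for parallel vectors, 0 for
    orthogonal ones; a larger angle means greater class separation."""
    return float(x @ z / (np.linalg.norm(x) * np.linalg.norm(z)))
```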
Results for High Angle Faces from Experiments
Using the face model and the trained correlation model, the process was applied using the pose-independent feature on a database, to compare its performance with that of synthesis-based APCA and synthesis-based PCA methods. Each face image is represented by a cfeature vector of 43 dimensions. Both Mahalanobis distance and cosine measure were tried for classification.
From the recognition results in
Results for Frontal Faces from Experiments
To evaluate the performance of recognition by measuring pose-independent features on frontal faces, a dataset was formed by randomly selecting 200 frontal face images from the FERET database (NIST 2001). Both APCA and the present process were tested on this dataset. Table 1 shows that APCA can reach a 95% recognition rate on the frontal face images, the same as reported earlier (Chen and Lovell 2004; Lovell and Chen 2005); the present process, measuring the pose-independent feature with either Mahalanobis distance or cosine measure, reaches a 98% recognition rate, which shows that the process is also robust for frontal faces.
The invention can be applied to security applications, such as seeking to identify a person whose face is captured by a camera. Other applications include searching a set of photographs to automatically locate images (still or video) that include the face of a particular person. Further, the invention could be used to automatically organise images (still or video) into groups, where each group is defined by the presence of a particular person's face, or particular persons' faces, in the captured images.
Although the invention has been described with reference to a particular example, it should be appreciated that it could be exemplified in many other forms and in combination with other features not mentioned above.
Number | Date | Country | Kind |
---|---|---|---|
2007902984 | Jun 2007 | AU | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/AU08/00760 | 5/29/2008 | WO | 00 | 5/7/2010 |