1. Field of the Invention
The present invention relates to an object detection apparatus which is used to detect a specific object such as a face in an input image.
2. Description of the Related Art
Conventionally, examples of a method of detecting a specific object such as a face in an input image include a method of applying template matching to reduced images hierarchically generated from the input image (p. 203, Digital Image Processing, CG-ARTS Society) and a method of converting the input image into an image called an integral image and summing weights corresponding to rectangular feature amounts of various sizes (U.S. Patent Application Publication No. 2002/0102024 A1). A method of narrowing down object candidates in the hierarchical images using motion information or color information has been proposed as a method of reducing the processing time (see Japanese Patent Laid-Open No. 2000-134638).
In the conventional techniques, a determination as to whether or not the specific object exists in a determination region is made while the determination region is moved slightly over the input image. In the method using template matching, a correlation or a sum of squared differences is frequently used in the matching, and the computation takes a long time. It has been confirmed that the method using the integral image operates at relatively high speed on a personal computer. However, a large memory resource is required to perform the conversion into the integral image and the computation of the rectangular feature amounts, and a large load is also placed on the CPU. Therefore, the method using the integral image is not suitable for implementation on a device.
The method of narrowing down the object candidates with the motion information or color information can hardly be applied when the specific object does not move. Additionally, because the color information heavily depends on the light source color and the like, it is difficult to perform stable detection.
In view of the foregoing, an object of the invention is to provide an object detection apparatus which can perform high-speed processing with high accuracy while reducing the memory resource and the load on the CPU.
An object detection apparatus according to a first aspect of the invention which detects a specific object in an input image, including specific object detection means for performing a specific object detecting process of setting the input image or a reduced image of the input image as a target image, and of determining whether or not the specific object exists in a determination region while scanning the determination region in the target image or an edge feature image of the target image, wherein the specific object detection means includes determination means for determining whether or not the specific object exists in the determination region, based on an edge feature amount of the edge feature image corresponding to the determination region, and a previously determined relationship between an edge feature amount and a weight indicating object likelihood for each predetermined feature pixel in an image having the same size as the determination region.
An object detection apparatus according to a second aspect of the invention which detects a specific object in an input image, including: reduced-image generating means for generating one or a plurality of reduced images from the input image; and specific object detection means for performing a specific object detecting process of setting each of a plurality of hierarchical images as a target image, and of determining whether or not the specific object exists in a determination region while scanning the determination region in the target image or an edge feature image of the target image, the plurality of hierarchical images including the input image and one or a plurality of reduced images of the input image, wherein the specific object detection means includes determination means for determining whether or not the specific object exists in the determination region, based on an edge feature amount of the edge feature image corresponding to the determination region, and a previously determined relationship between an edge feature amount and a weight indicating object likelihood for each predetermined feature pixel in an image having the same size as the determination region.
In the object detection apparatus according to the first or second aspect of the invention, preferably the specific object detection means includes a specific object detecting table, which is previously prepared from a plurality of sample images including the specific object and stores the previously determined relationship between an edge feature amount and a weight indicating object likelihood for each predetermined feature pixel in the image having the same size as the determination region; and the determination means determines whether or not the specific object exists in the determination region based on an edge feature amount of the edge feature image corresponding to the determination region and the specific object detecting table.
In the object detection apparatus according to the first or second aspect of the invention, preferably the determination means includes plural determination processing means which use different numbers of feature pixels in the determination for the determination region at any position, the determination processes are performed in ascending order of the number of feature pixels used in the determination, and, when any determination processing means determines that the specific object does not exist, the processes of the subsequent determination processing means are aborted.
In the object detection apparatus according to the first or second aspect of the invention, preferably the edge feature image is plural kinds of edge feature images having different edge directions.
In the object detection apparatus according to the first or second aspect of the invention, preferably the specific object detection means performs the specific object detecting process using a determination region having a single size and one kind of the specific object detecting table corresponding to the size of the determination region.
In the object detection apparatus according to the first or second aspect of the invention, preferably the specific object detection means prepares plural kinds of the determination regions having the different sizes, the specific object detection means holds the plural specific object detecting tables according to the plural kinds of the determination regions, the specific object detection means sets the plural kinds of the determination regions in the target image or the edge feature image of the target image, and the specific object detection means performs the specific object detecting process in each set determination region using the specific object detecting table corresponding to the determination region.
In the object detection apparatus according to the second aspect of the invention, preferably the specific object detection means prepares the determination region having the different size in each hierarchical target image, the specific object detection means holds the plurality of specific object detecting tables according to the determination regions, the specific object detection means performs a specific object roughly-detecting process to a lower hierarchical target image or the edge feature image of the lower hierarchical target image using the determination region corresponding to the lower hierarchy and the specific object detecting table corresponding to the determination region of the lower hierarchy when the specific object detection means performs the specific object detecting process to an arbitrary hierarchy, and the specific object detection means performs the specific object detecting process to the hierarchical target image or the edge feature image of the hierarchical target image using the determination region corresponding to the arbitrary hierarchy and the specific object detecting table corresponding to the determination region of the arbitrary hierarchy when a face is detected in the specific object roughly-detecting process.
In the object detection apparatus according to the first or second aspect of the invention, preferably the specific object detection means prepares plural kinds of the determination regions having the different sizes, the specific object detection means holds the plural specific object detecting tables according to the plural kinds of the determination regions and a specific object roughly-detecting table for detecting faces having all the sizes, the face being able to be detected by each determination region, the specific object detection means sets a common determination region including all the kinds of the determination regions in the target image or the edge feature image of the target image, the specific object detection means performs the specific object roughly-detecting process using the specific object roughly-detecting table, and the specific object detection means sets the plural kinds of the determination regions in the target image or the edge feature image of the target image and performs the specific object detecting process in each set determination region using the specific object detecting table corresponding to the determination region when a face is detected in the specific object roughly-detecting process.
In the object detection apparatus according to the first or second aspect of the invention, preferably the edge feature image is an edge feature image corresponding to each of the four directions of a horizontal direction, a vertical direction, an obliquely upper right direction, and an obliquely upper left direction, the feature pixel of the specific object detecting table is expressed by an edge number indicating an edge direction and an xy coordinate, a position in which the edge number of the feature pixel and/or the xy coordinate is converted by a predetermined rule is used as a position on the edge feature image corresponding to any feature pixel of the specific object detecting table, and the specific object which is rotated by a predetermined angle with respect to a default rotation angle position of the specific object can be detected by the post-conversion position.
In the object detection apparatus according to the first or second aspect of the invention, preferably the edge feature image is an edge feature image corresponding to each of the four directions of a horizontal direction, a vertical direction, an obliquely upper right direction, and an obliquely upper left direction, the feature pixel of the specific object detecting table is expressed by an edge number indicating an edge direction and an xy coordinate, a position in which the edge number of the feature pixel and/or the xy coordinate is converted by a predetermined rule is used as a position on the edge feature image corresponding to any feature pixel of the specific object detecting table, and the specific object in which a default attitude is horizontally flipped or the specific object in which a default attitude is vertically flipped can be detected by the post-conversion position.
In the object detection apparatus according to the first or second aspect of the invention, preferably weights indicating the object likelihood are stored in the specific object detecting table for each predetermined feature pixel of the image having the same size as the determination region, the weights corresponding to the respective edge feature amounts which are possibly taken in the feature pixel.
In the object detection apparatus according to the first or second aspect of the invention, preferably coefficients of a polynomial are stored in the specific object detecting table for each predetermined feature pixel of the image having the same size as the determination region, the polynomial representing the relationship between the edge feature amounts possibly taken in the feature pixel and the weights indicating the object likelihood.
A face detection apparatus according to a preferred embodiment of the invention will be described below with reference to the drawings.
The face detection apparatus of the first embodiment includes AD conversion means 11, reduced image generating means 12, four-direction edge feature image generating means 13, a memory 14, face determination means 15, and detection result output means 16. The AD conversion means 11 converts an input image signal into digital data. The reduced image generating means 12 generates one or plural reduced images based on the image data obtained by the AD conversion means 11. The four-direction edge feature image generating means 13 generates an edge feature image for each of four directions in each hierarchical image, the hierarchical images being formed by the input image and the reduced images. A face detecting weighting table obtained from a large amount of teacher samples (face sample images and non-face sample images) is stored in the memory 14. The face determination means 15 determines whether or not a face exists in the input image using the weighting table and the edge feature images of the four directions generated by the four-direction edge feature image generating means 13. The detection result output means 16 outputs the detection result of the face determination means 15. When a face is detected, the detection result output means 16 outputs the size and position of the detected face with respect to the input image.
First the input image is obtained (Step S1), and one or plural reduced images are generated from the input image using a predetermined reduction ratio (Step S2). The edge feature image is generated in each of the four directions in each hierarchical image which is formed by the input image and the reduced image (Step S3). A face detecting process is performed using each edge feature image and the weighting table (Step S4), and the detection result is delivered (Step S5). When a command for ending the face detection is not inputted (Step S6), the flow returns to Step S1. When the command for ending the face detection is inputted in Step S6, the flow is ended.
In the example of
The hierarchical image to be processed is inputted (Step S11). An edge enhancement process is performed to the inputted hierarchical image to respectively generate the first edge enhancement images corresponding to the four directions using a Prewitt type differentiation filter which corresponds to each of the four directions of a horizontal direction, a vertical direction, an obliquely upper right direction, and an obliquely upper left direction as shown in
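For illustration, the following sketch computes four directional edge-strength images with 3x3 Prewitt-type kernels; the exact kernel coefficients and any subsequent selection or smoothing used by the embodiment are not given in the text, so the kernels and the absolute-value edge strength below are assumptions.

```python
import numpy as np
from scipy.ndimage import convolve

# Assumed 3x3 Prewitt-type kernels for the four edge directions; the exact
# coefficients used by the embodiment are not given in the text.
KERNELS = {
    0: np.array([[-1, -1, -1],   # horizontal edges
                 [ 0,  0,  0],
                 [ 1,  1,  1]], dtype=np.float32),
    1: np.array([[-1,  0,  1],   # vertical edges
                 [-1,  0,  1],
                 [-1,  0,  1]], dtype=np.float32),
    2: np.array([[ 0,  1,  1],   # obliquely upper right edges
                 [-1,  0,  1],
                 [-1, -1,  0]], dtype=np.float32),
    3: np.array([[ 1,  1,  0],   # obliquely upper left edges
                 [ 1,  0, -1],
                 [ 0, -1, -1]], dtype=np.float32),
}

def four_direction_edge_images(gray):
    """Return a dict q -> edge feature image (absolute edge strength per pixel)
    for one hierarchical image given as a 2-D grayscale array."""
    gray = np.asarray(gray, dtype=np.float32)
    return {q: np.abs(convolve(gray, k, mode="nearest")) for q, k in KERNELS.items()}
```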
(5-1) Weighting Table
Although the face detecting process in Step S4 of
For the determination region 41 of this size, it is assumed that a pixel position in each edge feature image is expressed by a kind q of the edge feature image (edge number: 0 to 3), a row number y (0 to 7), and a column number x (0 to 7). Weights w are stored in the weighting table. For each feature pixel used for the face detection among the pixels of the edge feature images, the weight w indicates the face likelihood corresponding to the feature amount (pixel value) of that pixel.
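One possible in-memory layout for such a weighting table is sketched below. The 8 by 8 region and the edge numbers 0 to 3 follow the text, while the container type and the number of value levels are assumptions; in practice only the feature pixels selected by the learning would be stored.

```python
import numpy as np

EDGE_DIRECTIONS = 4    # q = 0..3: the four edge directions
REGION_SIZE = 8        # y, x = 0..7 for the determination region of this example
VALUE_LEVELS = 256     # assumed number of levels of the feature amount (pixel value)

# weighting_table[(q, y, x)][i] holds the weight w (face likelihood) for the
# feature pixel (q, y, x) when its feature amount (pixel value) equals i.
weighting_table = {
    (q, y, x): np.zeros(VALUE_LEVELS, dtype=np.float32)
    for q in range(EDGE_DIRECTIONS)
    for y in range(REGION_SIZE)
    for x in range(REGION_SIZE)
}

def weight(q, y, x, pixel_value):
    """Look up the weight w for feature pixel (q, y, x) and feature amount pixel_value."""
    return weighting_table[(q, y, x)][pixel_value]
```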
In the example of
Such weighting tables can be produced using a known learning method called AdaBoost (Yoav Freund and Robert E. Schapire, “A decision-theoretic generalization of on-line learning and an application to boosting”, European Conference on Computational Learning Theory, Sep. 20, 1995).
AdaBoost is an adaptive boosting learning method. In AdaBoost, plural weak classifiers which are effective for classification are selected from plural weak classifier candidates based on a large amount of teacher samples, and a high-accuracy classifier is realized by weighting and integrating the selected weak classifiers. As used herein, a weak classifier means a classifier whose accuracy alone is not sufficient but whose classification capability is better than pure chance. When a weak classifier is selected while already-selected weak classifiers exist, the learning focuses on the teacher samples which are wrongly recognized by the already-selected weak classifiers, so that the weak classifier having the highest effect is selected from the remaining weak classifier candidates.
The face detecting process is performed in each hierarchical image using the weighting table and the four-direction edge feature images corresponding to the determination region set in the image.
(5-2) Procedure of Face Detecting Process
The face detecting process includes determination steps from a first determination step (Step S21) to a sixth determination step (Step S26). The determination steps differ from one another in the number of feature pixels N used for the determination. In the first determination step (Step S21) to the sixth determination step (Step S26), the numbers of feature pixels N used for the determination are set to N1 to N6 (N1<N2<N3<N4<N5<N6) respectively.
When the face is not detected in a certain determination step, the flow does not go to the next determination step, but it is determined that the face does not exist in the determination region. Only when the face is detected in all the determination steps, it is determined that the face exists in the determination region.
(5-3) Procedure of Determination Process in Each Determination Step
The case where the determination is made for one determination region using N feature pixels will be described below. The determination region is set (Step S31), and a variable S indicating a score is set to zero while a variable n indicating the number of feature pixels for which the weight has been obtained is set to zero (Step S32).
A feature pixel F(n) is selected (Step S33). As described above, the feature pixel F(n) is expressed by the edge number q, the row number y, and the column number x. In the example of
A pixel value i(n) corresponding to the selected feature pixel F(n) is obtained from the edge feature image corresponding to the determination region (Step S34). A weight w(n) corresponding to the pixel value i(n) of the feature pixel F(n) is obtained from the weighting table (Step S35). The obtained weight w(n) is added to the score S (Step S36).
Then, n is incremented by one (Step S37). It is determined whether or not n is equal to N (Step S38). When n is not equal to N, the flow returns to Step S33, and the processes of Steps S33 to S38 are performed using the updated n.
When the processes of Steps S33 to S36 have been performed for the N feature pixels, n becomes equal to N in Step S38, so that the flow goes to Step S39. In Step S39, it is determined whether or not the score S obtained for the N feature pixels is more than a predetermined threshold Th. When the score S is more than the threshold Th, it is determined that the face exists in the determination region (Step S40). On the other hand, when the score S is not more than the threshold Th, it is determined that the face does not exist in the determination region (Step S41).
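A minimal sketch of this determination procedure (Steps S31 to S41), together with the coarse-to-fine sequence of determination steps described above, might look as follows. The feature-pixel lists, thresholds, and table layout are assumptions carried over from the earlier sketch, and score_region and face_in_region are hypothetical helper names.

```python
def score_region(edge_images, region_origin, feature_pixels, table):
    """Accumulate the weights w(n) over the N feature pixels of one determination
    region (Steps S31 to S38).  edge_images: dict q -> edge feature image;
    table: dict (q, y, x) -> array of weights indexed by pixel value."""
    oy, ox = region_origin                         # top-left corner of the determination region
    score = 0.0                                    # score S (Step S32)
    for (q, y, x) in feature_pixels:               # select feature pixel F(n) (Step S33)
        i_n = int(edge_images[q][oy + y, ox + x])  # pixel value i(n), assumed within the table range (Step S34)
        score += table[(q, y, x)][i_n]             # add weight w(n) to the score (Steps S35-S36)
    return score

def face_in_region(edge_images, region_origin, stages, table):
    """stages: list of (feature_pixels, threshold) pairs ordered by increasing
    number of feature pixels (N1 < N2 < ...).  The face is judged to exist only
    when every determination step detects it; otherwise the remaining steps are aborted."""
    for feature_pixels, threshold in stages:
        if score_region(edge_images, region_origin, feature_pixels, table) <= threshold:
            return False                           # face not detected: abort subsequent steps
    return True
```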
As shown in
The face detecting process includes a first determination step (Step S121), a second determination step (Step S123), and a third determination step (Step S124). The determination steps differ from one another in the number of feature pixels N used for the determination. In the first to third determination steps, the numbers of feature pixels N used for the determination are set to N1 to N3 (N1<N2<N3), respectively. The process similar to the process shown in
When the face is not detected in the first determination step (Step S121), the flow does not go to the next determination step, but it is determined that the face does not exist in the determination region. When the face is detected in the first determination step (Step S121), it is determined whether or not the score S computed in the first determination step is equal to or higher than a predetermined value (Step S122). In the first determination step, this predetermined value is set larger than the threshold Th used to determine whether or not the face exists.
When the score S is lower than the predetermined value, the flow goes to the second determination step (Step S123). In Step S123, the process of the second determination step is performed like
In the first embodiment, the face detecting process is performed using the weighting table. As shown in
In the modification, a coefficient table in which coefficients of a polynomial are stored for each feature pixel used for the face detection is used in place of the weighting table. The coefficient table is produced from the same data as the weighting table. A method of producing the coefficient table for a certain feature pixel will be described below.
Usually, a least squares method is used to fit the table values to a polynomial curve. That is, the coefficients of the function are determined so as to minimize, over the pixel values, the square of the difference between the table value and the function with which the table values are approximated. Assuming that w(n) is the weight of the feature pixel F(n) and i(n) is the pixel value of the feature pixel F(n), the third-order (cubic) fitting function is expressed by the following equation (1).
w(n)=a3·i(n)^3+a2·i(n)^2+a1·i(n)+a0 (1)
The coefficient values a0, a1, a2, and a3 are determined for each feature pixel such that the sum over the pixel values of the squared difference between the table value and the function becomes minimum.
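For illustration, a least-squares fit of the table values of one feature pixel to the cubic of equation (1) might be sketched as follows; the use of numpy.polyfit and the number of value levels are assumptions, not necessarily what the embodiment uses.

```python
import numpy as np

def fit_cubic(table_values):
    """Fit the weights of one feature pixel over all pixel values (e.g. 0..255)
    with the cubic of equation (1) by least squares and return (a0, a1, a2, a3)."""
    table_values = np.asarray(table_values, dtype=np.float64)
    i = np.arange(table_values.size, dtype=np.float64)
    a3, a2, a1, a0 = np.polyfit(i, table_values, deg=3)   # least-squares cubic fit
    return a0, a1, a2, a3

def weight_from_coefficients(coeffs, pixel_value):
    """Evaluate equation (1): w = a3*i^3 + a2*i^2 + a1*i + a0."""
    a0, a1, a2, a3 = coeffs
    return ((a3 * pixel_value + a2) * pixel_value + a1) * pixel_value + a0
```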
In the case where the coefficient table is used in place of the weighting table, a determination process shown in
The case where the determination is made using the N feature pixels will be described. First the determination region is set (Step S131). The variable S indicating the score is set to zero while the variable n indicating the number of feature pixels in which the weight is obtained is set to zero (Step S132).
Then, the feature pixel F(n) is selected (Step S133). The feature pixel F(n) is expressed by the edge number q, the row number y, and the column number x. In the example of
A pixel value i(n) corresponding to the selected feature pixel F(n) is obtained from the edge feature image corresponding to the determination region (Step S134). The coefficient values a0, a1, a2, and a3 of the polynomial corresponding to the feature pixel F(n) are obtained from the coefficient table (Step S135). The weight w(n) is computed from the polynomial of the equation (1) using the obtained pixel value i(n) and the coefficient values a0, a1, a2, and a3 (Step S136). The obtained weight w(n) is added to the score S (Step S137).
Then, n is incremented by one (Step S138). It is determined whether or not n is equal to N (Step S139). When n is not equal to N, the flow returns to Step S133, and the processes of Steps S133 to S139 are performed using the updated n.
When the processes of Steps S133 to S138 have been performed for the N feature pixels, n becomes equal to N in Step S139, so that the flow goes to Step S140. In Step S140, it is determined whether or not the score S obtained for the N feature pixels is more than a predetermined threshold Th. When the score S is more than the threshold Th, it is determined that the face exists in the determination region (Step S141). On the other hand, when the score S is not more than the threshold Th, it is determined that the face does not exist in the determination region (Step S142).
The weighting table of
The detection ratio shown on the vertical axis means a ratio of the number of faces which are successfully detected to the total number of faces included in the evaluation images. The false detection ratio shown on the horizontal axis means a ratio of the number of times at which a non-face portion is wrongly detected as a face to the number of evaluation images. The relationship between the detection ratio and the false detection ratio draws one curve as a setting value (threshold Th) of the detection sensitivity is changed. Each plot (round or square point) on a polygonal line graph of
Preferably the detection ratio is increased, and the data indicating the relationship between the detection ratio and the false detection ratio is located on the upper side in
The reason is as follows. The weight w of the weighting table is computed based on the large amount of learning data (image data). As shown by the polygonal line of
On the other hand, in the case of the use of the polynomial, the weight is expressed by the curve of
In the modification, the polynomial is used as the function (fitting function) which approximates the table value in each pixel value of the feature pixel. Alternatively, a mixed Gaussian distribution may be used as the fitting function. That is, the table value in each pixel value of the feature pixel is approximated by overlapping the plural Gaussian distributions.
Assuming that w(n) is the weight of the feature pixel F(n) and i(n) is the pixel value of the feature pixel F(n), the fitting function with the mixed Gaussian distribution is expressed by the following equation (2).
w(n)=Σ am·exp{−(i(n)−bm)^2/cm} (2)
Here, M is the number of Gaussian distributions, am (m=1, 2, . . . , M) is a composite (mixture) coefficient, bm (m=1, 2, . . . , M) is a mean, and cm (m=1, 2, . . . , M) is a variance. These parameters are stored in the coefficient table.
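A sketch of the mixed-Gaussian fitting function of equation (2) is given below; the number of components M, the initial guesses, and the use of scipy.optimize.curve_fit are assumptions for illustration only.

```python
import numpy as np
from scipy.optimize import curve_fit

def mixed_gaussian(i, *params):
    """Equation (2): w(i) = sum_m a_m * exp(-(i - b_m)^2 / c_m),
    with params packed as (a1, b1, c1, a2, b2, c2, ...)."""
    i = np.asarray(i, dtype=np.float64)
    w = np.zeros_like(i)
    for a, b, c in zip(params[0::3], params[1::3], params[2::3]):
        w += a * np.exp(-((i - b) ** 2) / c)
    return w

def fit_mixed_gaussian(table_values, M=2):
    """Fit M Gaussian components to the table values of one feature pixel.
    M and the initial guesses are assumptions for illustration."""
    table_values = np.asarray(table_values, dtype=np.float64)
    i = np.arange(table_values.size, dtype=np.float64)
    p0 = []
    for m in range(M):   # crude initial guess: equal amplitudes, means spread over the range
        p0 += [float(table_values.max()), (m + 1) * table_values.size / (M + 1), 1000.0]
    params, _ = curve_fit(mixed_gaussian, i, table_values, p0=p0, maxfev=10000)
    return params        # (a_m, b_m, c_m) triples stored in the coefficient table
```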
In the second and subsequent embodiments, the case where the weighting table is used will be described. However, the coefficient table may be used instead.
A second embodiment is characterized in that the number of kinds of the generated reduced images can be decreased compared with the first embodiment although the number of kinds of detectable face sizes is equal to that of the first embodiment.
First the input image is obtained (Step S51), and one or plural reduced images are generated from the input image (Step S52). The edge feature image is generated in each of the four directions in each hierarchical image which is formed by the input image and the reduced image (Step S53). The face detecting process is performed using each edge feature image and the weighting table (Step S54), and the detection result is delivered (Step S55). When the command for ending the face detection is not inputted (Step S56), the flow returns to Step S51. When the command for ending the face detection is inputted in Step S56, the flow is ended.
In the reduced image generating process in Step S52, as shown in
In the second embodiment, because the number of kinds of the detectable face sizes is equalized to that of the first embodiment, the face detection is performed using the determination regions 51, 52, and 53 corresponding to the three kinds of face sizes. The sizes of the determination regions 51, 52, and 53 are set to T1 by T1, T2 by T2, and T3 by T3 matrices, respectively, and R denotes the reduction ratio used in the first embodiment. Then, T1, T2, and T3 are determined such that the following equation (3) holds.
T1=R×T2
T2=R×T3
T1=R^2×T3 (3)
Letting R=0.8 and T1=24 leads to T2=30 and T3=37.5. However, T3 is set to 36 for computational convenience. The three kinds of weighting tables are previously produced according to the three kinds of determination regions and stored in the memory.
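These relationships can be verified with a few lines (a sketch using the values quoted above):

```python
R = 0.8        # reduction ratio of the first embodiment
T1 = 24        # side of the smallest determination region
T2 = T1 / R    # 30.0, since T1 = R * T2
T3 = T2 / R    # 37.5, since T2 = R * T3 (equivalently T1 = R**2 * T3)
T3 = 36        # set to 36 for computational convenience, as in the text
print(T1, T2, T3)   # -> 24 30.0 36
```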
As with the first embodiment, the face detecting process in Step S54 is performed in each hierarchical image. However, the face detecting process is performed to each hierarchical image using the three kinds of the determination regions 51, 52, and 53.
In the second embodiment, the face detecting process is performed to each of the three kinds of the determination regions 51, 52, and 53.
The face detecting process which is performed to the determination region 51 having the T1 by T1 matrix in the input image includes determination steps from a first determination step (Step S61) to a fifth determination step (Step S65). The determination steps differ from one another in the number of feature pixels N used for the determination. In the first determination step (Step S61) to the fifth determination step (Step S65), the numbers of feature pixels N used for the determination are set to N1 to N5 (N1<N2<N3<N4<N5) respectively. When the face is not detected in a certain determination step, the flow does not go to the next determination step, but it is determined that the face does not exist in the determination region. Only when the face is detected in all the determination steps, it is determined that the face exists in the determination region 51. The determination process performed in each determination step is similar to that of
As with the face detecting process performed to the determination region 51, the face detecting process which is performed to the determination region 52 having the T2 by T2 matrix in the input image also includes determination steps from a first determination step (Step S71) to a fifth determination step (Step S75). As with the face detecting process performed to the determination region 51, the face detecting process which is performed to the determination region 53 having the T3 by T3 matrix in the input image also includes determination steps from a first determination step (Step S81) to a fifth determination step (Step S85).
In the second embodiment, because the number of reduced images is smaller than that of the first embodiment, the processing amount is remarkably decreased in both the reduction process and the process of generating the edge feature image of each of the four directions. On the other hand, because the face detecting process is performed for each of the plural kinds of determination regions having different sizes, the number of face detecting processes per image is increased when all the determination steps are processed. However, for a determination region where the face does not exist, it is determined in the first-half determination steps, in which only a small number of feature pixels is used, that the face does not exist. Therefore, the processing can be performed at relatively high speed. As a result, compared with the first embodiment, the overall processing amount is decreased and high-speed processing is achieved.
(1-1) In the Case of −90°, +90°, and +180° Rotation Angles
An image 61 of
In order to detect the faces having the different rotation angle positions using the one kind of weighting table produced for the default rotational position, it is necessary to rotate the input image and to generate the edge feature images of the four directions from each post-rotation image. However, the processing amount is increased because not only the rotating process is required but also the edge feature images must be generated for each post-rotation image.
Alternatively, weighting tables may be prepared for the other rotation angle positions (+90°, −90°, 180°) in addition to the weighting table produced for the default rotational position, and the face detection may be performed for a determination region at any position in each rotation angle position using the corresponding weighting table. In this method, it is not necessary to rotate the image, but it is necessary to produce and hold a weighting table for each rotation angle position.
The third embodiment is characterized in that the faces having the different rotation angle positions can be detected without rotating the input image using the one kind of the weighting table produced for the default rotational position.
In the upper portion of
In the four-direction edge feature images corresponding to the face image which is rotated by +90°, the feature points a to f assigned in the weighting table emerge as shown in the lower portion of
The feature point e corresponding to the obliquely upper right edge direction assigned in the weighting table emerges in the obliquely upper left edge feature image in the edge feature images corresponding to the face image which is rotated by +90°. The feature point f corresponding to the obliquely upper left edge direction assigned in the weighting table emerges in the obliquely upper right edge feature image in the edge feature images corresponding to the face image which is rotated by +90°.
Assuming that x and y are an xy coordinate of the feature point assigned in the weighting table while X and Y are an xy coordinate of the feature point in the edge feature image corresponding to the face image which is rotated by +90°, the relationship between the xy coordinates becomes the relationship between a point P and a point P2 of
X=Tx−y
Y=x (4)
As shown in
A relationship shown in Table 1 holds between the position (q,y,x) of the feature point assigned in the weighting table and the position (Q,Y,X) of the corresponding feature point on the face image (edge feature image) which is rotated by +90°. Similarly a relationship shown in Table 1 holds between the position (q,y,x) of the feature point assigned in the weighting table and the position (Q,Y,X) of the corresponding feature point on the face image (edge feature image) which is rotated by −90° or 180°. In the face detection in which models such as a profile face and an oblique face are used, sometimes the detection-target face image is flipped horizontally or vertically. There is a relationship shown in Table 1 between the position (q,y,x) of the feature point assigned in the weighting table and the position (Q,Y,X) of the corresponding feature point on the horizontally or vertically-flipped face image (edge feature image).
The relationship of the point P and the point P1 shown in
Using the weighting table produced for the default rotational position, the face image in which the default face image is rotated by +90°, −90°, or 180° and the face image (edge feature image) in which the default face image is horizontally or vertically flipped can be detected by utilizing the relationships in Table 1.
Specifically, for example, in the case where the face image which is rotated by +90° is detected, when the feature pixel F(n) is selected in Step S33 of
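A sketch of such a position conversion for the +90° case (and, analogously, a horizontal flip) is given below. Table 1 is not reproduced here, so the mapping of the horizontal and vertical edge numbers, the edge-number assignment itself, and the offset Tx = T − 1 are assumptions; only the swap of the two oblique directions and the form of equation (4) follow the text.

```python
# Assumed edge numbers: 0 = horizontal, 1 = vertical,
# 2 = obliquely upper right, 3 = obliquely upper left.
def convert_plus_90(q, y, x, T):
    """Map a feature pixel (q, y, x) of the default weighting table to the position
    (Q, Y, X) on the edge feature images of a face rotated by +90 degrees.
    The swap of the two oblique directions and the form of equation (4) follow the
    text; the horizontal/vertical swap and the offset Tx = T - 1 are assumptions."""
    Q = {0: 1, 1: 0, 2: 3, 3: 2}[q]
    Tx = T - 1                     # assumed offset for 0-indexed coordinates
    return Q, x, Tx - y            # Y = x, X = Tx - y  (equation (4))

def convert_horizontal_flip(q, y, x, T):
    """Assumed mapping for a horizontally flipped face: the row is kept, the column
    is mirrored, and the two oblique edge directions are swapped."""
    Q = {0: 0, 1: 1, 2: 3, 3: 2}[q]
    return Q, y, (T - 1) - x
```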
(1-2) In the Case of −45°, +45°, +135°, and −135° Rotation Angles
An image 71 of
In the upper portion of
In the four-direction edge feature images corresponding to the face image which is rotated by +45°, the feature points a to f assigned in the weighting table emerge as shown in the lower portion of
That is, the feature point e corresponding to the obliquely upper right edge direction assigned in the weighting table emerges in the horizontal edge feature image in the edge feature images corresponding to the face image which is rotated by +45°. The feature point f corresponding to the obliquely upper left edge direction assigned in the weighting table emerges in the vertical edge feature image in the edge feature images corresponding to the face image which is rotated by +45°.
Assuming that x and y are an xy coordinate of the feature point assigned in the weighting table while X and Y are an xy coordinate of the feature point in the edge feature image corresponding to the face image which is rotated by +45°, the relationship between the xy coordinates becomes the relationship between the point P and the point P1 of
X=(Ty+x−y)/√2
Y=(x+y)/√2 (5)
As shown in
A relationship shown in Table 2 holds between the position (q,y,x) of the feature point assigned in the weighting table and the position (Q,Y,X) of the corresponding feature point on the face image (edge feature image) which is rotated by +45°. Similarly a relationship shown in Table 2 holds between the position (q,y,x) of the feature point assigned in the weighting table and the position (Q,Y,X) of the corresponding feature point on the face image (edge feature image) which is rotated by −45°, +135° or −135°.
The relationship of the point P and the point P2 shown in
Using the weighting table produced for the default rotational position, the face image in which the default face image is rotated by +45°, −45°, +135°, or −135° can be detected by utilizing the relationships in Table 2.
Specifically, for example, in the case where the face image which is rotated by +45° is detected, when the feature pixel F(n) is selected in Step S33 of
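A corresponding sketch for the +45° case, following equation (5), is given below; the direction mapping for the horizontal and vertical edges, the offset Ty = T − 1, and the rounding to integer coordinates are assumptions (Table 2 is not reproduced here).

```python
import math

# Assumed edge numbers: 0 = horizontal, 1 = vertical,
# 2 = obliquely upper right, 3 = obliquely upper left.
def convert_plus_45(q, y, x, T):
    """Map a feature pixel (q, y, x) of the default weighting table to the position
    (Q, Y, X) on the edge feature images of a face rotated by +45 degrees.
    The mapping of the oblique directions and the form of equation (5) follow the
    text; the horizontal/vertical mapping, the offset Ty = T - 1, and the rounding
    to integer coordinates are assumptions."""
    Q = {0: 3, 1: 2, 2: 0, 3: 1}[q]
    Ty = T - 1
    X = (Ty + x - y) / math.sqrt(2.0)
    Y = (x + y) / math.sqrt(2.0)
    return Q, int(round(Y)), int(round(X))
```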
A fourth embodiment is an improvement of the second embodiment described with reference to
The fourth embodiment differs from the second embodiment in contents of the face detecting process of Step S54 in Steps S51 to S56 of
As described in the second embodiment, in the reduced-image generating process of Step S52, the reduced image 33 is generated from the input image 30 using a reduction ratio RM=R^3, that is, the reduction ratio R of the first embodiment applied three times, as shown in
The face detecting process performed in Step S54 will be described below. In
In
Tp1=R^3×T1≈0.5·T1
Tp2=R^3×T2≈0.5·T2
Tp3=R^3×T3≈0.5·T3 (6)
When the Tp1, Tp2, and Tp3 are set in the above-described manner, the face size which can be detected from the hierarchical image p+1 using the determination region 51 is equalized to the face size which can be detected from the hierarchical image p using the determination region 54. Similarly the face size which can be detected from the hierarchical image p+1 using the determination region 52 is equalized to the face size which can be detected from the hierarchical image p using the determination region 55. Similarly the face size which can be detected from the hierarchical image p+1 using the determination region 53 is equalized to the face size which can be detected from the hierarchical image p using the determination region 56.
The six kinds of the weighting tables are previously produced according to the six kinds of the determination regions 51 to 56 and stored in the memory.
In Step S54 (see
The fourth embodiment differs from the second embodiment in that the rough detection is performed as the pre-processing using the hierarchical image p.
In Steps S61 to S65 of
In Step S91, a roughly-detecting process is performed prior to Step S61. In Step S91, the face detecting process is performed to the determination region 54 having the Tp1 by Tp1 matrix in the hierarchical image p using the predetermined number of feature pixels Na. The procedure of the face detecting process in Step S91 is shown in
In Step S92, the roughly-detecting process is performed prior to Step S71. In Step S92, the face detecting process is performed to the determination region 55 having the Tp2 by Tp2 matrix in the hierarchical image p using the predetermined number of feature pixels Nb. The procedure of the face detecting process in Step S92 is shown in
In Step S93, the roughly-detecting process is performed prior to Step S81. In Step S93, the face detecting process is performed to the determination region 56 having the Tp3 by Tp3 matrix in the hierarchical image p using the predetermined number of feature pixels Nc. The procedure of the face detecting process in Step S93 is shown in
The face detecting process performed to the hierarchical image p is similar to that of the second embodiment although the fourth embodiment differs from the second embodiment in the size of the determination region.
According to the fourth embodiment, in the case where the face detecting process is performed to the hierarchical image p+1 having the larger size, the rough detection is performed as the pre-processing using the lower-hierarchical image p having the number of pixels smaller than that of the hierarchical image p+1. Therefore, in the case where the face is not detected in the rough detection, the processing speed is enhanced because the process performed to the hierarchical image p+1 can be neglected.
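The control flow of this rough-detection pre-processing might be sketched as follows, reusing the score_region and face_in_region helpers sketched earlier; the stage parameters are placeholders.

```python
def detect_with_rough_prestep(edge_images_p, edge_images_p1, origin_p, origin_p1,
                              rough_stage, full_stages, rough_table, full_table):
    """Rough detection on the lower hierarchical image p using a small number of
    feature pixels (e.g. Na); the usual determination steps on the larger image p+1
    run only when the rough detection finds a face."""
    rough_pixels, rough_threshold = rough_stage
    if score_region(edge_images_p, origin_p, rough_pixels, rough_table) <= rough_threshold:
        return False   # no face in the rough detection: skip the processing of image p+1
    return face_in_region(edge_images_p1, origin_p1, full_stages, full_table)
```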
A fifth embodiment is an improvement of the second embodiment described with reference to
The fifth embodiment differs from the second embodiment in contents of the face detecting process of Step S54 in Steps S51 to S56 of
As described in the second embodiment, in the reduced-image generating process of Step S52, the reduced image 33 is generated from the input image 30 using a reduction ratio RM=R^3, that is, the reduction ratio R of the first embodiment applied three times, as shown in
In
The face detecting process performed in Step S54 (see
In the fifth embodiment, as with the second embodiment, the three kinds of the weighting tables are also stored in the memory according to the size of the determination regions 51, 52, and 53. Additionally, in the fifth embodiment, a common weighting table used for the rough detection is previously produced and held. As shown in
In Step S54 (see
In Steps S61 to S65 of
In the face detecting process, the rough detection is performed to the determination region 57 having the Tc by Tc matrix in the hierarchical image using the common weighting table (Step S101). The feature pixels used in Step S101 are previously obtained. When the face is not detected in the rough detection, it is determined that the face does not exist in the determination region, and the usual determination process is neglected for the determination region. The processes (processes from Step S61, processes from Step S71, and processes from Step S81) similar to those of the second embodiment are performed only in the case where the face is detected in the rough detection.
According to the fifth embodiment, in the case where the face detecting process is performed to the hierarchical image, the rough detection is performed as the pre-processing using the common weighting table. Therefore, in the case where the face is not detected in the rough detection, the processing speed is enhanced because the usual determination process can be neglected.
A sixth embodiment is an improvement of the second embodiment described with reference to
The sixth embodiment differs from the second embodiment in contents of the face detecting process of Step S54 in Steps S51 to S56 of
As described in the second embodiment, in the reduced-image generating process of Step S52, the reduced image 33 is generated from the input image 30 using a reduction ratio RM=R^3, that is, the reduction ratio R of the first embodiment applied three times, as shown in
The face detecting process performed in Step S54 (see
In
Assuming that Tpc by Tpc is the size of the determination region 58, Tpc is set to the size expressed by the following equation (7).
Tpc=R^3×T3≈0.5·T3 (7)
In the sixth embodiment, as with the second embodiment, the three kinds of the weighting tables are also stored in the memory according to the size of the determination regions 51, 52, and 53. Additionally, in the sixth embodiment, a common weighting table used for the rough detection corresponding to the determination region 58 on the hierarchical image p is previously produced and held. The common weighting table is produced as described in the fifth embodiment. Accordingly, in the case where the face detection is performed using the common weighting table, it can roughly be detected whether or not one of the three face images having the different sizes exists.
In Step S54 (see
In Steps S61 to S65 of
In the face detecting process, the rough detection is performed to the determination region 58 having the Tpc by Tpc matrix in the hierarchical image p using the common weighting table (Step S102). The feature pixels used in Step S102 are previously obtained. When the face is not detected in the rough detection, it is determined that the face does not exist in the determination region, and the usual determination process is neglected for the determination region. The processes (processes from Step S61, processes from Step S71, and processes from Step S81) similar to those of the second embodiment are performed only in the case where the face is detected in the rough detection.
As with the second embodiment, the roughly-detecting process is performed to the hierarchical image p. According to the sixth embodiment, in the case where the face detecting process is performed to the hierarchical image p+1 having the larger size, the rough detection is performed as the pre-processing to the lower-hierarchical image p having the number of pixels smaller than that of the hierarchical image p+1 using the common weighting table. Therefore, in the case where the face is not detected in the rough detection, the processing speed is enhanced because the usual determination process can be neglected for the hierarchical image p+1.
A seventh embodiment is an improvement of the second embodiment described with reference to
The seventh embodiment differs from the second embodiment in contents of the face detecting process of Step S54 in Steps S51 to S56 of
As described in the second embodiment, in the reduced-image generating process of Step S52, the reduced image 33 is generated from the input image 30 using a reduction ratio RM=R^3, that is, the reduction ratio R of the first embodiment applied three times, as shown in
In
In
Assuming that Tpc by Tpc is the size of the determination region 58, Tpc is set to the size expressed by the following equation (8).
Tpc=R^3×T3≈0.5·T3 (8)
In Step S53 (see
The detection process performed in Step S54 (see
In the seventh embodiment, as with the second embodiment, the three kinds of the weighting tables are also stored in the memory according to the size of the determination regions 51, 52, and 53. Additionally, in the seventh embodiment, not only a second common weighting table corresponding to the second roughly-detecting determination region 57 is previously produced and held, but also a first common weighting table corresponding to the first roughly detecting determination region 58 is previously produced and held. These common weighting tables are generated as described in the fifth embodiment.
In Step S54 (see
In Steps S61 to S65 of
In the face detecting process, the first rough detection is performed to the first roughly detecting determination region 58 having the Tpc by Tpc matrix in the hierarchical image p using the first roughly-detecting common weighting table (Step S201). The feature pixels used in Step S201 are previously obtained. When the face is not detected in the first rough detection, it is determined that the face does not exist in the determination region, and the usual determination process is neglected for the determination region.
When the face is detected in the first rough detection, the second rough detection is performed to the second roughly detecting determination region 57 having the Tc by Tc matrix in the hierarchical image p+1 using the second roughly-detecting common weighting table (Step S202). The feature pixels used in Step S202 are previously obtained. When the face is not detected in the second rough detection, it is determined that the face does not exist in the determination region, and the usual determination process is neglected for the determination region. The processes (processes from Step S61, processes from Step S71, and processes from Step S81) similar to those of the second embodiment are performed only in the case where the face is detected in the second rough detection.
As with the second embodiment, the face detecting process is performed to the hierarchical image p. According to the seventh embodiment, in the case where the face detecting process is performed to the hierarchical image p+1 having the larger size, the first rough detection is performed as the pre-processing to the lower-hierarchical image p having the number of pixels smaller than that of the hierarchical image p+1 using the common weighting table, and the second rough detection is performed to the hierarchical image p+1 using the second common weighting table. Therefore, in the case where the face is not detected in the rough detection, the processing speed is enhanced because the usual determination process can be neglected for the hierarchical image p+1.
In the above embodiments, for convenience of explanation, the face is detected using the weighting table (or coefficient table) for the frontal face.
In order to enhance the face detection accuracy, a first face detecting process performed using a weighting table (or coefficient table) for the frontal face, a second face detecting process performed using a weighting table (or coefficient table) for the profile face, and a third face detecting process performed using a weighting table (or coefficient table) for the oblique face are separately performed, and it is determined that the face exists when the face is detected in one of the face detecting processes.
As shown in
For convenience of explanation, it is assumed that the first face detecting process performed using the weighting table (or coefficient table) for the frontal face includes two determination stages (Step S301 and Step S302). The first-stage determination step (Step S301) differs from the second-stage determination step (Step S302) in the number of feature pixels used in the determination. That is, the number of feature pixels used in the second-stage determination step (Step S302) is larger than the number of feature pixels used in the first-stage determination step (Step S301).
Similarly, it is assumed that the second face detecting process performed using the weighting table (or coefficient table) for the profile face includes two determination stages (Step S401 and Step S402), and that the third face detecting process performed using the weighting table (or coefficient table) for the oblique face includes two determination stages (Step S501 and Step S502).
The first-stage determination step (Step S301) of the first face detecting process, the first-stage determination step (Step S401) of the second face detecting process, and the first-stage determination step (Step S501) of the third face detecting process are performed.
When the face is not detected in all Steps S301, S401, and S501, it is determined that the face does not exist. When the face is detected in one of Steps S301, S401, and S501, the flow goes to Step S600.
In Step S600, on the basis of the score S computed in whichever of Steps S301, S401, and S501 the face is detected, it is determined which process should be continued. That is, among the scores S computed in the first-stage determination steps in which the face is detected, the kind of face detecting process (first to third face detecting processes) corresponding to the determination step having the largest score S is specified. Then, the flow goes to the second-stage determination step of the specified face detecting process.
For example, in the case where the faces are detected in all of Steps S301, S401, and S501, when the score S computed in Step S301 is the largest among the scores S computed in Steps S301, S401, and S501, the flow goes to Step S302, which is the second-stage determination step of the first face detecting process. In this case, the second-stage determination steps are not performed in the second and third face detecting processes.
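The selection of Step S600 might be sketched as follows, reusing the score_region helper sketched earlier; the model names, stage parameters, and thresholds are placeholders.

```python
def detect_multi_model(edge_images, origin, models):
    """models maps a model name ('frontal', 'profile', 'oblique') to
    (stage1, stage2, table), each stage being (feature_pixels, threshold).
    Only the model with the largest passing first-stage score is continued."""
    passing = {}
    for name, (stage1, _stage2, table) in models.items():
        pixels, threshold = stage1
        score = score_region(edge_images, origin, pixels, table)
        if score > threshold:               # face detected in this first-stage step
            passing[name] = score
    if not passing:
        return None                         # no face detected in any first-stage step
    best = max(passing, key=passing.get)    # Step S600: pick the largest first-stage score
    _stage1, stage2, table = models[best]
    pixels2, threshold2 = stage2
    if score_region(edge_images, origin, pixels2, table) > threshold2:
        return best                         # face of the selected model is detected
    return None
```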
Foreign Application Priority Data
JP 2006-053304, filed Feb. 2006
JP 2006-354005, filed Dec. 2006

U.S. Patent Documents Cited
5,870,502 A, Bonneau et al., Feb. 1999
6,421,463 B1, Poggio et al., Jul. 2002
6,453,069 B1, Matsugu et al., Sep. 2002
6,711,279 B1, Hamza et al., Mar. 2004
2002/0102024 A1, Jones et al., Aug. 2002
2004/0228505 A1, Sugimoto, Nov. 2004
2005/0280809 A1, Hidai et al., Dec. 2005

Foreign Patent Documents Cited
JP 08-249466, Sep. 1996
JP 2000-134638, May 2000
JP 2002-304627, Oct. 2002
JP 2004-334836, Nov. 2004
JP 2005-025568, Jan. 2005
JP 2005-056124, Mar. 2005
JP 2005-157679, Jun. 2005
JP 2005-235089, Sep. 2005

Publication
US 2007/0201747 A1, Aug. 2007