This application relates to and claims priority from Japanese Patent Application No. 2008-194379, filed on Jul. 29, 2008, the entire disclosure of which is incorporated herein by reference.
The present invention generally relates to an image information processing method and apparatus, which detect a facial image region in an input image and execute a prescribed image processing operation.
A procedure designed to make it possible to automatically detect whether or not a person in a video has a specific intention (is suspicious) has been developed in the past. This procedure extracts information related to the eye-gaze direction of the above-mentioned person, extracts a plurality of gaze feature values indicating behavioral intent from this information (total time spent gazing in a specific direction; number of eye-gaze transitions; average gaze time; number of glimpses; eye-gaze variance value; average amount of eye-gaze movement; distance from a general eye-gaze pattern), and determines from these feature values whether or not the above-mentioned person is suspicious (for example, refer to Japanese Patent Application Laid-open No. 2007-6427).
A procedure designed to make it possible to accurately detect the eye-gaze direction from an imaged facial image has also been developed in the past. This procedure detects the shape of the iris in an image in order to detect the center location of the pupil; one example uses an elliptical Hough transform to find the elliptical contour of the iris. Generally speaking, this procedure is applied because measuring the center location of the pupil requires detecting the contour of the iris and estimating the center of the pupil from the center location of this iris (for example, refer to Japanese Patent Publication No. 3797253).
The technique disclosed in Japanese Patent Application Laid-open No. 2007-6427 measures the eye-gaze direction of the person by measuring the angle formed between the center location of the eyeball and the center location of the pupil, but a specific technical procedure for detecting the above-mentioned center location of the pupil is not disclosed.
The technique disclosed in Japanese Patent Publication No. 3797253 requires that the contour of the iris be detected, but since the actual iris contour is partially hidden by the eyelid, the overall shape of the iris (the shape of the contour) does not appear in the image. For this reason, it is impossible to detect the entire shape of the contour of the iris from the above-mentioned image.
For example, in a case where image processing such as the above-mentioned elliptical Hough transform is carried out to detect the contour of the iris in an image of a person, the entire contour of the iris does not appear in the image, giving rise to problems such as detecting only the part of the iris contour that does appear in the image, or mistakenly detecting the periphery of the white part of the eye. The result is reduced accuracy in detecting the center location of the pupil, which depends on detection of the iris contour, and the occurrence of major errors in measuring the eye-gaze direction of the person in the above-mentioned image.
Therefore, an object of the present invention is to make it possible to accurately measure the center location of the iris of a person in an image of this person that has been taken, and to realize a highly accurate eye-gaze direction measurement.
An image information processing apparatus according to a first aspect of the present invention comprises a facial image region detection unit that detects a facial image region in an input image; an eyelid region contour detection unit that detects a contour of an eyelid region in the facial image region detected by the above-mentioned facial image region detection unit; and an iris region detection unit that detects a shape of the iris region in the above-mentioned facial image region on the basis of the above-mentioned eyelid region contour detected by the above-mentioned eyelid region contour detection unit, and a location of a pupil related to the above-mentioned iris region is estimated on the basis of the shape of the iris region detected by the above-mentioned iris region detection unit.
In the preferred embodiment related to the first aspect of the present invention, the above-mentioned eyelid region contour detection unit creates an eyelid contour model on the basis of the size of the above-mentioned facial image region detected by the above-mentioned facial image region detection unit, and detects the contour of the eyelid region by detecting contour points of the eyelid region in the above-mentioned facial image region using this created eyelid contour model and by determining the location of the eyelid region.
In an embodiment that is different from the one mentioned above, the above-mentioned eyelid region contour detection unit determines a location, which is evaluated as the most likely eyelid region in the above-mentioned facial image region, to be the location of the above-mentioned eyelid region by superposing the above-mentioned eyelid contour model onto the above-mentioned facial image region.
In an embodiment that is different from those mentioned above, the above-mentioned evaluation is performed by extracting a feature value for each of the contour point locations in the above-mentioned eyelid contour model and calculating the likelihoods of these locations as contour points, and determining the sum of the calculated likelihoods.
In an embodiment that is different from those mentioned above, the above-mentioned eyelid region location determination by the above-mentioned eyelid region contour detection unit includes processing for searching, within respective prescribed ranges, for the optimum location of each contour point for demarcating the contour of the above-mentioned eyelid region, and for updating the locations of the above-mentioned respective contour points determined in the previous search to locations determined to be more appropriate.
Another embodiment that is different from those mentioned above further comprises a face direction measurement unit that measures, from the facial image region detected by the above-mentioned facial image region detection unit, a value indicating the direction in which the face of a person imaged in the above-mentioned facial image region is oriented.
In another embodiment that is different from those mentioned above, the above-mentioned eyelid region contour detection unit creates an eyelid contour model on the basis of the size of the above-mentioned facial image region detected by the above-mentioned facial image region detection unit and the value indicating the direction in which the above-mentioned face is oriented as measured by the above-mentioned face direction measurement unit, and detects the contour of the eyelid region by detecting contour points of the eyelid region in the above-mentioned facial image region using this created eyelid contour model and by determining the location of the eyelid region.
In another embodiment that is different from those mentioned above, the above-mentioned iris region detection unit estimates the shape of an eyeball region in the above-mentioned facial image region on the basis of the value indicating the direction in which the above-mentioned face is oriented as measured by the above-mentioned face direction measurement unit and a prescribed eyelid contour model, and searches for the iris region location on the basis of this estimated eyeball region shape and determines the center location of the iris region and the shape of the iris region.
In yet another embodiment that is different from those mentioned above, at least respective data of the center of the eyeball region, the radius of the eyeball region, and the radius of the iris region is used in the process for estimating the shape of the above-mentioned eyeball region.
An image information processing method according to a second aspect of the present invention comprises a first step of detecting a facial image region in an input image; a second step of detecting a contour of an eyelid region in the facial image region detected in the above-mentioned first step; a third step of detecting a shape of an iris region in the above-mentioned facial image region on the basis of the above-mentioned eyelid region contour detected in the above-mentioned second step; and a fourth step of estimating the location of a pupil related to the above-mentioned iris region on the basis of the shape of the iris region detected in the above-mentioned third step.
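By way of illustration only, the four steps of the second aspect may be chained as in the following sketch. The function names and placeholder bodies are the editor's assumptions and do not come from the disclosure; concrete detection procedures are described in the embodiments below.

```python
import numpy as np

# Hypothetical stand-ins for the four claimed steps.
def detect_face_region(image):
    return image                                   # placeholder: whole frame as the face region

def detect_eyelid_contour(face_region):
    return np.zeros((8, 2))                        # placeholder: eight contour points

def detect_iris_shape(face_region, eyelid_contour):
    return {"center": (0.0, 0.0), "radius": 1.0}   # placeholder iris ellipse

def estimate_pupil(iris_shape):
    return iris_shape["center"]                    # pupil estimated from the iris shape

def process(image):
    """The four steps of the second aspect, chained as a pipeline."""
    face_region = detect_face_region(image)        # first step
    contour = detect_eyelid_contour(face_region)   # second step
    iris = detect_iris_shape(face_region, contour) # third step
    return estimate_pupil(iris)                    # fourth step
```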
The embodiments of the present invention will be explained in detail below in accordance with the drawings.
The above-mentioned image information processing system, as shown in
In
The image input unit 1, under the control of the CPU 5, inputs the image information output from the above-mentioned imaging device 100, and, in addition, outputs this input image information to the image memory 3 via the bus line 15.
The image memory 3, under the control of the CPU 5, stores in a prescribed storage area the above-mentioned image information output via the bus line 15 from the image input unit 1. The image memory 3 also outputs the stored image information to the CPU 5 via the bus line 15 in accordance with an information read-out request from the CPU 5.
In the RAM 7, for example, there is provided a storage area needed for the CPU 5 to deploy the above-mentioned image information when the CPU 5 executes a prescribed arithmetic processing operation for the above-mentioned image information.
The ROM 9 is equipped with a control program required for the CPU 5 to control and manage the operations of the respective parts comprising the eye-gaze direction measurement device 200, and stores nonvolatile fixed data. The ROM 9 also outputs the above-mentioned stored nonvolatile fixed data to the CPU 5 via the bus line 15 in accordance with a data read-out request from the CPU 5.
The measurement result recording unit 11 records various types of data obtained in accordance with the CPU 5 carrying out a prescribed arithmetic processing operation with respect to the above-mentioned image information, for example, measurement data related to a person's eye-gaze direction included in the above-mentioned image information. The measurement result recording unit 11 outputs the above-mentioned recorded measurement data to the output device 300 via the bus line 15 and output I/F 13 in accordance with a data read-out request from the CPU 5. The above-mentioned arithmetic processing operation by the CPU 5 and the measurement data recorded in the measurement result recording unit 11 will be described in detail hereinbelow.
The output I/F 13, under the control of the CPU 5, connects to the output device 300, and outputs to the output device 300 the above-mentioned measurement data output via the bus line 15 from the measurement result recording unit 11.
For example, a display (monitor), a printer, or a PC (refers to a personal computer, both here and below) is utilized as the output device 300. In a case where the output device 300 is a monitor, image information captured by the imaging device 100 is displayed and output as visible image information. In a case where the output device 300 is a printer, a hard copy related to the above-mentioned measurement data output from the measurement result recording unit 11 via the output I/F 13 is output. And in a case where the output device 300 is a PC, the above-mentioned measurement data is output in a mode that is recognizable to the user.
The CPU 5, as shown in
The face detection unit 21 detects the facial image of a person included in the input image information by carrying out prescribed image processing for the input image information. In a case where there is a plurality of facial images in the above-mentioned input image information, the face detection unit 21 executes the detection operation for each of these facial images. The above-mentioned person facial image information detected by the face detection unit 21 is respectively output to the eyelid contour detection unit 23 and the face direction measurement unit 27.
The eyelid contour detection unit 23 detects the contours of the eyelid regions of both the right and left eyes in the region of the facial image output by the face detection unit 21 by carrying out prescribed image processing. This detection result is output to the iris detection unit 25 from the eyelid contour detection unit 23. Furthermore, in a case where there is a plurality of facial images detected by the face detection unit 21, the above-mentioned processing operation is executed in the eyelid contour detection unit 23 for each of the above-mentioned facial images.
The face direction measurement unit 27 inputs the above-mentioned facial image information output from the face detection unit 21, and carries out measurement processing on the face direction (orientation of the face) of this facial image. The data obtained in accordance with this measurement is output to the eye-gaze direction calculation unit 29 from the face direction measurement unit 27.
The iris detection unit 25 detects the shape of the iris region based solely on the information related to the iris region and the sclera region, which are the parts on the inner side of the eyelid region (the parts of the imaged region enclosed by this eyelid region), by using the contour information of the eyelid region output from the eyelid contour detection unit 23 to execute a prescribed processing operation. Furthermore, in a case where there is a plurality of facial images detected by the face detection unit 21, the above-mentioned processing operation in the iris detection unit 25 is executed for each of the above-mentioned respective facial images. The iris region shape data obtained by the iris detection unit 25 is output to the eye-gaze direction calculation unit 29 from the iris detection unit 25.
The eye-gaze direction calculation unit 29 calculates the eye-gaze direction in the above-mentioned person's facial image by carrying out prescribed arithmetic processing on the basis of the above-mentioned iris region shape data output from the iris detection unit 25 and face direction measurement data output from the face direction measurement unit 27. According to the above-mentioned configuration, it is possible to measure the eye-gaze direction of the person's facial image having this iris region on the basis of highly accurate iris region shape information obtained in the iris shape detection unit 19. The iris region part and the sclera region part may also be estimated based on the detected eyelid region contour information, making it possible to accurately estimate the shape of the iris region.
As shown in
Line 49 shows a relationship between contour point 35 and contour point 47, line 51 shows a relationship between contour point 35 and contour point 45, and line 53 shows a relationship between contour point 37 and contour point 47. Further, line 55 shows a relationship between contour point 37 and contour point 45, line 57 shows a relationship between contour point 37 and contour point 43, line 59 shows a relationship between contour point 39 and contour point 45, and line 61 shows a relationship between contour point 39 and contour point 43.
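By way of illustration, the model may be represented as a set of contour points plus the pairwise relationships enumerated above. In the following sketch the eight contour points are assumed to be the odd numerals 33 through 47, the coordinates are illustrative placeholders, and only the pair list is taken from the description.

```python
# A sketch of the eyelid contour model 31: contour points plus the pairwise
# relationships shown as lines 49 through 61. Coordinates are placeholders.
eyelid_contour_model = {
    "points": {  # reference numeral -> (x, y) in model coordinates (assumed)
        33: (0.0, 0.0), 35: (1.0, -0.4), 37: (2.0, -0.6), 39: (3.0, -0.4),
        41: (4.0, 0.0), 43: (3.0, 0.3), 45: (2.0, 0.4), 47: (1.0, 0.3),
    },
    "pairs": [  # lines 49, 51, 53, 55, 57, 59, 61, in that order
        (35, 47), (35, 45), (37, 47), (37, 45), (37, 43), (39, 45), (39, 43),
    ],
}
```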
In
In
Next, a processing operation for determining the location of the eyelid regions in the respective images of the left and right eyes is executed. In this processing operation, the eyelid contour detection unit 23 superposes the eyelid contour model 31 onto the peripheral region of the eye region in the above-mentioned input facial image as described hereinabove, evaluates the so-called likelihood of an eyelid region, and determines as the eyelid region the location with the highest evaluation, that is, the location determined to be the most likely eyelid region in the peripheral region of this eye region. This evaluation, for example, is carried out by extracting the various feature values for each contour point location in the above-described eyelid contour model 31, calculating the likelihood of each location as a contour point, and determining the sum of the calculated likelihoods. For example, an edge value, a Gabor filter value, template matching similarities and the like are used as the feature values for each of the above-mentioned contour point locations (Step S73).
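A minimal sketch of this evaluation follows. The gradient magnitude stands in for the edge value; a fuller implementation would also combine Gabor filter values and template-matching similarities, and all names here are the editor's assumptions.

```python
import numpy as np

def evaluate_placement(gray, model_points, offset):
    """Step S73 evaluation: the sum, over all contour points of the superposed
    eyelid contour model, of each location's likelihood as a contour point.
    The gradient magnitude serves as a crude stand-in for the edge value."""
    gy, gx = np.gradient(gray.astype(float))
    edge = np.hypot(gx, gy)
    h, w = gray.shape
    ox, oy = offset
    score = 0.0
    for x, y in model_points:
        px, py = int(round(x + ox)), int(round(y + oy))
        if 0 <= px < w and 0 <= py < h:
            score += edge[py, px]   # likelihood of this location as a contour point
    return score

def locate_eyelid_region(gray, model_points, candidate_offsets):
    """Determine as the eyelid region the superposition whose evaluation is highest."""
    return max(candidate_offsets,
               key=lambda off: evaluate_placement(gray, model_points, off))
```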
Next, the eyelid contour detection unit 23 executes processing for searching for the optimum locations of the respective contour points in the above-mentioned eyelid contour model 31. The eyelid region contour shape will differ for each person, and will also differ for the same person in accordance with changes in facial expression, making it necessary to search for the precise location of each of the respective contour points. In this processing operation, the updating of the locations of the respective contour points is performed by searching only for the optimum location within a specified range for each contour point (Step S74).
When the processing operation of the above-described Step S74 ends, the eyelid contour detection unit 23 checks to determine whether or not to carry out the processing operation in Step S74 once again, that is, whether or not to update the locations of the above-mentioned respective contour points (Step S75). As a result of this check, if it is determined that updating should be carried out once again (Step S75: YES), the eyelid contour detection unit 23 returns to Step S74. Conversely, if it is determined as a result of this check that there is no need to carry out updating again (Step S75: NO), the eyelid contour detection unit 23 determines that the above-mentioned respective contour point locations updated in Step S74 are the final locations. That is, the locations of the above-mentioned respective contour points updated in Step S74 constitute the detection results of the locations of the respective contour points required to configure the contour of the eyelid region in the above-mentioned eye region (Step S76).
Furthermore, when the determination in Step S75 is not to update again (Step S75: NO), it is either a case in which the above-mentioned contour point locations were not updated even once in Step S74, or a case in which the number of processes in Step S74 exceeded a predetermined number of times.
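The search and update of Steps S74 through S76, including both stopping conditions, may be sketched as follows; the per-pixel likelihood map, the search radius, and the iteration limit are illustrative assumptions.

```python
def refine_contour_points(edge, points, search_radius=2, max_iters=10):
    """Steps S74-S76 as a local search: each contour point moves to the
    best-scoring location within its prescribed range, and the pass repeats
    until no point moves or a predetermined iteration count is reached.
    `edge` is a per-pixel likelihood map (e.g., gradient magnitude), and the
    initial points are assumed to lie inside the image."""
    h, w = edge.shape
    pts = [(int(x), int(y)) for x, y in points]
    for _ in range(max_iters):                  # predetermined number of times
        updated = False
        for i, (x, y) in enumerate(pts):
            best, best_score = (x, y), edge[y, x]
            for dx in range(-search_radius, search_radius + 1):
                for dy in range(-search_radius, search_radius + 1):
                    nx, ny = x + dx, y + dy
                    if 0 <= nx < w and 0 <= ny < h and edge[ny, nx] > best_score:
                        best, best_score = (nx, ny), edge[ny, nx]
            if best != (x, y):
                pts[i], updated = best, True
        if not updated:   # Step S75: NO -- the locations are final (Step S76)
            break
    return pts
```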
In accordance with the above-described processing flow, detection of the contour of the eyelid region becomes possible even when the shape of the eyelid region contour has changed significantly, whether because the person whose facial image is input has changed or because the same person has significantly changed his facial expression. That is, it is possible to accurately detect the contour of the eyelid region in the region of the eye by splitting the processing into an operation for determining the location of the eyelid region within the eye region, shown in Step S73, and an operation for updating the locations of the respective contour points in the eyelid region, shown in Step S74. It is also possible to determine the locations of the above-mentioned respective contour points without greatly departing from a shape obtainable as an eyelid contour, by repeating the processing of Steps S74 and S75 that gradually updates the locations of all the contour points demarcating the eyelid region.
In
Edge image processing is carried out solely for the above-mentioned eye region 81, in which only the iris region and sclera region have been detected. Consequently, it is possible to obtain an edge image in which only the edge of the border part between the iris region and the sclera region is revealed, without revealing edges for noise outside the above-mentioned eye region 81, such as the contour line part of the eyelid region. Carrying out an elliptical Hough transform for the above-mentioned edge image makes it possible to accurately estimate the shape of the iris region from the border part between the iris region and the sclera region.
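One way to realize this masked edge detection and elliptical Hough transform, sketched here with scikit-image (the parameter values are illustrative assumptions, not values from the disclosure):

```python
import numpy as np
from skimage.feature import canny
from skimage.transform import hough_ellipse

def estimate_iris_ellipse(gray_eye, inside_eyelid):
    """Edge detection restricted to the inside of the detected eyelid contour,
    followed by an elliptical Hough transform. `inside_eyelid` is a boolean
    mask that is True inside the eyelid contour."""
    edges = canny(gray_eye, sigma=2.0)
    edges &= inside_eyelid                # keep only iris/sclera border edges
    result = hough_ellipse(edges, accuracy=10, threshold=20,
                           min_size=5, max_size=40)
    if len(result) == 0:
        return None
    result.sort(order='accumulator')      # strongest vote last
    best = list(result[-1])
    yc, xc, a, b, orientation = best[1:6]
    return (xc, yc), (a, b), orientation  # center, semi-axes, tilt
```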
For example, an iris region like that shown in
In a case where the contour line of the eyelid contour model 31 is not removed from the above-mentioned eye region 81, an iris region like that shown in
Furthermore, in the face direction measurement unit 27 (shown in
The method for calculating the eye-gaze direction of the facial image will be explained in detail below.
If it is assumed here that the radius of the eyeball region, which is the target of the eye-gaze direction calculation, is r, that the center location of this eyeball region is O, and that the center location of the pupil in this eyeball region is I, the eye-gaze direction of the facial image is calculated in accordance with (Numerical Expression 1) below.
In (Numerical Expression 1), φeye denotes the horizontal direction component of the eye-gaze direction, and θeye denotes the vertical direction component of the eye-gaze direction. Further, Ix denotes the x coordinate of the pupil center location I, Iy denotes the y coordinate of the pupil center location I, Ox denotes the x coordinate of the eyeball region center location O, and Oy denotes the y coordinate of the eyeball region center location O. The pupil center location I here may be estimated from the iris region center location C and the eyeball center location O.
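The expression itself is not reproduced in this text. A plausible reconstruction from the definitions above, assuming a spherical eyeball region of radius r viewed under orthographic projection (both assumptions are the editor's), is:

```latex
% Plausible reconstruction of (Numerical Expression 1), not the original:
\phi_{eye} = \sin^{-1}\!\left(\frac{I_x - O_x}{r}\right), \qquad
\theta_{eye} = \sin^{-1}\!\left(\frac{I_y - O_y}{r}\right)
```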
Employing the above calculation method removes the effects of a portion of the iris region being concealed inside the contour of the eyelid region, or of a shadow caused by eyeglasses or illumination appearing in the perimeter of the eye region, making it possible to accurately detect the shape of the iris region and thereby enabling the eye-gaze direction of the facial image to be accurately measured.
The facial image depicted in
The search ranges for the respective contour points (33 through 47) are set as described above for the following reason. A contour point located on the side toward which the face is oriented forms a large angle with the center of the face, and its location changes little in accordance with the rotation of the face, so a narrow search range is set for it. In contrast, since the location of a contour point located opposite the direction in which the face is oriented will change greatly in accordance with a change in the direction in which the face is oriented, a wide search range must be set.
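A minimal sketch of one rule embodying this asymmetry follows; the sign convention and the range values are the editor's assumptions.

```python
def search_radius_for_point(point_x, face_center_x, face_yaw,
                            narrow=1, wide=4):
    """Hypothetical search-range rule: a contour point on the side toward
    which the face is oriented (positive `face_yaw` taken as rightward)
    changes little under rotation and gets a narrow range; a point on the
    opposite side gets a wide range."""
    facing_right = face_yaw > 0
    point_on_right = point_x > face_center_x
    return narrow if facing_right == point_on_right else wide
```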
In this embodiment, the above-mentioned eyelid contour detection unit (23) acquires a value denoting the direction in which the face is oriented in the input facial image from the face direction measurement unit (27) (shown together with this eyelid contour detection unit 23 in
Next, the above-mentioned eyelid contour detection unit (23) creates an eyelid contour model from the information related to the size of the above-mentioned input facial image and the value, acquired in Step S111, indicating the direction in which the above-mentioned face is oriented. That is, the eyelid contour detection unit (23) carries out a process that rotates a standard eyelid contour model 31 in conformance with the value indicating the direction in which the above-mentioned face is oriented, and thereby creates an eyelid contour model that approximates the contour shape of the eyelid region in the input facial image (Step S112).
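A sketch of Step S112 under a simplifying assumption: the model is treated as lying on a plane and foreshortened by the cosines of the measured yaw and pitch angles. This is the editor's approximation, not the literal procedure of the disclosure.

```python
import numpy as np

def create_rotated_eyelid_model(standard_points, face_size, yaw, pitch):
    """Scale a standard eyelid contour model to the detected face size and
    rotate it to match the measured face direction (angles in radians)."""
    pts = np.asarray(standard_points, dtype=float) * face_size
    pts[:, 0] *= np.cos(yaw)      # horizontal foreshortening from yaw
    pts[:, 1] *= np.cos(pitch)    # vertical foreshortening from pitch
    return pts
```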
When an eyelid contour model like that described above is created in Step S112, the same processing operations as those shown in
The mode for changing the respective contour points (33 through 47) corresponding to the face direction of the facial image shown in
In accordance with the processing flow described hereinabove, highly accurate detection of the respective contour points for demarcating the contour of the eyelid region in the facial image becomes possible, even when the contour shape of the eyelid region has changed significantly as a result of the direction in which the face is oriented in the input facial image having changed, since it is possible to carry out detection processing by superposing an eyelid contour model that simulates the changed state of the facial image onto this changed-state facial image. Further, the processing operation depicted in Step S114 and the processing operation in Step S115 of
Determining the search range for the locations of the respective contour points in accordance with the above method, estimating the expected range of motion of the contour of the eyelid region in the input facial image, and limiting the search range to within this estimated range makes it possible to detect the contour points of the eyelid region with even higher accuracy. In addition, since the number of times that the search process is carried out is reduced, it is possible to reduce the time required for the above-mentioned search process.
In this embodiment, the center location O of the eyeball region in the above-mentioned imaged facial image, and the radius r of this eyeball region are estimated by using an eyeball model 121 like that shown in
Next, the coordinates (Ox, Oy) denoting this eyeball region center location O are calculated in accordance with the following (Numerical Expression 3) from the radius r of the eyeball region in the above-mentioned imaged facial image determined using (Numerical Expression 2).
Defining the above-mentioned eyeball model 121 in the manner described above makes it possible to estimate the eyeball region radius r in the above-mentioned imaged facial image and the center location O of this eyeball region from the direction in which the face is oriented in the input facial image and the contour shape of the eyelid region within the above-mentioned facial image.
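Neither (Numerical Expression 2) nor (Numerical Expression 3) is reproduced in this text. One plausible form, assuming the eyeball radius scales with the eyelid contour width w_eye by an empirical constant k, and that the eyeball center sits at depth r behind the midpoint (m_x, m_y) of the eyelid contour along the face direction (φ_face, θ_face), both assumptions being the editor's, is:

```latex
% Plausible reconstructions, not the originals:
r = k \, w_{eye}                            % (Numerical Expression 2)
O_x = m_x - r \sin\phi_{face}, \quad
O_y = m_y - r \sin\theta_{face}             % (Numerical Expression 3)
```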
An eyeball estimation model 123 like that shown in
In a case where the rotation of the eyeball places the center location of the iris region in the location denoted by the reference numeral 125₁, the estimated iris region corresponding to this center location 125₁ is region 129₁, and similarly, in a case where the rotation of the eyeball places the center location of the iris region in the location denoted by the reference numeral 125₂, the estimated iris region corresponding to this center location 125₂ is region 129₂. Also, in a case where the rotation of the eyeball places the center location of the iris region in the location denoted by the reference numeral 125₃, the estimated iris region corresponding to this center location 125₃ is region 129₃, and similarly, in a case where the rotation of the eyeball places the center location of the iris region in the location denoted by the reference numeral 125₄, the estimated iris region corresponding to this center location 125₄ is region 129₄. Region 133 is the same region as iris region 129 in
The optimum estimated iris region (any one of 129₁ through 129₄) is determined by comparing and contrasting each of the above-described estimated iris regions 129₁ through 129₄ with the facial image input to the eye-gaze direction operator (5) (mentioned in
Then, as a result of the above evaluation, the most plausible location of the above-mentioned estimated iris regions (129₁ through 129₄) in the eyeball estimation model 123 described above is determined to be the location of the iris region in the above-mentioned eyeball estimation model 123.
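This evaluation may be sketched as follows. Every pixel inside the estimated iris region and inside the sclera region contributes (the iris should be dark, the sclera bright); the scoring rule, the orthographic projection, and all names are the editor's assumptions. In use, one would score a grid of candidate (φ, θ) gaze angles and keep the best.

```python
import numpy as np

def project_iris_mask(shape, eye_center, r_eye, r_iris, phi, theta):
    """Rasterize the estimated iris region for one candidate gaze direction
    (phi, theta), projecting the iris center of the 3-D eyeball model onto
    the image plane (orthographic projection assumed)."""
    h, w = shape
    cx = eye_center[0] + r_eye * np.sin(phi)
    cy = eye_center[1] + r_eye * np.sin(theta)
    yy, xx = np.mgrid[0:h, 0:w]
    return (xx - cx) ** 2 + (yy - cy) ** 2 <= r_iris ** 2

def score_candidate(gray_eye, iris_mask):
    """Evaluate one estimated iris region against the eye image: unlike an
    edge-based Hough vote, all points within the iris and sclera regions
    contribute. `gray_eye` is assumed cropped to the eye region."""
    iris = gray_eye[iris_mask]
    if iris.size == 0:
        return float("-inf")
    return float(gray_eye[~iris_mask].mean() - iris.mean())

def best_gaze(gray_eye, eye_center, r_eye, r_iris, candidates):
    """Determine the most plausible estimated iris region over all candidate
    gaze directions."""
    return max(candidates,
               key=lambda ang: score_candidate(
                   gray_eye, project_iris_mask(gray_eye.shape, eye_center,
                                               r_eye, r_iris, *ang)))
```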
As mentioned hereinabove, changing the center location of the iris region is equivalent to the person imaged in the above-mentioned facial image changing his eye-gaze direction. For this reason, the eye-gaze direction of the person imaged in the above-mentioned facial image may be determined as-is from a change in the center location of the iris region resulting from the rotation of the eyeball about its center in the eyeball estimation model 123 acquired by the above-mentioned iris detection unit (25) (depicted in
In this embodiment, first, the above-mentioned iris detection unit (25) respectively acquires the value indicating the direction in which the face is oriented in the input facial image from the face direction measurement unit (27) (depicted in
Next, the above-mentioned iris detection unit (25) executes a processing operation that estimates the shape of the eyeball region in the above-mentioned input facial image based on (information related to) the above-mentioned eyelid contour model input in Step S141, and the above-mentioned value indicating the direction in which the face is oriented. In the eyeball region shape estimate executed here, for example, various items, such as the center of the eyeball region, the radius of the eyeball region, and the radius of the iris region are used (Step S142). Next, the location of the iris region in the above-mentioned facial image is searched for on the basis of the values (that is, the radius of the eyeball region and the radius of the iris region) denoting the shape of the eyeball region determined in Step S142 (Step S143). Then, the above-mentioned iris detection unit (25) executes a processing operation for determining the center location of the iris region obtained in Step S143 and the shape of this iris region (Step S144).
According to the processing flow described hereinabove, estimating the shape of the eyeball region in the input facial image in accordance with the direction in which the face is oriented in this facial image makes it possible to search for the location of the iris region in the above-mentioned facial image by predicting a change in the shape of the iris region that could occur in the future, thereby making it possible to detect with a high degree of accuracy the center location in this iris region. Also, since the iris region search range need not be expanded more than necessary in the above-mentioned facial image, it is also possible to speed up the detection rate of the iris region.
Executing the above-described processing operation makes it possible to estimate the iris region with a high degree of accuracy within a self-regulated range in accordance with the shape of the eyeball region. Further, in the process that uses the above-mentioned three-dimensional model (the eyeball estimation model 123), it is possible to use in the above-described evaluation not only the border portion between the iris region and the sclera region, but also all the points contained within the iris region and within the sclera region, so that, for example, it becomes possible to measure the iris region with a high degree of accuracy even in a case where the resolution of the eye (image) region is low. The problem with the elliptical Hough transform method is that, since the above-mentioned evaluation is performed using only the edge points (contour points), fewer such points are available in a case where the resolution of the eye (image) region is low, resulting in a higher likelihood of noise at the edges and degraded measurement accuracy of the iris region. Using the method related to the above-described embodiment makes it possible to solve the above problem.
The preferred embodiments of the present invention have been explained above, but these embodiments are examples for describing the present invention, and do not purport to limit the scope of the present invention solely to these embodiments. The present invention is capable of being put into practice in various other modes.
Number        Date      Country   Kind
2008-194379   Jul 2008  JP        national