The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the invention, and together with the general description given above and the detailed description of the embodiments given below, serve to explain the principles of the invention.
The preferred embodiments of the present invention will be described hereinafter with reference to the accompanying drawings.
A summary common to the embodiments will be described briefly. For example, as shown in
First, a first embodiment of the invention will be described.
The camera 101 is adapted to capture an image of the pedestrian M which includes at least his or her face and is installed in a first position to capture him or her from a first direction (from the front side). The camera 102 is adapted to capture an image of a wide field of view including the pedestrian M and is installed in a second position to capture him or her from a second direction (from above).
Hereinafter, each of the components will be explained.
The first camera 101 is adapted to capture an image including at least the face of the pedestrian M for the purpose of collecting full faces as images for pedestrian identification. The camera comprises a television camera using an imaging device, such as a CCD sensor. The first camera 101 is placed between the A point and the gate device 3 on one side of the walkway 1 as shown in
An image including the full face of the pedestrian M can be obtained by placing the first camera 101 in that way. The captured image is sent to the face region detector 103 as a digital light and shade image of, for example, 512×512 pixels.
The second camera 102 is adapted to capture an image of a large field of view including the pedestrian M for the purpose of capturing a person with a larger field of view than the first camera 101. As with the first camera, the second camera comprises a television camera using an imaging device, such as a CCD sensor. The second camera 102 is placed so as to look down from the ceiling so that the area from the A point to the gate device 3 on the walkway 1 is captured as shown in
The face region detector 103 detects the face region of the pedestrian M from the image captured by the first camera 101. The use of a method described in, for example, an article entitled “Face feature point extraction based on combination of shape extraction and pattern collation” by Fukui and Yamaguchi, vol. J80-D-II, No. 8, pp. 2170-2177, 1997 allows the face region to be detected with great accuracy. The detected face region information is sent to the entry candidate selection unit 111.
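By way of illustration only, the following minimal sketch shows how the face region detector 103 might be interfaced in Python; the Haar cascade bundled with OpenCV is used purely as a stand-in and does not reproduce the cited Fukui-Yamaguchi feature-point method.

```python
# Illustrative stand-in for the face region detector 103 (not the cited method).
import cv2

_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_face_regions(gray_image):
    """Return a list of (x, y, w, h) face regions found in a grayscale frame."""
    return list(_cascade.detectMultiScale(gray_image, scaleFactor=1.1, minNeighbors=5))
```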
The person region detector 104 detects a candidate region where a person (pedestrian M) is present from the image captured by the second camera 102. The person region is detected from the difference from a background image as in a technique described in, for example, an article entitled “Moving object detection technique using post confirmation” by Nakai, 94-CV90, pp. 1-8, 1994. The detected person region information is sent to the entry candidate selection unit 111.
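A minimal sketch of such background-difference person detection is given below for illustration; the threshold values and the connected-component post-processing are assumptions of this sketch, not details of the cited technique.

```python
# Illustrative sketch of person-region detection by difference with a background image.
import cv2

def detect_person_regions(gray_frame, gray_background, diff_threshold=30, min_area=500):
    """Return (x, y, w, h) candidate person regions; thresholds are illustrative."""
    diff = cv2.absdiff(gray_frame, gray_background)
    _, mask = cv2.threshold(diff, diff_threshold, 255, cv2.THRESH_BINARY)
    num, _, stats, _ = cv2.connectedComponentsWithStats(mask)
    regions = []
    for i in range(1, num):                      # label 0 is the background
        x, y, w, h, area = stats[i]
        if area >= min_area:                     # keep sufficiently large blobs
            regions.append((int(x), int(y), int(w), int(h)))
    return regions
```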
The face feature extraction unit 105 extracts feature information used at the time of entry or collation. For example, the face region information obtained from the face region detector 103 or the entry candidate selection unit 111 is cut out to a given size and shape with reference to face feature points, and the resulting light and shade information is used as the feature information. Here, the light and shade values of a region of m×n pixels are used as they are as the feature information, and the m×n-dimensional information is used as a feature vector. A partial space is calculated by determining the correlation matrix of the feature vectors from those data and determining normalized orthogonal vectors based on the known K-L expansion. That is, the method of calculating the partial space involves determining the correlation matrix (or covariance matrix) of the feature vectors and determining the normalized orthogonal vectors (characteristic vectors) by the K-L expansion of the correlation matrix. The partial space is represented by a set of k characteristic vectors corresponding to the characteristic values, selected in descending order of their magnitude. In this embodiment, the correlation matrix Cd is determined from the feature vectors, and the matrix Φd of characteristic vectors is determined by the diagonalization Cd = Φd Λd Φd^T. The partial space is utilized as face feature information for personal identification. At the time of entry, this information is simply entered in advance into the dictionary as dictionary information. As will be described later, the partial space itself may be used as face feature information for identification. The calculated face feature information is sent to the face collation dictionary unit 106 at the time of entry or to the face collation unit 107 at the time of collation.
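The partial-space calculation described above can be summarized in a few lines; the following sketch (with an assumed NumPy representation of the feature vectors) computes the correlation matrix Cd, diagonalizes it, and keeps the k leading characteristic vectors.

```python
import numpy as np

def compute_subspace(feature_vectors, k):
    """K-L expansion sketch: feature_vectors is (num_samples, m*n), each row a
    raster-scanned m x n light-and-shade pattern; returns an orthonormal basis
    (m*n, k) spanning the partial space."""
    X = np.asarray(feature_vectors, dtype=np.float64)
    Cd = X.T @ X / len(X)                    # correlation matrix Cd
    eigvals, eigvecs = np.linalg.eigh(Cd)    # Cd = Phi_d Lambda_d Phi_d^T
    order = np.argsort(eigvals)[::-1]        # characteristic values, descending
    return eigvecs[:, order[:k]]             # k leading characteristic vectors
```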
The face collation dictionary unit 106 is configured to hold, as dictionary information used for calculating a similarity to the pedestrian M, the face feature information obtained by the face feature extraction unit 105. The face feature information held in the dictionary is output to the face collation unit 107 as required.
The face collation unit 107 calculates a similarity between the face feature information of the pedestrian M extracted by the face feature extraction unit 105 and each face feature information (dictionary information) stored in the face collation dictionary unit 106. This face collation process can be implemented by using a mutual partial space method described in an article entitled “Face identification system using moving images” by Yamaguchi, Fukui, and Maeda, PRMU97-50, pp. 17-23, 1997-06. The result of face correlation (similarity) is sent to the authentication controller 112.
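A minimal sketch of the mutual partial space comparison is shown below; taking the squared cosine of the smallest canonical angle as the similarity is an assumption of this sketch, since the embodiment itself only refers to the cited article.

```python
import numpy as np

def mutual_subspace_similarity(basis_a, basis_b):
    """Similarity of two partial spaces, given as orthonormal bases of shape
    (dim, k) such as those returned by compute_subspace() above."""
    s = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)  # canonical correlations
    return float(s[0] ** 2)   # cos^2 of the smallest canonical angle, in [0, 1]
```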
The display unit 108 is installed in the vicinity of the gate device 3 as shown in
In the presence of two or more persons at the time of entry of dictionary information, the display unit 108 displays, as shown in
The operating unit 109 is adapted to enter selection information for a candidate for entry obtained through visual confirmation of the contents displayed on the display unit 108 at the time of entry and entry instruction information. The operating unit 109 comprises a touch panel integrated with the display unit 108.
The gate controller 110 sends a control signal to the gate device 3 shown in
The entry candidate selection unit 111 performs a process of selecting a person who becomes a candidate for entry at the time of entry of dictionary information. The concrete flow of the selection process will be described below with reference to a flowchart shown in
The entry candidate selection unit 111 initiates an entry candidate selection process upon receipt of entry instruction information from the authentication controller 112 (step S1). The selection unit 111 then detects the number of face regions detected by the face region detector 103 (step S2). According to the result of detection of the number of face regions, the selection unit 111 switches between a first process based on an image captured by the camera 101 and a second process based on images captured by the cameras 101 and 102. When the number of face regions is one, the selection unit selects the first process. When the number of face regions is more than one, the second process is selected.
Specifically, when the number of face regions is one, the entry candidate selection unit 111 outputs information of the detected face region (the image captured by the camera 101) as a candidate for entry to the face feature extraction unit 105 (step S3).
If, on the other hand, the number of face regions is more than one, the entry candidate selection unit 111 outputs display information to the authentication controller 112 for the purpose of visual confirmation (step S4). This display information includes person region information detected by the person region detector 104 (the image captured by the camera 102) and a message to request selection of a candidate for entry. The authentication controller 112, upon receipt of the display information from the entry candidate selection unit 111, sends display control information to the display unit 108. As a result, such an image as shown in
That is, as shown in
A person in charge (manager) visually confirms the displayed contents, then selects a candidate for entry from among the detected persons and enters entry candidate select information using the touch buttons 14. The entry candidate select information thus entered is sent to the entry candidate selection unit 111 via the authentication controller 112. According to the entry candidate select information, the entry candidate selection unit 111 selects face region information corresponding to the candidate for entry from among the items of face region information sent from the face region detector 103 and sends it to the face feature extraction unit 105 (step S6).
The face feature extraction unit 105 extracts face feature information from face region information sent from the entry candidate selection unit 111 and then enters it into the face collation dictionary unit 106 as dictionary information.
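The selection flow of steps S1 through S6 can be summarized as follows; the callback used for visual confirmation and the assumption that face regions and person regions are indexed consistently are illustrative simplifications.

```python
def select_entry_candidate(face_regions, person_regions, confirm_with_operator):
    """Sketch of the first-embodiment selection (steps S1-S6).

    confirm_with_operator: assumed callback that displays the wide-view image
    with person regions and returns the index chosen via the touch buttons 14.
    """
    if not face_regions:
        return None                                   # nothing to enter
    if len(face_regions) == 1:
        return face_regions[0]                        # first process (step S3)
    chosen = confirm_with_operator(person_regions)    # second process (steps S4-S5)
    return face_regions[chosen]                       # step S6
```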
The authentication controller 112, which controls the entire device, mainly carries out a dictionary information entry process and an authentication process (collation process). The dictionary information entry process will be described first. For example, suppose that the authentication controller initiates the entry process upon receipt of entry instruction information from the operating unit 109. Upon receipt of entry instruction information from the operating unit 109 or passage detection information from the gate controller 110 (the gate device 3), the authentication controller 112 outputs entry instruction information to the entry candidate selection unit 111.
The entry process may be initiated by receiving passage detection information from the gate controller 110 rather than by receiving entry instruction information from the operating unit 109. This allows a person who passes through the gate without authorization to be entered into the dictionary as well.
Next, the authentication process (collation process) will be described. For example, suppose that the authentication process is initiated when a face region is detected from an input image from the first camera 101 in a situation in which no entry instruction information has been received. When the face region detector 103 detects the face region of a person from an input image, the authentication controller 112 obtains the similarity of that person from the face collation unit 107. The similarity thus obtained is compared with a preset decision threshold. When the similarity is not less than the threshold, it is decided that the person has been entered in advance. If, on the other hand, the similarity is less than the threshold, it is decided that the person has not been entered. The result of the decision is displayed on the display unit 108, and passage control information based on this decision result is output to the gate controller 110.
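The collation decision reduces to a threshold comparison; in the sketch below the numeric threshold value is purely illustrative, since the embodiment only states that it is preset.

```python
def authenticate(similarities, threshold=0.85):
    """Sketch of the collation decision in the authentication controller 112.

    similarities: mapping from dictionary entry to the similarity reported by
    the face collation unit 107. Returns the best matching entry and its
    similarity when the threshold is cleared, or None otherwise.
    """
    if not similarities:
        return None
    best = max(similarities, key=similarities.get)
    return (best, similarities[best]) if similarities[best] >= threshold else None
```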
According to the first embodiment, as described above, when two or more persons are present at the time of entry of dictionary information, a candidate for entry is selected through visual observation by a person in charge, allowing only appropriate persons to be entered into the dictionary unit.
A second embodiment of the invention will be described next.
The configuration of an entrance/exit management system to which a face image read apparatus according to the second embodiment is applied remains basically unchanged from that of the first embodiment (
The entry candidate selection unit 111 carries out a process of selecting a person who becomes a candidate for entry at the time of entry of dictionary information. The concrete flow of the process will be described with reference to a flowchart illustrated in
The entry candidate selection unit 111 initiates the entry candidate selection process upon receipt of entry instruction information from the authentication controller 112 (step S11). The selection unit 111 then detects the number of face regions detected by the face region detector 103 (step S12). According to the result of detection of the number of face regions, the selection unit 111 switches between a first process based on an image captured by the camera 101 and a second process based on images captured by the cameras 101 and 102. When the number of face regions is one, the selection unit selects the first process. When the number of face regions is more than one, the second process is selected.
Specifically, when the number of face regions is one, the entry candidate selection unit 111 outputs the detected face region information (the image captured by the camera 101) as a candidate for entry to the face feature extraction unit 105 (step S13).
If, on the other hand, the number of face regions is more than one, the entry candidate selection unit 111 calculates the distance (Dd) in the direction of depth of the walkway 1 between the person regions, using person region information (obtained from the image captured by the camera 102) supplied by the person region detector 104, and then determines whether the calculated distance Dd is less than a preset threshold (Th1) (step S14). The distance Dd in the direction of depth of the walkway between person regions is defined as shown in
where Ddij=|Ddi−Ddj| and n is the number of persons detected.
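For illustration, the depth-direction distance Dd can be computed as below; aggregating the pairwise differences Ddij by their minimum is an assumption of this sketch, chosen so that a small Dd indicates persons standing close together along the walkway.

```python
def depth_distance(depth_positions):
    """Distance Dd along the walkway depth; depth_positions holds the depth
    coordinate Ddi of each detected person region (requires two or more)."""
    n = len(depth_positions)
    return min(abs(depth_positions[i] - depth_positions[j])
               for i in range(n) for j in range(i + 1, n))
```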
If the decision in step S14 is that the distance Dd is less than the threshold Th1, then the same process as in the first embodiment is carried out. That is to say, the entry candidate selection unit 111 outputs display information to the authentication controller 112 in order to allow visual confirmation (step S15). Upon receipt of the display information from the entry candidate selection unit 111, the authentication controller 112 sends display control information to the display unit 108 to display such an image as shown in
That is, as shown in
The person in charge (manager) visually confirms the displayed contents, then selects a candidate for entry from among the detected persons and inputs entry candidate select information using the touch buttons 14. The entry candidate select information thus input is sent to the entry candidate selection unit 111 via the authentication controller 112. According to the entry candidate select information, the entry candidate selection unit 111 selects face region information corresponding to the candidate for entry from among the items of face region information sent from the face region detector 103 and sends it to the face feature extraction unit 105 (step S17).
If, on the other hand, the decision in step S14 is that the distance Dd is not less than the threshold Th1, then the entry candidate selection unit 111 selects the person whose distance from the gate device 3 is minimum (the person nearest to the gate) as a candidate for entry (step S18). The selection unit then selects face region information corresponding to the candidate for entry from among the two or more items of face region information from the face region detector 103 and outputs it to the face feature extraction unit 105 (step S19).
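Putting steps S13 through S19 together gives the following sketch; the helper depth_distance() is the one given above, and the per-person distances from the gate device 3 are assumed to be available from the camera 102 image.

```python
def select_entry_candidate_v2(face_regions, person_regions, depth_positions,
                              gate_distances, confirm_with_operator, th1):
    """Sketch of the second-embodiment selection (steps S13-S19)."""
    if len(face_regions) == 1:
        return face_regions[0]                              # step S13
    if depth_distance(depth_positions) < th1:               # step S14
        chosen = confirm_with_operator(person_regions)      # steps S15-S16
    else:
        # Persons are well separated in depth: take the one nearest the gate.
        chosen = min(range(len(gate_distances)),
                     key=gate_distances.__getitem__)        # step S18
    return face_regions[chosen]                             # steps S17 / S19
```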
According to the second embodiment, as described above, if two or more persons are present at the time of entry of dictionary information, switching between the processes is made according to the difference in distance between the persons, allowing only an appropriate person to be entered and, moreover, the time required for entry to be reduced.
A third embodiment of the present invention will be described next.
The configuration of an entrance/exit management system to which a face image read apparatus according to the third embodiment is applied remains basically unchanged from that of the first embodiment (
The entry candidate selection unit 111 carries out a process of selecting a person who becomes a candidate for entry at the time of entry of dictionary information. The concrete flow of the process will be described with reference to a flowchart illustrated in
The entry candidate selection unit 111 initiates the entry candidate selection process upon receipt of entry instruction information from the authentication controller 112 (step S21). The selection unit 111 then detects the number of face regions detected by the face region detector 103 (step S22). According to the result of detection of the number of face regions, the selection unit 111 switches between a first process based on an image captured by the camera 101 and a second process based on images captured by the cameras 101 and 102. When the number of face regions is one, the selection unit selects the first process. When the number of face regions is more than one, the second process is selected.
Specifically, when the number of face regions is one, the entry candidate selection unit 111 outputs the detected face region information (the image captured by the camera 101) as a candidate for entry to the face feature extraction unit 105 (step S23). In more detail, face region information contained in a predetermined number of successive frames of image information captured by the camera 101 over a predetermined time is output as a candidate for entry. That is to say, face region information of a person continuously captured over a predetermined time before a certain time is output as a candidate for entry.
If, on the other hand, the number of face regions detected is two or more, then the entry candidate selection unit 111 selects a candidate for entry in accordance with the selection method of the first or second embodiment (step S24). In more detail, the person-to-person distance is detected on the basis of the image captured by the camera 102. When the person-to-person distance detected is not less than a preset threshold Th2 (face regions cannot be detected correctly when persons are too close to each other), face region information contained in a number of successive frames of image information captured by the camera 101 over a predetermined time is output. That is to say, of the face region information contained in a number of successive frames, the face region information which satisfies the condition that the person-to-person distance is not less than a predetermined value is output.
Here, an example of output of face region information which satisfies the condition that the person-to-person distance is not less than the threshold will be explained. A process of tracking face region information backward in time is repeated until the distance Dm between face regions decreases below the preset threshold Th2 (steps S25 and S26). The distance Dm between face regions is defined as shown in
where Dmij=|mi−mj|, n is the number of persons detected, and mi is the center of gravity of region i. The distance Dm between face regions represents the distance between the centers of gravity of face regions detected. In
The tracking process is terminated when the distance Dm between face regions has decreased below Th2 and then face region information tracked up to this point is output to the face feature extraction unit 105 (step S27).
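The backward tracking of steps S25 through S27 can be sketched as follows; the per-frame lists of face-region centers of gravity are assumed to be available, and the Euclidean distance between centroids is used for Dm.

```python
def track_back_face_regions(frame_history, th2):
    """Sketch of steps S25-S27: return the most recent run of frames in which
    the minimum centroid-to-centroid distance Dm stays at or above Th2.

    frame_history: per-frame lists of (x, y) face-region centers of gravity,
    ordered from oldest to newest.
    """
    def dm(centroids):
        n = len(centroids)
        if n < 2:
            return float("inf")
        return min(((centroids[i][0] - centroids[j][0]) ** 2 +
                    (centroids[i][1] - centroids[j][1]) ** 2) ** 0.5
                   for i in range(n) for j in range(i + 1, n))

    selected = []
    for centroids in reversed(frame_history):    # track backward in time
        if dm(centroids) < th2:                  # step S26: stop when Dm < Th2
            break
        selected.append(centroids)
    selected.reverse()                           # restore chronological order
    return selected                              # step S27: frames usable for entry
```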
According to the third embodiment, as described above, when two or more persons are present at the time of entry of dictionary information, an image used for entry is selected according to the distance between their face regions, which allows only an image that can identify the person himself or herself with certainty to be used for entry. That is, only appropriate persons can be entered into the dictionary unit.
Thus, the third embodiment allows images captured in the past to be used for entry. Thereby, even if it becomes clear only when a person passes through the gate, or after he or she has passed through the gate (when shooting terminates), that the person is an unauthorized one, entry of an image of that unauthorized person remains possible.
Next, a fourth embodiment of the present invention will be described.
The configuration of an entrance/exit management system to which a face image read apparatus according to the fourth embodiment is applied remains basically unchanged from that of the first embodiment (
The display unit 108 displays various items of information on the basis of display control information from the authentication controller 112 as described previously. When two or more candidates for entry are present, the display unit 108 displays to the person in charge information that allows visual confirmation. Specifically, as shown in
According to the fourth embodiment, as described above, a face detecting image at the time of entry and an image captured with a larger field of view are displayed synchronously with each other and side by side, thus allowing visual confirmation to be carried out with ease.
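A minimal sketch of such a synchronized side-by-side display is given below; the OpenCV window handling is illustrative only, and both frames are assumed to have the same number of channels.

```python
import cv2
import numpy as np

def show_side_by_side(face_frame, wide_frame, window="entry confirmation"):
    """Show the face-detecting image (camera 101) and the wide-view image
    (camera 102) side by side, one synchronized frame pair per call."""
    h = min(face_frame.shape[0], wide_frame.shape[0])

    def fit(img):
        return cv2.resize(img, (int(img.shape[1] * h / img.shape[0]), h))

    cv2.imshow(window, np.hstack([fit(face_frame), fit(wide_frame)]))
    cv2.waitKey(1)  # refresh the window; call once per synchronized pair
```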
Although the embodiments have been described in terms of an example of entering new dictionary information (face feature information) into a face collation dictionary, the principles of the invention are equally applicable to replacement of dictionary information already entered in the face collation dictionary with new dictionary information.
Additional advantages and modifications will readily occur to those skilled in the art. Therefore, the invention in its broader aspects is not limited to the specific details and representative embodiments shown and described herein. Accordingly, various modifications may be made without departing from the spirit or scope of the general inventive concept as defined by the appended claims and their equivalents.
Foreign application priority data: No. 2006-100715, filed March 2006, Japan (national).