The present invention relates to a facial recognition technology, and in particular to a method for collecting facial recognition data and a system for collecting facial recognition data.
Existing facial recognition systems are generally based on images collected when a user registers recognition data for facial feature extraction and facial recognition training. Because only a limited number of images, taken in a limited range of face orientations, can be input when registering the recognition data, such facial recognition systems actually perform recognition operations on a limited database. Acquiring facial recognition data in a variety of face orientations requires many requirements and constraints to be considered and dealt with when setting up cameras, which complicates setting up the system.
Therefore, the present invention provides a method and a system for collecting facial recognition data, in which facial recognition data are added progressively during a facial recognition process, thereby effectively ameliorating problems caused by an insufficient database.
In an aspect of the present invention, a method for collecting facial recognition data from a first image stream comprises: locating a first face area of a person to be recognized from an Nth image frame of the first image stream; extracting a first facial feature from the first face area, wherein the first facial feature is defined with S factors; acquiring a second facial feature extracted from a second face area shown in an (N−1)th image frame of the first image stream at a position corresponding to the first face area, wherein the (N−1)th image frame is generated prior to the Nth image frame; executing a first determining step to determine whether the first face area is relevant to the second face area or not, and assigning to the first face area a tracing code, whose number varies with a result of the first determining step; executing a second determining step to determine whether to store the first facial feature according to a similarity level of the first facial feature to at least one existent facial feature data; storing the first facial feature and inputting the first facial feature into a first neural network to generate an adjusted first facial feature if the similarity level of the first facial feature to at least one existent facial feature data is determined to be higher than or equal to a preset level, wherein the adjusted first facial feature is defined with T factors, where T is not smaller than S; acquiring at least one adjusted existent facial feature data, each defined with the T factors and previously generated by inputting the at least one existent facial feature data defined with the S factors into the first neural network; executing a third determining step to determine whether the person to be recognized is a registered person or not according to a similarity level of the adjusted first facial feature to each of the at least one adjusted existent facial feature data; and using the stored first facial feature as a material for training the first neural network if the person to be recognized is determined to be a registered person.
In another aspect of the present invention, a system for collecting facial recognition data comprises: a first image pickup device installed at a first position for picking up images from a first shooting region to generate a first image stream; a facial feature database containing therein at least one existent facial feature data, each of which corresponds to a registered person; and a processing device electrically connected to the first image pickup device and the facial feature database. The processing device executes the method for collecting facial recognition data as described above.
In a further aspect of the present invention, a method for collecting facial recognition data from a first image stream and a second image stream, wherein the first image stream and the second image stream respectively include a Kth image frame and a Jth image frame simultaneously picked up from respective shooting regions with an overlapping zone, comprises: locating a first face area of a person to be recognized from the Kth image frame of the first image stream; extracting a first facial feature from the first face area; determining whether a second face area exists in the Jth image frame of the second image stream at the same position where the first face area is disposed in the Kth image frame of the first image stream; acquiring a second facial feature previously extracted from the second face area and stored in a facial feature database; and correlating the second facial feature to the person to be recognized.
In accordance with yet another aspect of the present invention, a system for collecting facial recognition data comprises: a first image pickup device installed at a first position for picking up images from a first shooting region to generate a first image stream; a second image pickup device installed at a second position for picking up images from a second shooting region to generate a second image stream, wherein the second shooting region has an overlapping zone with the first shooting region; a facial feature database containing therein at least one existent facial feature data, each of which corresponds to a registered person; and a processing device electrically connected to the first image pickup device, the second image pickup device and the facial feature database. The processing device executes the method for collecting facial recognition data as described above.
The above contents of the present invention will become more readily apparent to those ordinarily skilled in the art after reviewing the following detailed description and accompanying drawings, in which:
The present invention will now be described more specifically with reference to the following embodiments. It is to be noted that the following descriptions of preferred embodiments of this invention are presented herein for the purpose of illustration and description only; they are not intended to be exhaustive or to be limited to the precise form disclosed.
As shown in
Hereinafter, processing of the image stream S1 is taken as an example to illustrate the details of the technology, and other image streams, e.g. image stream S2 or image stream S3, can be processed in similar ways. In order to make it easy for those skilled in the art to understand the technology provided herein, the following descriptions will be made with reference to
As shown in
Hereinafter, processing of the Nth image frame of the image stream S1 generated by the camera 120 is taken as an example to illustrate the process with reference to
Subsequently, the processing device 14 extracts a set of first facial features from the Nth image frame (Step S304). The set of first facial features includes the facial features respectively found in all the first face areas while processing the Nth image frame. Following Step S304, the processing device 14 retrieves a set of second facial features from the facial feature database 16 (Step S306). The second facial features were previously extracted from all the face areas shown in the (N−1)th image frame, which are defined as second face areas, and stored in the facial feature database 16. The processing device 14 then determines whether a first face area and its corresponding second face area are correlated to each other (Step S308), for example, in view of spatial correlation. The determination of spatial correlation can be executed by way of many currently available algorithms. For example, the spatial correlation can be determined according to an overlapping ratio between the first face area and the second face area or a distance from a center of gravity of the first face area to a center of gravity of the second face area, or by executing the Kalman Filter algorithm. A variety of other prior-art techniques for determining spatial correlation between two areas are known to those skilled in the art and are not redundantly described herein.
Subsequently, the processing device 14 processes the first face area differentially based on the correlation determined in Step S308. If the correlation shows that the first face area is relevant to the second face area, the first face area is assigned a first tracing-code number (Step S310). Otherwise, the first face area is assigned a second tracing-code number (Step S312). In practice, each face area corresponds to an exclusive tracing-code number, and the tracing-code numbers corresponding to different face areas in the same image frame are set to be different from one another.
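The spatial-correlation test of Step S308 can be sketched as follows. This is a minimal illustration only, assuming each face area is an axis-aligned box `(x1, y1, x2, y2)`, that the overlapping ratio is computed as intersection over union, and that the center of gravity is the box center; the function names and the threshold values are hypothetical and are not taken from the disclosure.

```python
from math import hypot

def overlap_ratio(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def center_distance(a, b):
    """Euclidean distance between the centers of two boxes."""
    ax, ay = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bx, by = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    return hypot(ax - bx, ay - by)

def spatially_correlated(first_area, second_area,
                         min_overlap=0.3, max_distance=None):
    """Return True if the two face areas are considered spatially related.

    min_overlap and max_distance are illustrative thresholds (assumptions).
    """
    if overlap_ratio(first_area, second_area) >= min_overlap:
        return True
    if max_distance is not None:
        return center_distance(first_area, second_area) <= max_distance
    return False
```

Either criterion alone may suffice in practice; the sketch simply tries the overlap test first and falls back to the distance test when a distance threshold is supplied.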
The above Steps S308, S310 and S312 will be described in more detail with reference to
Step S410 includes a sub-step of determining whether the similarity of the first facial feature of the first face area to the second facial feature of the corresponding second face area is higher than or equal to a preset first level (Step S420). If so, the first face area and the second face area are likely to belong to the same person, and thus the tracing code of the first face area remains the first tracing-code number (Step S422). In other words, the same number as that previously assigned to the corresponding second face area can be used as the tracing code of the first face area. On the other hand, if it is determined in Step S420 that the similarity of the first facial feature of the first face area to the second facial feature of the corresponding second face area is lower than the preset first level, the first face area and the second face area are taken to belong to different persons, and the brand-new number is used as the tracing code of the first face area (Step S424).
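The tracing-code logic of Steps S308 to S312 and S420 to S424 can be sketched as below. This is only an illustrative sketch: cosine similarity is assumed as the similarity measure, and `assign_tracing_code`, `new_codes` and the value of `first_level` are hypothetical names and values rather than parts of the disclosed system.

```python
from itertools import count
from math import sqrt

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def assign_tracing_code(first_feature, previous, spatially_related,
                        new_codes, first_level=0.8):
    """Pick a tracing code for a face area found in the Nth frame.

    previous: (tracing_code, second_feature) of the corresponding face
        area in the (N-1)th frame, or None if there is none.
    spatially_related: result of the spatial-correlation test.
    new_codes: iterator yielding brand-new, unused code numbers.
    first_level: illustrative value for the preset first level.
    """
    if previous is None or not spatially_related:
        return next(new_codes)          # unrelated: brand-new number
    prev_code, second_feature = previous
    if cosine_similarity(first_feature, second_feature) >= first_level:
        return prev_code                # same person: keep the number
    return next(new_codes)              # different person: brand-new number
```

A caller would typically create one generator, for example `codes = count(100)`, and pass it for every face area so that numbers are never reused.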
It is to be noted that in the flowchart of
Referring back to
Conversely, if the similarity of the first facial feature to a specified one of the facial features stored in the database 16 is higher than or equal to the preset second level, the person owning the first facial feature might be the registered person owning the specified facial feature. Therefore, in Step S318, the processing device 14 records the first facial feature and correlates the first facial feature to the tracing code determined in Step S310 or S312. In addition to storing the first facial feature, the processing device 14 further inputs the first facial feature to a specifically designed first neural network, by which the first facial feature is transformed into an adjusted first facial feature (Step S320). Meanwhile, other adjusted facial feature data previously generated by the first neural network are collected (Step S322). In this embodiment, the first neural network transforms an input involving a first number S of factors into an output involving a second number T of factors, wherein T is not smaller than S. In other words, if a facial feature, e.g. the first facial feature, contains S factors and is inputted into the first neural network, the first neural network would transform the facial feature into an adjusted one containing T factors.
Since the factor number T of the output is not smaller than the factor number S of the input, the facial features belonging to two different persons would become more distinctive after being adjusted by the first neural network. In more detail, due to the transformation conducted by the first neural network, the spatial distance between the two facial features in a feature coordinate space is enlarged. In other words, the first neural network functions to gather the facial features belonging to the same person in a specified region in the feature coordinate space, and pull away the facial features belonging to different persons. As a result, the probability that a facial feature is highly correlated to two or more facial features can be reduced, and the recognition accuracy can be improved.
For example, the first neural network may be established according to Adaptor Topology and Re-constructor Topology and grows with machine learning. By way of Adaptor Topology, the data containing S factors is transformed into data containing T factors by projecting each point in the S-dimensional coordinate space onto a corresponding point in the T-dimensional coordinate space. In this way, the points associated with the same person can be gathered while the points associated with different persons can be pulled away. By way of the Re-constructor Topology, the operation is reversed. That is, the data containing T factors is transformed into data containing S factors by projecting each point in the T-dimensional coordinate space onto a corresponding point in the S-dimensional coordinate space. Meanwhile, the points in the T-dimensional coordinate space and the points in the S-dimensional coordinate space are kept in a one-to-one relationship, so that distinctive factors of the data are not lost due to the transformation. The details of the transformation described above can be realized by those skilled in the art, so they are not redundantly described herein.
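The one-to-one relationship between the S-dimensional and T-dimensional points can be illustrated with a toy example. The sketch below is not the trained first neural network: it assumes a fixed linear map with orthonormal columns standing in for the Adaptor Topology and its transpose standing in for the Re-constructor Topology, with S = 2 and T = 3. The matrix `ADAPTOR` and the function names are hypothetical. It shows only that no information is lost in the round trip; the gathering and pulling-away behavior would come from training, which is not modelled here.

```python
from math import sqrt

R = 1 / sqrt(2)

# Illustrative T x S matrix (T = 3, S = 2) whose columns are orthonormal,
# so that its transpose undoes the projection exactly.
ADAPTOR = [
    [R, 0.0],
    [R, 0.0],
    [0.0, 1.0],
]

def adapt(feature):
    """Project an S-factor feature onto a T-factor adjusted feature."""
    return [sum(w * x for w, x in zip(row, feature)) for row in ADAPTOR]

def reconstruct(adjusted):
    """Map a T-factor adjusted feature back to its S-factor feature."""
    s = len(ADAPTOR[0])
    return [
        sum(ADAPTOR[t][j] * adjusted[t] for t in range(len(ADAPTOR)))
        for j in range(s)
    ]
```

For instance, `reconstruct(adapt([2.0, 3.0]))` recovers the original two factors up to floating-point rounding.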
After the adjusted first facial feature is obtained and the adjusted facial feature data previously generated by the first neural network are collected in Steps S320 and S322, the processing device 14 determines whether the person owning the first facial feature is a registered one in the database 16 or not according to the similarity of the adjusted first facial feature to the collected adjusted facial feature data (Step S324). If the similarity of the adjusted first facial feature to each of the collected adjusted facial feature data is lower than a preset third level, the processing device 14 determines that the person owning the adjusted first facial feature is not a registered one in the database 16, and the flow proceeds to Step S316 to end processing of the first face area. On the other hand, if the similarity of the adjusted first facial feature to a specified one of the collected adjusted facial feature data is higher than or equal to the preset third level, the processing device 14 determines that the person owning the adjusted first facial feature is the one who has registered the specified adjusted facial feature (Step S326). Preferably, the first facial feature stored in Step S318 may be correlated to the person who has registered the specified adjusted facial feature so as to be included in the database 16. Then the first facial feature, or the image of the first face area in connection with the first facial feature, can be used as material for training the first neural network.
For improving accuracy of face recognition, the preset third level is preferably higher than the preset second level.
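The two-stage decision of Steps S318 to S326 can be sketched as follows. This is a hedged illustration only: cosine similarity is assumed as the similarity measure, `recognize`, `database` and `adjust` are hypothetical names, and the threshold values are placeholders chosen so that the third level is higher than the second level. In practice the adjusted existent features would be generated beforehand rather than recomputed on each call.

```python
from math import sqrt

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def recognize(first_feature, database, adjust,
              second_level=0.6, third_level=0.8):
    """Return (stored_feature, person_id); either element may be None.

    database: dict mapping a registered person's id to an existent
        facial feature (S factors).
    adjust: callable standing in for the first neural network (S -> T).
    """
    # Coarse screening against the existent features (second level).
    if not any(cosine_similarity(first_feature, f) >= second_level
               for f in database.values()):
        return None, None               # not stored, not recognized
    stored = list(first_feature)        # Step S318: record the feature
    adjusted = adjust(first_feature)    # Step S320: adjusted feature
    best_id, best_sim = None, -1.0
    for person_id, feature in database.items():
        sim = cosine_similarity(adjusted, adjust(feature))  # Step S322
        if sim > best_sim:
            best_id, best_sim = person_id, sim
    if best_sim >= third_level:         # Step S324: stricter check
        return stored, best_id          # Step S326: registered person
    return stored, None
```

Only features returned together with a person id would be used as training material for the first neural network.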
In view of the foregoing, it is understood that according to the present invention, correlations between successive image frames are used to screen out face areas that might belong to any previously registered person or persons. Afterwards, a series of recognition steps are executed to reconfirm the accuracy of the judgement. Once a person to be recognized is confirmed to be a registered one, the newly collected images or features in connection with the specified face area will be added to the database to serve as material for subsequent training of the neural network. It is expected that efficiency and accuracy of subsequent recognition can be improved by way of such a machine learning process.
In the above embodiment, an image stream is independently processed. Alternatively, two or more image streams may be referred to one another so that more training material can be obtained. For example, two or more of the image streams S1, S2 and S3 generated by different camera devices 120, 122 and 124 as shown in
After specifying a registered person from the database 16 by processing one of the image streams as described above, the processing device 14 processes the other image streams to confirm the recognition result. For example, as shown in
For example, when a Jth image frame of the image stream S1 and a Kth image frame of the image stream S2 are generated at the same time point, the processing device 14 will not process the Kth image frame of the image stream S2 until it finishes processing all the face areas simultaneously appearing in the Jth image frame of the image stream S1. After finishing processing all the face areas simultaneously appearing in the Jth image frame of the image stream S1, the processing device 14 initially relates a face area appearing at a specific position of the Jth image frame to a specified person. If the specified person is a registered one in the database, the processing device 14 collects correlation data of the specific position to the specified person (Step S500).
Subsequently, the processing device 14 processes the image stream S2. In the Kth image frame generated at the same time as the Jth image frame, a position identically corresponding to the specific position in the Jth image frame is first located by referring to environmental parameters as mentioned above (Step S502). Whether a face area exists at the above-mentioned corresponding position or not is determined (Step S504). If such a face area does exist in the Kth image frame, the face area and facial features in connection with the corresponding position are correlated to the specified person (Step S506). Otherwise, the flow proceeds back to Step S500 to process another face area appearing at another specific position in the Kth image frame.
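Steps S500 to S506 can be sketched as below. This is a simplified illustration under stated assumptions: each face area in the Kth frame is a dict with a `box` `(x1, y1, x2, y2)` and a `feature`, and `map_position` is a hypothetical callable standing in for the position mapping derived from the environmental parameters, whose actual form is not specified here.

```python
def find_face_at(position, face_areas):
    """Return the first face area whose box contains position, else None."""
    x, y = position
    for area in face_areas:
        x1, y1, x2, y2 = area["box"]
        if x1 <= x <= x2 and y1 <= y <= y2:
            return area
    return None

def correlate_across_streams(identified, frame_k_faces, map_position):
    """Correlate face areas in the Kth frame to persons found in the Jth.

    identified: list of (position, person_id) pairs from the Jth frame,
        one per registered person recognized there (Step S500).
    frame_k_faces: face areas located in the Kth frame.
    map_position: maps a Jth-frame position to the Kth frame (Step S502).
    Returns a list of (person_id, feature) pairs (Step S506).
    """
    correlated = []
    for position, person_id in identified:
        area = find_face_at(map_position(position), frame_k_faces)
        if area is not None:            # Step S504: face area exists
            correlated.append((person_id, area["feature"]))
    return correlated
```

The returned pairs are features from a second viewing angle that can be attributed to a registered person without a separate recognition pass on the second stream.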
By way of the method illustrated in
It is understood from the above descriptions that the method and system for collecting facial recognition data according to the present invention locate one or more face areas belonging to the same person from successive image frames and collect the face area data for analysis to identify that person. Then the images or facial features in connection with the one or more face areas of that person are collected as training material for facial recognition. Therefore, the method and system for collecting facial recognition data according to the present invention can gradually add facial recognition data into the database during a normal facial recognition process, effectively ameliorating problems caused by an insufficient database.
While the invention has been described in terms of what is presently considered to be the most practical and preferred embodiments, it is to be understood that the invention need not be limited to the disclosed embodiments. On the contrary, it is intended to cover various modifications and similar arrangements included within the spirit and scope of the appended claims, which are to be accorded the broadest interpretation so as to encompass all such modifications and similar structures.
Number | Date | Country | Kind |
---|---|---|---|
109121029 | Jun 2020 | TW | national |