The present disclosure generally relates to systems and methods for accurately performing facial alignment for facial feature detection.
Accurate detection of landmark facial features is important for such applications as the virtual application of makeup effects to facial features, including the eyes, lips, and cheeks. Although model-based facial alignment algorithms exist that rely on databases of pre-defined facial models, one perceived shortcoming of such algorithms is the finite number of models. Therefore, there is a need for an improved method for tracking facial features.
In accordance with one embodiment, a computing device obtains a digital image depicting a facial region of an individual and performs a facial alignment algorithm on the digital image to generate a facial alignment result identifying landmark facial features in the facial region. The computing device performs a facial recognition algorithm on the facial region to determine whether the facial region matches a facial feature definition previously stored in a data store. The computing device generates descriptor data comprising an image patch within a region of interest and identifies a closest matching facial feature definition using the descriptor data. The computing device modifies a landmark facial feature based on the identified closest matching facial feature definition.
Another embodiment is a system that comprises a memory storing instructions and a processor coupled to the memory. The processor is configured by the instructions to obtain a digital image depicting a facial region of an individual and perform a facial alignment algorithm on the digital image to generate a facial alignment result identifying landmark facial features in the facial region. The processor is further configured to perform a facial recognition algorithm on the facial region to determine whether the facial region matches a facial feature definition previously stored in a data store. The processor is further configured to generate descriptor data comprising an image patch within a region of interest and identify a closest matching facial feature definition using the descriptor data. The processor is further configured to modify a landmark facial feature based on the identified closest matching facial feature definition.
Another embodiment is a non-transitory computer-readable storage medium storing instructions to be executed by a computing device having a processor, wherein the instructions, when executed by the processor, cause the computing device to obtain a digital image depicting a facial region of an individual and perform a facial alignment algorithm on the digital image to generate a facial alignment result identifying landmark facial features in the facial region. The instructions further cause the computing device to perform a facial recognition algorithm on the facial region to determine whether the facial region matches a facial feature definition previously stored in a data store. The instructions further cause the computing device to generate descriptor data comprising an image patch within a region of interest, identify a closest matching facial feature definition using the descriptor data, and modify a landmark facial feature based on the identified closest matching facial feature definition.
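The sequence recited in the embodiments above (alignment, recognition, descriptor generation, closest-match lookup, landmark modification) can be sketched as a short pipeline. The sketch below is illustrative only: the five helper callables and the dictionary used as the data store are hypothetical stand-ins supplied by the caller, not components named by this disclosure.

```python
def detect_and_refine(image, data_store,
                      align, recognize, make_descriptor, find_closest, modify):
    """Skeleton of the claimed method; the five callables stand in for the
    alignment, recognition, descriptor, matching, and modification steps."""
    landmarks = align(image)                 # facial alignment result
    face_id = recognize(image)               # does this face match a stored definition?
    if face_id in data_store:
        descriptor = make_descriptor(image, landmarks)   # image patch within a ROI
        definition = find_closest(descriptor, data_store)
        landmarks = modify(landmarks, definition)        # refine landmark location(s)
    return landmarks
```

A caller supplies concrete implementations of each step; when the recognized face has no stored definition, the initial alignment result is returned unmodified.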
Other systems, methods, features, and advantages of the present disclosure will be or become apparent to one with skill in the art upon examination of the following drawings and detailed description. It is intended that all such additional systems, methods, features, and advantages be included within this description, be within the scope of the present disclosure, and be protected by the accompanying claims.
Various aspects of the disclosure can be better understood with reference to the following drawings. The components in the drawings are not necessarily to scale, with emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views.
Various embodiments are disclosed for accurately detecting facial features by applying facial alignment and facial recognition techniques that utilize historical descriptor data. A description of a system for performing facial feature detection is now described followed by a discussion of the operation of the components within the system.
A facial feature locator 104 executes on a processor of the computing device 102 and includes a feature estimator 106 and a refinement module 108. The feature estimator 106 is configured to obtain a digital image depicting a facial region of an individual. As one of ordinary skill will appreciate, the digital image may be encoded in any of a number of formats including, but not limited to, JPEG (Joint Photographic Experts Group) files, TIFF (Tagged Image File Format) files, PNG (Portable Network Graphics) files, GIF (Graphics Interchange Format) files, BMP (bitmap) files, or any number of other digital formats.
Alternatively, the digital image may be derived from a still image of a video encoded in formats including, but not limited to, Motion Picture Experts Group (MPEG)-1, MPEG-2, MPEG-4, H.264, Third Generation Partnership Project (3GPP), 3GPP-2, Standard-Definition Video (SD-Video), High-Definition Video (HD-Video), Digital Versatile Disc (DVD) multimedia, Video Compact Disc (VCD) multimedia, High-Definition Digital Versatile Disc (HD-DVD) multimedia, Digital Television Video/High-definition Digital Television (DTV/HDTV) multimedia, Audio Video Interleave (AVI), Digital Video (DV), QuickTime (QT), Windows Media Video (WMV), Advanced System Format (ASF), Real Media (RM), Flash Media (FLV), MPEG Audio Layer III (MP3), MPEG Audio Layer II (MP2), Waveform Audio Format (WAV), Windows Media Audio (WMA), 360-degree video, 3D scan model, or any number of other digital formats. The feature estimator 106 is further configured to perform facial alignment on the digital image and generate a result file comprising locations of landmark facial features in the facial region.
The refinement module 108 is configured to compare the facial feature definition generated from the result file to facial feature definitions 118 stored in a data store 116, where each of the facial feature definitions 118 comprises locations of landmark facial features of a corresponding facial region and refinement data for one or more of those locations. In the context of the present disclosure, such refinement data reflects adjustments previously made to the initial estimated locations of landmark facial features in another digital image depicting the same facial region. The computing device 102 utilizes this historical refinement data to automatically adjust the locations of landmark facial features in a current digital image depicting the same facial region.
The refinement module 108 then performs various functions depending on whether the facial feature definition generated from the result file matches one of the facial feature definitions 118 in the data store 116. For example, if the facial feature definition generated from the result file matches one of the facial feature definitions 118, the refinement module 108 retrieves the matching facial feature definition 118 from the data store 116 and applies the refinement data contained in the matching facial feature definition 118 to a corresponding location of a landmark facial feature in the current digital image to generate a refined result file. The refined result file therefore contains a refined location for one or more landmark facial features.
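As a purely illustrative sketch of this step, suppose the refinement data stored with the matching facial feature definition 118 takes the form of a per-landmark (dx, dy) offset; this data layout is an assumption for illustration, not one the disclosure prescribes. Applying those offsets to the initial estimates yields the locations in the refined result file.

```python
import numpy as np

def apply_refinement(landmarks, refinement):
    """landmarks: (N, 2) array of initially estimated (x, y) locations.
    refinement: dict mapping landmark index -> (dx, dy) stored adjustment."""
    refined = np.asarray(landmarks, dtype=float).copy()
    for idx, (dx, dy) in refinement.items():
        refined[idx, 0] += dx   # shift only the landmarks that have stored data
        refined[idx, 1] += dy
    return refined
```

Landmarks without stored refinement data pass through unchanged, matching the behavior described above where only some locations are adjusted.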
If no further refinement is needed for the locations of any of the landmark facial features in the digital image, the refinement module 108 outputs the refined result file. If the facial feature definition generated from the result file does not match any of the facial feature definitions 118, the refinement module 108 determines that the facial region depicted in the current digital image is a new facial region. If necessary, the user adjusts the locations of landmark facial features in the current digital image, and the facial feature definition generated from the result file is stored as a new facial feature definition 118 in the data store 116 for future use (as described in connection with block 630 below).
The processing device 202 may include any custom-made or commercially available processor, a central processing unit (CPU) or an auxiliary processor among several processors associated with the computing device 102, a semiconductor-based microprocessor (in the form of a microchip), a macroprocessor, one or more application-specific integrated circuits (ASICs), a plurality of suitably configured digital logic gates, or other well-known electrical configurations comprising discrete elements, both individually and in various combinations, that coordinate the overall operation of the computing system.
The memory 214 may include any one of a combination of volatile memory elements (e.g., random-access memory (RAM), such as DRAM and SRAM) and nonvolatile memory elements (e.g., ROM, hard drive, tape, CDROM). The memory 214 typically comprises a native operating system 216, one or more native applications, emulation systems, or emulated applications for any of a variety of operating systems and/or emulated hardware platforms, emulated operating systems, etc. For example, the applications may include application-specific software comprising some or all of the components of the computing device 102 described above.
Input/output interfaces 204 provide interfaces for the input and output of data. For example, where the computing device 102 comprises a personal computer, the input/output interfaces 204 may interface with user input devices, which may comprise a keyboard or a mouse.
In the context of this disclosure, a non-transitory computer-readable medium stores programs for use by or in connection with an instruction execution system, apparatus, or device. More specific examples of a computer-readable medium may include by way of example and without limitation: a portable computer diskette, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM, EEPROM, or Flash memory), and a portable compact disc read-only memory (CDROM) (optical).
Reference is made to flowchart 300, which illustrates a method for performing facial feature detection in accordance with various embodiments. Although flowchart 300 shows a specific order of execution, it is understood that the order of execution may differ from that which is depicted.
At block 310, the computing device 102 obtains a digital image depicting a facial region of an individual. At block 320, the computing device 102 performs facial alignment on the digital image and generates a result file comprising initial estimated locations of landmark facial features in the facial region. At block 330, the computing device 102 compares the facial feature definition generated from the result file to the facial feature definitions 118 in the data store 116 to determine whether the facial region depicted in the current digital image matches a facial region previously processed by the computing device 102.
At decision block 340, the computing device 102 determines whether the facial feature definition generated from the result file matches one of the facial feature definitions 118 in the data store 116. If a match is found, then at block 350, the computing device 102 accesses the matching facial feature definition 118 in the data store 116 and performs automatic refinement of the location(s) of facial features in the current digital image. Specifically, responsive to the facial feature definition generated from the result file matching one of the facial feature definitions 118, the computing device 102 accesses the matching facial feature definition 118 and applies the corresponding refinement data to a corresponding location of a landmark facial feature in the digital image to generate a refined result file with a refined location for one or more landmark facial features.
For some embodiments, the refinement data comprises descriptor data, wherein the descriptor data may comprise scale-invariant feature transform (SIFT) data, histogram of oriented gradients (HOG) data, or Haar-like feature data. For some embodiments, the computing device 102 compares the facial feature definition generated from the result file to the facial feature definitions 118 in the data store 116 by comparing descriptor data of the facial feature definition generated from the result file with descriptor data of each of the facial feature definitions 118 in the data store 116. If no further refinement is needed for the locations of any of the landmark facial features in the digital image (decision block 360), the computing device 102 outputs the refined result file.
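A toy illustration of descriptor-based comparison follows. The single whole-patch orientation histogram below is a deliberately simplified stand-in for real SIFT or HOG descriptors (which use keypoints, cells, and block normalization); it is meant only to show the match-by-smallest-distance idea.

```python
import numpy as np

def hog_descriptor(patch, bins=9):
    """Toy HOG-style descriptor: one orientation histogram over the whole
    patch, weighted by gradient magnitude (real HOG uses cells and blocks)."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)            # unsigned orientation
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    n = np.linalg.norm(hist)
    return hist / n if n > 0 else hist

def closest_definition(query, definitions):
    """Return the key of the stored definition whose descriptor is nearest
    to the query descriptor in Euclidean distance."""
    return min(definitions, key=lambda k: np.linalg.norm(query - definitions[k]))
```

In a real system, the stored descriptors would be SIFT, HOG, or Haar-like features computed around each landmark rather than a single histogram per patch.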
On the other hand, if a match is found but further refinement is needed for the locations of one or more landmark facial features in the digital image, the computing device 102 obtains further refinement of those locations, adjusts them, and stores descriptors corresponding to the landmark facial features with further-refined locations in the data store 116. In particular, in block 370, the descriptors associated with the refined locations are stored in the matching facial feature definition 118 identified earlier by the computing device 102. The computing device 102 may obtain further refinement of one or more locations by tracking manual adjustments performed by a user to the locations of landmark facial features.
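Tracking manual adjustments can be sketched as a difference between the initial and user-adjusted locations; the dictionary-of-offsets representation below is an assumption for illustration, chosen so the stored data can later be replayed against a new image of the same face.

```python
def record_adjustments(initial, adjusted):
    """Derive refinement data from a user's manual edits: for each landmark
    the user moved, store the (dx, dy) offset from the initial estimate."""
    return {i: (ax - ix, ay - iy)
            for i, ((ix, iy), (ax, ay)) in enumerate(zip(initial, adjusted))
            if (ix, iy) != (ax, ay)}
```

Only moved landmarks produce entries, so the stored refinement data stays sparse when the user accepts most of the automatic estimates.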
Referring back to decision block 340, if no match is found, then at decision block 360, the computing device 102 determines whether further refinement of any of the facial feature locations is needed. If further refinement is needed, then at block 370, the computing device 102 performs further refinement of the location(s) of the facial features and stores the corresponding descriptors for the refined location(s) in the data store 116. If no further refinement is needed, then at block 380, the computing device 102 outputs the result file, which contains the locations of landmark facial features in the current digital image. If no match was found earlier at decision block 340, the facial feature definition generated from the result file is stored as a new facial feature definition 118 in the data store 116; if a match was found, the result file is stored as part of the matching facial feature definition 118. Thereafter, the process ends.
Having described the basic framework of a system for performing facial feature detection, an example of the operation of the system is now described.
If a match is found between the facial feature definition generated from the result file and a facial feature definition 118 in the data store 116, the computing device 102 retrieves the matching facial feature definition 118 and accesses any refinement data corresponding to the facial feature definition 118. Such refinement data reflects previous adjustments made to one or more locations of landmark facial features 406. The computing device 102 then applies such refinement data to the locations of the landmark facial features 406 in the current digital image 402.
If no match is found, the computing device 102 determines that a new facial region 404 is depicted and stores the facial feature definition generated from the result file as a new facial feature definition 118 in the data store 116.
Reference is made to flowchart 600. It is understood that flowchart 600 provides merely an example of the functional arrangements that may be employed. Although flowchart 600 shows a specific order of execution, it is understood that the order of execution may differ from that which is depicted.
In block 610, the computing device 102 obtains a digital image. In block 620, the computing device 102 performs facial alignment on the digital image and generates a facial alignment result file, which defines the locations of landmark facial features represented by points 808. In block 630, the user adjusts the locations of one or more landmark facial features as needed, and the facial feature definition generated from the result file is stored as a new facial feature definition 118 in the data store 116 for future use.
Reference is made to flowchart 700. Although flowchart 700 shows a specific order of execution, it is understood that the order of execution may differ from that which is depicted.
In block 710, the computing device 102 obtains another digital image. In block 720, the computing device 102 performs facial alignment on the digital image and generates a facial alignment result file, which defines the locations of landmark facial features represented by points 808. In block 730, the computing device 102 performs facial recognition to determine whether the face depicted in the digital image 802 already exists in the data store 116.
For some embodiments, the facial feature definition previously stored in the data store 116 was generated based on user adjustments made to a result file comprising locations of landmark facial features in a facial region corresponding to the facial feature definition, where the locations of the user adjustments were stored to generate the image patch as descriptor data in the facial feature definition. For some embodiments, the descriptor data further comprises scale-invariant feature transform (SIFT) data, histogram of oriented gradients (HOG) data, or Haar-like feature data. For some embodiments, image patches around each location of the user adjustments are stored with the facial feature definition, where each image patch comprises a region of a predetermined size.
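Storing an image patch of a predetermined size around each user-adjusted location might look like the following sketch, where the 16-pixel patch size is an arbitrary assumption (the disclosure leaves the size open) and the crop is clamped so patches near the image border keep the full size.

```python
import numpy as np

PATCH = 16  # assumed predetermined patch size in pixels

def extract_patch(image, center, size=PATCH):
    """Crop a size x size patch centered on (x, y), clamped to image bounds."""
    x, y = int(center[0]), int(center[1])
    h, w = image.shape[:2]
    half = size // 2
    y0 = min(max(0, y - half), max(0, h - size))
    x0 = min(max(0, x - half), max(0, w - size))
    return image[y0:y0 + size, x0:x0 + size]

def store_adjustment_patches(image, adjusted_locations):
    """One patch per user-adjusted landmark, kept as descriptor data."""
    return {idx: extract_patch(image, loc) for idx, loc in adjusted_locations.items()}
```

Each stored patch can later be compared against patches from a new image of the same face to locate the corresponding landmark.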
At decision block 740, if the face depicted in the digital image 802 already exists in the data store 116, the computing device 102 identifies a region of interest 812 and, in block 750, generates descriptor data comprising an image patch within the region of interest 812.
For some embodiments, the region of interest 812 is defined based on the locations of the identified landmark facial features, where the image patch comprises a region of a predetermined size around suggested landmark facial features within the region of interest. For some embodiments, the closest matching facial feature definition is identified using the descriptor data in response to the facial region matching a facial feature definition 118 previously stored in the data store 116.
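One plausible way to define a region of interest from the identified landmark locations is a padded bounding box; the margin value below is an assumption for illustration, not a parameter named by the disclosure.

```python
import numpy as np

def region_of_interest(landmarks, margin=8):
    """Bounding box (x0, y0, x1, y1) around the landmarks, padded by a margin."""
    pts = np.asarray(landmarks)
    x0, y0 = pts.min(axis=0) - margin
    x1, y1 = pts.max(axis=0) + margin
    return int(x0), int(y0), int(x1), int(y1)
```

Image patches of the predetermined size would then be cropped around suggested landmark locations falling inside this box.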
In block 760, the computing device 102 finds the closest matching facial feature definition 118 using the descriptor data and modifies one or more landmark facial features based on the identified closest matching facial feature definition 118. Thereafter, the process ends.
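Finding the closest matching definition from stored patches could be sketched as a lowest-total-difference search; the sum-of-squared-differences score below is one assumed similarity measure among many, and the per-landmark dictionary layout is likewise an assumption for illustration.

```python
import numpy as np

def patch_distance(a, b):
    """Sum of squared differences between two equally sized patches."""
    return float(np.sum((a.astype(float) - b.astype(float)) ** 2))

def closest_matching_definition(query_patches, definitions):
    """definitions: {name: {landmark_index: patch}}. Score each stored
    definition by total SSD over the landmark indices the two sets share;
    the lowest total wins."""
    def score(name):
        stored = definitions[name]
        common = query_patches.keys() & stored.keys()
        return sum(patch_distance(query_patches[i], stored[i]) for i in common)
    return min(definitions, key=score)
```

A production system would typically normalize for illumination and use the SIFT, HOG, or Haar-like descriptors mentioned above rather than raw pixel differences.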
It should be emphasized that the above-described embodiments of the present disclosure are merely possible examples of implementations set forth for a clear understanding of the principles of the disclosure. Many variations and modifications may be made to the above-described embodiment(s) without departing substantially from the spirit and principles of the disclosure. All such modifications and variations are intended to be included herein within the scope of this disclosure and protected by the following claims.
This application claims priority to, and the benefit of, U.S. Provisional Patent Application entitled, “Systems and Methods for Facial Alignment,” having Ser. No. 62/670,118, filed on May 11, 2018, which is incorporated by reference in its entirety.