The present disclosure generally relates to systems and methods for performing automatic eye gaze refinement when taking selfie photos.
With the proliferation of smartphones, people commonly capture selfie photos. Although most smartphones are equipped with front-facing cameras that allow users to view themselves on the display when taking a selfie, the user's eyes may be fixed on the user's face shown in the display rather than on the front-facing camera. The user's eyes may also be inadvertently half open or completely closed while taking a selfie. Thus, in many instances, the user is not gazing in the direction of the front-facing camera, resulting in selfies in which the user's gaze is not directed at the front-facing camera.
In accordance with one embodiment, a computing device having a front-facing camera applies facial landmark detection and identifies eye regions in a digital image responsive to the front-facing camera capturing the digital image of an individual. For at least one of the eye regions, the computing device is further configured to extract attributes of the eye region, determine an eye gaze score based on the extracted attributes, generate a modified eye region based on the eye gaze score, and output a modified digital image with the modified eye region.
Another embodiment is a system that comprises a front-facing camera, a memory storing instructions, and a processor coupled to the memory. The processor is configured to apply facial landmark detection and identify eye regions in a digital image responsive to the front-facing camera capturing the digital image of an individual. For at least one of the eye regions, the processor is further configured to extract attributes of the eye region, determine an eye gaze score based on the extracted attributes, generate a modified eye region based on the eye gaze score, and output a modified digital image with the modified eye region.
Another embodiment is a non-transitory computer-readable storage medium storing instructions to be implemented by a computing device having a front-facing camera and a processor, wherein the instructions, when executed by the processor, cause the processor to apply facial landmark detection and identify eye regions in a digital image responsive to the front-facing camera capturing the digital image of an individual. For at least one of the eye regions, the instructions further cause the processor to extract attributes of the eye region, determine an eye gaze score based on the extracted attributes, generate a modified eye region based on the eye gaze score, and output a modified digital image with the modified eye region.
Other systems, methods, features, and advantages of the present disclosure will be or become apparent to one with skill in the art upon examination of the following drawings and detailed description. It is intended that all such additional systems, methods, features, and advantages be included within this description, be within the scope of the present disclosure, and be protected by the accompanying claims.
Various aspects of the disclosure can be better understood with reference to the following drawings. The components in the drawings are not necessarily to scale, with emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views.
Although users are typically able to view themselves on the display of a smartphone when taking a selfie using a front-facing camera, the user's eyes may be fixed on the user's face shown in the display rather than on the front-facing camera. The user's eyes may also be inadvertently half open or completely closed while taking a selfie. This may occur, for example, if the user is outside facing the sun when taking a selfie. Thus, in many instances, the user is not gazing in the direction of the front-facing camera, resulting in selfies in which the user's gaze is not directed at the front-facing camera. Various embodiments are disclosed for performing automatic eye gaze refinement for selfie images.
A system for performing automatic eye gaze refinement for selfie images is now described, followed by a discussion of the operation of the components within the system.
A selfie application 104 executes on a processor of the computing device 102 and includes a facial feature extractor 106, an eye gaze analyzer 108, an eye region modifier 110, and an image editor 112. The facial feature extractor 106 is configured to determine whether a front-facing camera captures a digital image of an individual (i.e., a selfie). Responsive to the front-facing camera capturing a digital image of an individual, the facial feature extractor 106 applies facial landmark detection and identifies both the left and right eye regions in the digital image of the individual.
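By way of illustration only, the facial landmark detection performed by the facial feature extractor 106 could be realized with an off-the-shelf landmark model. The following sketch assumes dlib's 68-point predictor (a hypothetical choice of detector and model file; the disclosure does not mandate either), in which landmark indices 36-41 and 42-47 outline the two eye regions:

```python
import cv2
import dlib
import numpy as np

# Hypothetical stand-in for the facial feature extractor 106.
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def extract_eye_regions(image_bgr):
    """Return the two sets of eye landmarks as (N, 2) arrays, or None
    when no face is detected in the captured digital image."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = detector(gray)
    if not faces:
        return None  # no individual detected in the selfie
    shape = predictor(gray, faces[0])
    pts = np.array([(p.x, p.y) for p in shape.parts()])
    return pts[36:42], pts[42:48]  # the two eye contours
```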
As one of ordinary skill will appreciate, the digital image may be encoded in any of a number of formats including, but not limited to, JPEG (Joint Photographic Experts Group) files, TIFF (Tagged Image File Format) files, PNG (Portable Network Graphics) files, GIF (Graphics Interchange Format) files, BMP (bitmap) files or any number of other digital formats. Alternatively, the digital image may be derived from a still image of a video encoded in formats including, but not limited to, Motion Picture Experts Group (MPEG)-1, MPEG-2, MPEG-4, H.264, Third Generation Partnership Project (3GPP), 3GPP-2, Standard-Definition Video (SD-Video), High-Definition Video (HD-Video), Digital Versatile Disc (DVD) multimedia, Video Compact Disc (VCD) multimedia, High-Definition Digital Versatile Disc (HD-DVD) multimedia, Digital Television Video/High-definition Digital Television (DTV/HDTV) multimedia, Audio Video Interleave (AVI), Digital Video (DV), QuickTime (QT) file, Windows Media Video (WMV), Advanced System Format (ASF), Real Media (RM), Flash Media (FLV), an MPEG Audio Layer III (MP3), an MPEG Audio Layer II (MP2), Waveform Audio Format (WAV), Windows Media Audio (WMA), 360 degree video, 3D scan model, or any number of other digital formats.
The eye gaze analyzer 108 is configured to extract attributes of one of the eye regions (either the left eye region or the right eye region) and determine an eye gaze score based on the extracted attributes. For some embodiments, the eye gaze analyzer 108 randomly selects either the left eye region or the right eye region to analyze first; the other eye region is analyzed afterwards.
If the eye gaze analyzer 108 determines that the first randomly selected eye region does not need to be modified based on the eye gaze score, the eye gaze analyzer 108 then analyzes the other eye region to determine whether any modifications are needed. Note that in some instances only one eye region may need to undergo modification, whereas in other instances both eye regions may need to be modified. If both eyes are gazing in the direction of the front-facing camera, then no modification is performed.
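A minimal sketch of this control flow, assuming a per-region score function and a decision threshold that the disclosure does not specify (both the threshold value and the direction of the comparison are assumptions made purely for illustration):

```python
import random

# Assumed threshold; the disclosure does not specify how the eye gaze
# score maps to a modify/keep decision.
OFF_CAMERA_THRESHOLD = 0.5

def select_regions_to_modify(left_region, right_region, score_fn):
    """Analyze one randomly selected eye region first and the other
    afterwards; zero, one, or both regions may be flagged."""
    regions = {"left": left_region, "right": right_region}
    order = list(regions)
    random.shuffle(order)  # random choice of which eye to analyze first
    return [name for name in order
            if score_fn(regions[name]) < OFF_CAMERA_THRESHOLD]
```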
For some embodiments, the eye gaze analyzer 108 extracts attributes of the eye region by identifying a lower eyelid and an upper eyelid for each eye in the selfie image. The eye gaze analyzer 108 then generates a first coefficient based on a curvature of the lower eyelid. The eye gaze analyzer 108 also generates a second coefficient based on a curvature of the upper eyelid.
The eye gaze analyzer 108 then generates an eye gaze score based on the first and second coefficients. For some embodiments, the eye gaze analyzer 108 determines the eye gaze score based on the first and second coefficients by retrieving weight values 118 from a data store 116 and generating weighted first and second coefficients. The eye gaze analyzer 108 then determines the eye gaze score based on the weighted first and second coefficients, as described in more detail below.
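The exact curvature measure and scoring function are not fixed by the description above. One plausible sketch fits a quadratic to each eyelid's landmark points and combines the two curvature coefficients linearly using the retrieved weight values (the quadratic fit and the linear combination are both assumptions):

```python
import numpy as np

def eyelid_curvature(points):
    """Fit y = ax^2 + bx + c to eyelid landmark points and use |a| as
    a simple curvature coefficient (an assumed proxy)."""
    x = points[:, 0].astype(float)
    y = points[:, 1].astype(float)
    a, _b, _c = np.polyfit(x, y, 2)
    return abs(a)

def eye_gaze_score(lower_lid_pts, upper_lid_pts, w_lower, w_upper):
    """Combine the first (lower-eyelid) and second (upper-eyelid)
    curvature coefficients using the retrieved weight values 118."""
    c1 = eyelid_curvature(lower_lid_pts)   # first coefficient
    c2 = eyelid_curvature(upper_lid_pts)   # second coefficient
    return w_lower * c1 + w_upper * c2     # weighted combination
```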
The eye region modifier 110 is configured to generate a modified eye region based on the eye gaze score. Once the first eye region is modified, the eye region modifier 110 is further configured to automatically determine whether to modify the other eye region. For example, if the eye region modifier 110 first modifies the left eye region, the steps described above may be repeated for the other eye region (the right eye region) such that both the left and right eye regions are modified as needed. The image editor 112 is configured to output a modified digital image depicting each of the modified eye regions, where the modified digital image depicts both of the individual's eyes gazing in the direction of the front-facing camera.
The processing device 202 may include any custom-made or commercially available processor, a central processing unit (CPU) or an auxiliary processor among several processors associated with the computing device 102, a semiconductor-based microprocessor (in the form of a microchip), a macroprocessor, one or more application-specific integrated circuits (ASICs), a plurality of suitably configured digital logic gates, and other well-known electrical configurations comprising discrete elements, both individually and in various combinations, to coordinate the overall operation of the computing system.
The memory 214 may include any one of a combination of volatile memory elements (e.g., random-access memory (RAM), such as DRAM, SRAM, etc.) and nonvolatile memory elements (e.g., ROM, hard drive, tape, CDROM, etc.). The memory 214 typically comprises a native operating system 216, one or more native applications, emulation systems, or emulated applications for any of a variety of operating systems and/or emulated hardware platforms, emulated operating systems, etc. For example, the applications may include application-specific software which may comprise some or all of the components of the computing device 102 depicted in
Input/output interfaces 204 provide any number of interfaces for the input and output of data. For example, where the computing device 102 comprises a personal computer, these components may interface with one or more user input/output interfaces 204, which may comprise a keyboard or a mouse, as shown in
In the context of this disclosure, a non-transitory computer-readable medium stores programs for use by or in connection with an instruction execution system, apparatus, or device. More specific examples of a computer-readable medium may include by way of example and without limitation: a portable computer diskette, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM, EEPROM, or Flash memory), and a portable compact disc read-only memory (CDROM) (optical).
Although the flowchart 300 shows a specific order of execution, it is understood that the order of execution may differ from that which is depicted.
At block 310, the computing device 102 determines whether a front-facing camera captures a digital image of an individual. Responsive to the front-facing camera capturing a digital image of an individual, the computing device 102 performs facial landmark detection and identifies eye regions in the digital image of the individual. The operations described in connection with blocks 320 to 350 below are performed for at least one of the eye regions.
At block 320, the computing device 102 extracts attributes of the eye region. At block 330, the computing device 102 determines an eye gaze score based on the extracted attributes. For some embodiments, the computing device 102 determines the eye gaze score based on the extracted attributes by identifying a lower eyelid and an upper eyelid. The computing device 102 then generates a first coefficient based on a curvature of the lower eyelid. The computing device 102 also generates a second coefficient based on a curvature of the upper eyelid. The computing device 102 then determines the eye gaze score based on the first and second coefficients.
For some embodiments, the computing device 102 determines the eye gaze score based on the first and second coefficients by retrieving weight values 118 from the data store 116 and generating weighted first and second coefficients. The computing device 102 then determines the eye gaze score based on the weighted first and second coefficients.
At block 340, the computing device 102 generates a modified eye region based on the eye gaze score. For some embodiments, the computing device 102 generates the modified eye region based on the eye gaze score by determining a direction of an eye gaze with respect to the front-facing camera based on the eye gaze score and warping a pupil in the eye region to modify the direction of the eye gaze while maintaining an original curvature of the pupil. At block 350, the computing device 102 outputs a modified digital image with the modified eye region. Thereafter, the process ends.
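As a rough sketch of the warping in block 340, a rigid shift of the eye patch moves the pupil without altering its shape, which preserves the original curvature. The dense-remap formulation below is illustrative only; the warp actually derived from the eye gaze score may be more elaborate:

```python
import cv2
import numpy as np

def shift_pupil(eye_patch, dx, dy):
    """Translate the pixels of an eye patch by (dx, dy) pixels using a
    dense remap; a rigid shift preserves the pupil's original curvature."""
    h, w = eye_patch.shape[:2]
    map_x, map_y = np.meshgrid(np.arange(w, dtype=np.float32),
                               np.arange(h, dtype=np.float32))
    # Each output pixel samples the input at (x - dx, y - dy), which
    # moves the pupil by (dx, dy) in the output.
    return cv2.remap(eye_patch, map_x - dx, map_y - dy,
                     interpolation=cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_REPLICATE)
```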
If the eye gaze algorithm determines that the relationship αx > y applies to the current digital image, the eye gaze algorithm assigns a value of 1.
If the eye gaze algorithm determines that the relationship βx < y applies to the digital image, the eye gaze algorithm assigns a value of 2.
If the eye gaze algorithm determines that the relationships τx > y and α > τ apply to the digital image, the eye gaze algorithm assigns a value of 4.
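A hypothetical reading of this value assignment is sketched below. Only the three relationships stated above are encoded; the precedence among the tests, any case yielding a value of 3, and the fallback behavior are not specified in this excerpt and are assumptions:

```python
def classify_gaze(x, y, alpha, beta, tau):
    """Assign the values stated above; ordering and fallback are assumed."""
    if tau * x > y and alpha > tau:
        return 4
    if alpha * x > y:
        return 1
    if beta * x < y:
        return 2
    return 0  # assumed fallback for cases the excerpt does not cover
```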
It should be emphasized that the above-described embodiments of the present disclosure are merely possible examples of implementations set forth for a clear understanding of the principles of the disclosure. Many variations and modifications may be made to the above-described embodiment(s) without departing substantially from the spirit and principles of the disclosure. All such modifications and variations are intended to be included herein within the scope of this disclosure and protected by the following claims.
This application claims priority to, and the benefit of, U.S. Provisional Patent Application entitled, “Automatic Method for Eye Gaze Refinement,” having Ser. No. 62/849,999, filed on May 20, 2019, which is incorporated by reference in its entirety.