With the proliferation of gaming, personal websites, instant messaging, and virtual reality, more and more users wish to enter websites and virtual worlds as artistically modified versions of themselves. However, conventional vector-based cartoon generators for making a caricature, gaming figure, or avatar of oneself can produce poorly executed or amateurish-looking results, or may lose the facial qualities that make the user recognizable as a unique individual. Conventional techniques often apply too much exaggerated caricature. What is needed is a system that maintains or improves the attractive integrity and recognizable qualities of a human face while converting an image of the user's face to a cartoon style.
A face cartooning system is described. In one implementation, the system generates an attractive cartoon face or graphic of a user's facial image. The system extracts facial features separately and applies pixel-based techniques customized to each facial feature. The style of cartoon face achieved resembles the likeness of the user more than cartoons generated by conventional vector-based cartooning techniques. The cartoon faces thus achieved provide an attractive facial appearance and thus have wide applicability in art, gaming, and messaging applications in which a pleasing degree of realism is desirable without exaggerated comedy or caricature.
This summary is provided to introduce exemplary cartoon face generation, which is further described below in the Detailed Description. This summary is not intended to identify essential features of the claimed subject matter, nor is it intended for use in determining the scope of the claimed subject matter.
Overview
Described herein are systems and methods for cartoon face generation. In one implementation, an exemplary system generates a cartoon face from an original image, such as a photo that portrays a user's face. The style of cartoon face resembles the likeness of the person portrayed in the original photo more than cartoons generated by conventional vector-based cartooning techniques. The cartoon faces thus achieved render an attractive facial appearance and thus have wide applicability in art, gaming, and messaging applications in which a cartoon, avatar, or action figure is desired that captures the user's appearance with a pleasing degree of realism but without exaggerated comedy or caricature. For example, a user can insert a cartoon or graphic of the user's own face into a game or an instant messaging forum. The exemplary system achieves pleasing cartoon faces by applying pixel-based methods separately to some parts of the cartooning process.
Exemplary System
Exemplary Engine
The illustrated example cartooning engine 104 includes a face processor 202, a decomposition engine 204, a pixel-based cartoonizer 206, a compositor 208, and an accessories engine 210.
In one implementation, the face processor 202 further includes a face detector 212, a head cropper 214, a contrast enhancer 216, a color normalizer 218, and a face alignment engine 220, which in turn further includes a feature landmarks assignor 222. Feature landmarks are also known as “feature points” in the description below.
The decomposition engine 204 includes a face extractor 224 and a features extractor 226. How the face and features are extracted will be described in greater detail below.
The terms “cartoonizer” and “cartooner” are used herein to mean cartoon-generating engines or cartoon-assisting processes. The pixel-based cartoonizer 206 further includes a skin cartooner 228, a shadow cartooner 230, and a base-head cartooner 232, associated with the skin cartooner 228, that includes a “forehead & ears” geometry engine 234. Further, the cartoonizer 206 includes a features cartooner 236 including a brows processor 238, an eyes processor 240, a lips processor 242, and an inner-mouth processor 244.
The compositor 208, for re-composing the cartoonized facial parts back into a basic cartoon face 108, includes a “head & shadow” combination engine 246 and a “head & features” combination engine 248.
The accessories engine 210 includes a user interface 250 for selecting and rearranging the templates 252, i.e., templates for selecting and adding the accessories introduced above to the basic face 108.
Operation of the Exemplary Engine
Inspired by the skill and technique applied by artists when drawing cartoons, the exemplary cartooning engine 104 separately processes different parts of the face in the original image 106 using operations well-suited to each part, then composes these parts into a basic cartoon face 108 with matte-compositing techniques. As mentioned above, the accessories engine 210 then adds accessories associated with a face, such as neck, hair, eyeglasses, hat, etc., via templates 252 that can be synthesized by a computing device or pre-drawn by artists.
As shown in
1) Face Detection and Image Pre-Processing
In the stage of face detection and image pre-processing 302, there are many face detection techniques and alternatives that can be used to detect and locate a face in the original image 106. For example, the face detector 212 may use conventional face detection techniques, or alternatively may use a simple user interaction, such as dragging a rectangle with a computer mouse to frame or designate the subject face in the original image 106.
During pre-processing, the head cropper 214 delimits the portrayed head 304 (including associated hair, etc.) from the background 306, so that the delimited visual head region can become the object of following processing steps.
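The patent describes the cropping step only in prose. As a minimal, hypothetical sketch (assuming the image is a nested list of pixel values and the head region is designated by a rectangle of the kind a user might drag, given as (x, y, width, height)), the delimiting crop could be expressed as:

```python
def crop_head(image, rect):
    """Crop the designated head rectangle (x, y, width, height) from the image.

    `image` is modeled as a list of rows; each row is a list of pixel values.
    """
    x, y, w, h = rect
    return [row[x:x + w] for row in image[y:y + h]]
```

The returned sub-image then becomes the object of the subsequent contrast, color, and alignment steps.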
Since the original image 106 may be a digital photo captured in various lighting conditions, the contrast enhancer 216 may use an auto-leveling technique to enhance the contrast within the visual head region. The color normalizer 218 can then normalize the color histogram if the color is outside of tolerances.
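An auto-leveling pass of the kind the contrast enhancer 216 might apply can be sketched as a simple intensity stretch. This is an illustrative simplification (function name hypothetical), not the patent's actual algorithm:

```python
def auto_level(pixels):
    """Stretch the intensity range so the darkest pixel maps to 0
    and the brightest to 255, enhancing contrast in the head region."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # Flat region: nothing to stretch.
        return list(pixels)
    return [round(255 * (p - lo) / (hi - lo)) for p in pixels]
```

A production auto-level would typically clip a small percentile at each end before stretching, and the color normalizer 218 would apply an analogous adjustment per color channel.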
2) Interactive Face Alignment
The face alignment engine 220 executes face alignment 308 to locate feature landmarks along the contour of different portrayed facial parts: eyes, brows, nose, lips, cheeks, etc. In one implementation, the contours of the landmarked features are approximated with feature points—i.e., dots—as shown in
Face alignment 308 is an important underpinning for the cartooning engine 104, since the original image 106 is separated into different facial parts according to the face alignment results 310. That is, the more accurate the face alignment 308, the more accurately the generated cartoon 110 will imitate the original image 106. In one implementation, the face alignment engine 220 employs or comprises a Bayesian Tangent Shape Model (BTSM), e.g., using a constrained BTSM technique. The BTSM-based face alignment engine 220 is robust and accurate enough to obtain facial alignment results automatically. In one implementation, an ordinary BTSM method is used first to obtain an initial alignment result. Then the user can modify the positions of some feature points by dragging them to the expected positions. These constraints are added to the BTSM search strategy to obtain an optimal solution.
Once the face alignment 308 is complete, the face can be separated into different parts using the aligned feature points.
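Separating the face into parts from the aligned feature points amounts to building a per-feature mask from each feature's contour. A hedged sketch using a ray-casting point-in-polygon test (names hypothetical; a real implementation would likely produce soft-edged masks):

```python
def point_in_polygon(x, y, contour):
    """Ray-casting test: is (x, y) inside the closed contour of feature points?"""
    inside = False
    n = len(contour)
    for i in range(n):
        x1, y1 = contour[i]
        x2, y2 = contour[(i + 1) % n]
        # Count edge crossings of a horizontal ray extending right from (x, y).
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def region_mask(width, height, contour):
    """Binary mask selecting pixels inside a feature's aligned contour."""
    return [[1 if point_in_polygon(x + 0.5, y + 0.5, contour) else 0
             for x in range(width)] for y in range(height)]
```

Each extracted part is then the original pixels under its mask, which is also what the compositor later needs for matte compositing.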
3) Face-Tooning
Face-tooning 312 is a key phase in personalizing the cartoon face generation.
First, assisted by the face alignment result shown in
Second, the features cartooner 236 adopts different techniques for each extracted facial part, or adopts the same technique with different parameters.
Facial Skin and Base-Head Shape
The skin cartooner 228 aims to produce a base-head 502 of the cartoon face, as shown in
The second phase executed by the skin cartooner 228 in creating a base-head 502 is producing a suitable forehead and ear shape. In one implementation, the forehead and ear shape may be determined from an aligned cheek shape. Thus, in one implementation, the base-head cartooner 232 has a forehead & ears geometry engine 234 that learns an affine transformation from the cheek shape of a reference face to the aligned cheek shape of the aligned original image 106. The forehead & ears geometry engine 234 then applies the same transformation to the forehead and ear shape of the reference face to produce a corresponding forehead and ear shape for the face in the original image 106. Thus, the base-head 502 can be produced along with or after processing by the skin cartooner 228.
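The learned affine transformation can be illustrated with an exact three-point fit via Cramer's rule; a real implementation would likely least-squares-fit all cheek landmarks rather than only three (names are hypothetical):

```python
def affine_from_points(src, dst):
    """Solve for the affine map (a, b, c, d, e, f) taking three src points
    to three dst points:  x' = a*x + b*y + c,  y' = d*x + e*y + f."""
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)

    def solve(v1, v2, v3):
        # Cramer's rule on the system [[xi, yi, 1]] @ (a, b, c) = (vi).
        a = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        b = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        c = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return a, b, c

    a, b, c = solve(*[p[0] for p in dst])
    d, e, f = solve(*[p[1] for p in dst])
    return a, b, c, d, e, f

def apply_affine(params, points):
    """Apply the learned transform, e.g. to the reference forehead/ear shape."""
    a, b, c, d, e, f = params
    return [(a * x + b * y + c, d * x + e * y + f) for x, y in points]
```

In the patent's terms, `src` would be the reference face's cheek landmarks, `dst` the aligned cheek landmarks from the original image 106, and `apply_affine` would map the reference forehead and ear shape onto the user's face.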
Shadow Region
The shadow on a face conveys 3-dimensional information about the face and is an important factor influencing the likeness between the generated cartoon face 108 and the original image 106. After the shadow cartooner 230 determines the shadow region, the process is straightforward. The shadow cartooner 230 clusters pixels in the shadow region into groups according to their lightness and replaces the color of each pixel with the mean color of the group that the pixel belongs to. The shadow cartooner 230 may also shift the color of the shadow region into a cartoon style, for example, using the same shifting parameters that were used when shifting the facial skin region, because the pixels of the shadow region also belong to the face skin region.
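The cluster-and-replace step can be sketched as follows; for simplicity this uses uniform lightness bands rather than a true clustering algorithm such as k-means, and all names are hypothetical:

```python
def posterize_by_lightness(pixels, k=4):
    """Group pixels into k lightness bands and repaint each pixel with
    its band's mean color (a simplification of the clustering step)."""
    def lightness(rgb):
        return (max(rgb) + min(rgb)) / 2.0

    lo = min(lightness(p) for p in pixels)
    hi = max(lightness(p) for p in pixels)
    span = (hi - lo) or 1.0

    groups = [[] for _ in range(k)]
    labels = []
    for p in pixels:
        band = min(k - 1, int(k * (lightness(p) - lo) / span))
        groups[band].append(p)
        labels.append(band)

    # Mean color per band; empty bands never appear in labels.
    means = [tuple(round(sum(ch) / len(g)) for ch in zip(*g)) if g else None
             for g in groups]
    return [means[band] for band in labels]
```

The cartoon-style color shift mentioned above would then be applied to these flattened shadow colors with the same parameters as the facial skin region.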
Brows, Eyes, Lips, and Inner-Mouth Regions
The brows processor 238, eyes processor 240, lips processor 242, and inner-mouth processor 244 take a similar approach for their respective facial regions as that executed for the shadow region, but the number of clusters and the shifting parameters may differ between regions. Additionally, the eyes processor 240 may enlarge the eye regions and their masks to some extent to emphasize the eyes in the cartoon face 108, for example, enlarging the eyes 1.1 times in one implementation. The eyes processor 240 may also enhance the contrast of the pixels in the eye regions.
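The 1.1x eye enlargement can be sketched as a nearest-neighbor rescale of the cropped eye region and its mask; this is an illustrative simplification (name hypothetical), and a production system would likely use smoother interpolation:

```python
def enlarge_region(region, scale=1.1):
    """Nearest-neighbor rescale of a cropped region (e.g., an eye) by `scale`.

    `region` is a nested list of pixel values; the same call would be
    applied to the region's mask so the compositor stays consistent.
    """
    h, w = len(region), len(region[0])
    new_h, new_w = round(h * scale), round(w * scale)
    return [[region[min(h - 1, int(y / scale))][min(w - 1, int(x / scale))]
             for x in range(new_w)] for y in range(new_h)]
```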
Recombining Facial Parts and Regions
The compositor 208 combines the processed facial parts, e.g., with a matte-compositing technique, in order to obtain the face-tooning result 504. In one implementation, the formulation of matte-compositing is given by Equation (1):
I = αF + (1 − α)B (1)
where F is the image foreground, B is the image background, α is the alpha-matte mask, and I is the composed image. Since there are several composition steps, I represents the composed result at each step, leading to the final result 504.
The head & shadow combination engine 246 combines the base-head image 502 used as background with the shadow region and its corresponding mask, used as foreground. The head & features combination engine 248 then combines the shadowed base-head 502, used as background, one-by-one with the cartoonized brows, eyes, lips and inner-mouth region and their respective masks, these latter parts used as foreground in the combination. Thus, compositor 208 generates the basic cartoon face result 504.
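Equation (1) and the sequential composition can be sketched as follows (names hypothetical; pixels modeled as RGB tuples and the matte as per-pixel alpha values):

```python
def matte_composite(foreground, background, alpha):
    """Per-pixel matte compositing per Equation (1): I = alpha*F + (1 - alpha)*B.

    `foreground` and `background` are nested lists of RGB tuples;
    `alpha` is a nested list of values in [0, 1] (the matte mask).
    """
    return [[tuple(a * f + (1 - a) * b for f, b in zip(fp, bp))
             for fp, bp, a in zip(frow, brow, arow)]
            for frow, brow, arow in zip(foreground, background, alpha)]
```

Each call's output becomes the background for the next step, mirroring the order above: first the shadow region onto the base-head, then each cartoonized feature onto the shadowed base-head.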
4) Adding Accessories
Referring back to
Exemplary Methods
Although exemplary systems and methods have been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described. Rather, the specific features and acts are disclosed as exemplary forms of implementing the claimed methods, devices, systems, etc.
Number | Name | Date | Kind |
---|---|---|---|
4823285 | Blancato | Apr 1989 | A |
5995119 | Cosatto et al. | Nov 1999 | A |
6028960 | Graf et al. | Feb 2000 | A |
6061462 | Tostevin et al. | May 2000 | A |
6061532 | Bell | May 2000 | A |
6075905 | Herman et al. | Jun 2000 | A |
6128397 | Baluja et al. | Oct 2000 | A |
6226015 | Danneels et al. | May 2001 | B1 |
6463205 | Aschbrenner et al. | Oct 2002 | B1 |
6556196 | Blanz et al. | Apr 2003 | B1 |
6677967 | Sawano et al. | Jan 2004 | B2 |
6690822 | Chen et al. | Feb 2004 | B1 |
6707933 | Mariani et al. | Mar 2004 | B1 |
6792707 | Setteducati | Sep 2004 | B1 |
6894686 | Stamper et al. | May 2005 | B2 |
6937745 | Toyama | Aug 2005 | B2 |
7039216 | Shum et al. | May 2006 | B2 |
7092554 | Chen et al. | Aug 2006 | B2 |
7167179 | Nozawa | Jan 2007 | B2 |
7859551 | Bulman et al. | Dec 2010 | B2 |
7889551 | Yang et al. | Feb 2011 | B2 |
20020012454 | Liu et al. | Jan 2002 | A1 |
20030016846 | Chen et al. | Jan 2003 | A1 |
20030069732 | Stephany et al. | Apr 2003 | A1 |
20040075866 | Thormodsen et al. | Apr 2004 | A1 |
20050013479 | Xiao et al. | Jan 2005 | A1 |
20050100243 | Shum et al. | May 2005 | A1 |
20050129288 | Chen et al. | Jun 2005 | A1 |
20050135660 | Liu et al. | Jun 2005 | A1 |
20050212821 | Xu et al. | Sep 2005 | A1 |
20060062435 | Yonaha | Mar 2006 | A1 |
20060082579 | Yao | Apr 2006 | A1 |
20060092154 | Lee | May 2006 | A1 |
20060115185 | Iida et al. | Jun 2006 | A1 |
20060203096 | LaSalle et al. | Sep 2006 | A1 |
20060204054 | Steinberg et al. | Sep 2006 | A1 |
20070009028 | Lee et al. | Jan 2007 | A1 |
20070031033 | Oh et al. | Feb 2007 | A1 |
20070091178 | Cotter et al. | Apr 2007 | A1 |
20070171228 | Anderson et al. | Jul 2007 | A1 |
20070237421 | Luo et al. | Oct 2007 | A1 |
20080089561 | Zhang | Apr 2008 | A1 |
20080158230 | Sharma et al. | Jul 2008 | A1 |
20080187184 | Yen | Aug 2008 | A1 |
20090252435 | Wen et al. | Oct 2009 | A1 |
Number | Date | Country |
---|---|---|
W09602898 | Feb 1996 | WO |
WO0159709 | Aug 2001 | WO |
Entry |
---|
Chen et al, “PicToon: A Personalized Image-based Cartoon System”, 2002, Proceedings of ACM Multimedia, pp. 171-178. |
Chiang, et al., “Automatic Caricature Generation by Analyzing Facial Features”, available at least as early as Jun. 13, 2007, at <<http://imlab.cs.nccu.edu.tw/paper/dfgaccv2004.pdf>>, Asian Conference on Computer Vision, 2004, 6 pgs. |
Ruttkay, et al., “Animated CharToon Faces”, available at least as early as Jun. 13, 2007, at <<http://kucg.korea.ac.kr/Seminar/2003/src/PA-03-33.pdf>>, pp. 12. |
“Cartoon Maker v4.71”, Liangzhu Software, retrieved Nov. 5, 2007, at <<http://www.liangzhuchina.com/cartoon/index.htm>>, 3 pgs. |
Chen et al., “Face Annotation for Family Photo Album Management”, Intl Journal of Image and Graphics, 2003, vol. 3, No. 1, 14 pgs. |
Cui et al., “EasyAlbum: An Interactive Photo Annotation System Based on Face Clustering and Re-Ranking”, SIGCHI 2007, Apr./May 2007, 10 pgs. |
Gu et al., “3D Alignment of Face in a Single Image”, IEEE Intl Conf on Computer Vision and Pattern Recognition, Jun. 2006, 8 pgs. |
Hays et al., “Scene Completion Using Millions of Photographs”, ACM Transactions on Graphics, SIGGRAPH, Aug. 2007, vol. 26, No. 3, 7 pgs. |
“IntoCartoon Pro 3.0”, retrieved Nov. 5, 2007 at <<http://www.intocartoon.com/>>, Intocartoon.com, Nov. 1, 2007, 2 pgs. |
Jia et al., “Drag-and-Drop Pasting”, SIGGRAPH 2006, Jul. 30-Aug. 30, 2006, 6 pgs. |
Perez et al., “Poisson Image Editing”, ACM Transactions on Graphics, Jul. 2003, vol. 22, Issue 3, Proc ACM SIGGRAPH, 6 pgs. |
“Photo to Cartoon”, retrieved on Nov. 5, 2007 at <<http://www.caricature-software.com/products/photo-to-cartoon.html>>, Caricature Software, Inc., 2007, 1 pg. |
Reinhard et al., “Color Transfer Between Images”, IEEE Computer Graphics and Applications, Sep./Oct. 2001, vol. 21, No. 5, 8 pgs. |
Suh et al., “Semi-Automatic Image Annotation Using Event and Torso Identification”, Tech Report HCIL 2004-15, 2004, Computer Science Dept, Univ of Maryland, 4 pgs. |
Tian et al., “A Face Annotation Framework with Partial Clustering and Interactive Labeling”, IEEE Conf on Computer Vision and Pattern Recognition, Jun. 2007, 8 pgs. |
Viola et al., “Robust Real-Time Face Detection” , Intl Journal of Computer Vision, May 2004, vol. 57, Issue 2, 18 pgs. |
Wang et al, “A Unified Framework for Subspace Face Recognition”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Sep. 2004, vol. 26, Issue 9, 7 pgs. |
Wang et al., “Random Sampling for Subspace Face Recognition”, Intl Journal of Computer Vision, vol. 70, No. 1, Jan. 2006, 14 pgs. |
Xiao et al., “Robust Multipose Face Detection in Images”, IEEE Transactions on Circuits and Systems for Video Technology, Jan. 2004, vol. 14, Issue 1, 11 pgs. |
Zhang et al., “Automated Annotation of Human Faces in Family Albums”, Proc ACM Multimedia, Nov. 2003, 4 pgs. |
Zhang et al., “Efficient Propagation for Face Annotation in Family Albums”, Proc ACM Multimedia, Oct. 2004, 8 pgs. |
Zhao et al., “Automatic Person Annotation of Family Photo Album”, CIVR 2006, Jul. 2006, pp. 163-172. |
Zhao et al., “Face Recognition: A Literature Survey”, ACM Computing Surveys, Dec. 2003, vol. 35, Issue 4, pp. 399-459. |
Zhou et al., “Bayesian Tangent Shape Model: Estimating Shape and Pose Parameters via Bayesian Inference”, Computer Vision and Pattern Recognition, 2003 IEEE, Jun. 2003, 8 pgs. |
Agarwala et al., “Interactive Digital Photomontage”, ACM Transactions on Graphics (TOG)—Proceedings of ACM SIGGRAPH 2004, Mar. 23, 2004, pp. 294-302. |
Non-Final Office Action for U.S. Appl. No. 12/200,361, mailed on Sep. 27, 2011, Fang Wen, “Cartoon Personalization”, 22 pages. |
Office Action for U.S. Appl. No. 12/200,361, mailed on Feb. 7, 2012, Fang Wen, “Cartoon Personalization”, 32 pgs. |
Douglas, Mark, “Combining Images in Photoshop”, Department of Fine Arts at Fontbonne University, Aug. 19, 2005, retrieved in Jan. 2012 from <<http://fineats.fontbonne.edu/tech/dig—img/bitmap/bm—ci.html>>, 3 pages. |
Office action for U.S. Appl. No. 12/200,361, mailed on Oct. 15, 2012, Wen et al., “Cartoon Personalization”, 38 pages. |
Office action for U.S. Appl. No. 12/200,361, mailed on Mar. 7, 2013, Wen et al., “Cartoon Personalization”, 38 pages. |
Number | Date | Country | |
---|---|---|---|
20090087035 A1 | Apr 2009 | US |