The present invention relates to a method for generating intuitive quasi-eigen faces.
The facial expressions of animated characters play a central role in delivering the story. For an animation studio, therefore, the ability to generate expressive and plausible facial expressions is a critical skill. Nevertheless, no standard procedure for generating facial expressions has yet been established; when facial animations are required, a whole gamut of approaches is mobilized, ranging from labor-intensive production work to state-of-the-art technical support. The invention proposes a small but very useful innovation in the area of 3D facial animation, which can be adopted in a wide range of facial animation productions.
Probably, the most popular approach currently used in facial animation productions is the so-called blendshape technique, which synthesizes expressions by taking a linear combination of a set of pre-modeled expressions. The applicants call this expression set the expression basis. Many commercial animation packages such as Maya and Softimage support blendshape-based facial animation. The technique the applicants develop in the present invention is for systems of this type.
A fundamental question in developing a blendshape-based facial animation system is how to control the expressions. One approach is to let the animators manually control the weights assigned to each member of the expression basis set in order to produce the desired expression sequences. Another popular approach, which can be taken provided facial motion capture is available, is to set up the system so that the facial animation is driven by a human performance.
In this approach, if the basis is taken from the human subject, in principle the original performance can be reproduced. Although such reproduction may not be needed in an animation production, it has theoretical significance to developers because it can be utilized as a benchmark of a blendshape technique: if a method can accurately reproduce the original performance, it can produce other facial animations accurately. The present work assumes that the facial animation system is operated by performance-driven control, but also assumes that manual control can be added whenever the results need to be edited.
Another fundamental issue that must be resolved when developing a blendshape technique is how to form the expression basis. The present work is related to this issue. A casual approach practiced by many animation studios is to use an expression basis comprised of manually modeled, intuitively recognizable key expressions. The basis should contain sufficient elements to span the desired range of expressions. The term “basis” is usually reserved for an independent set of elements that spans the entire expression space. In the disclosure of the present application, however, the applicants use the term to loosely mean a set of expressions from which linear combinations are taken. An advantage of using a hand-generated basis is that the combinations of basis elements produce somewhat predictable results. A disadvantage of this approach is that the linear combinations may cover only a portion of the full range of facial expressions. When the system is used to reproduce a human performance, the lack of coverage manifests as reproduction errors.
In the context of blendshape-based reproduction of human performances, another well-established approach to obtain the expression basis is to use principal component analysis (PCA). In this method, a set of mutually-orthogonal principal components that spans the expression space is generated by statistical analysis of performance data. Because this technique gives quantitative information on the coverage of each component, by selecting the dominant components the applicants can form an expression basis whose coverage is predictable and greater than that of manually generated bases, resulting in more accurate reproduction of the original performance. A drawback of this approach is that the expressions corresponding to the principal components are visually non-intuitive. Hence animators cannot predict the expression that will be produced by a particular linear combination.
Here the applicants propose a new approach to basis generation that gives coverages comparable to those of statistically generated bases while at the same time having basis elements with meaningful shapes. This approach is based on our observation that a hand-generated expression can be modified such that the resulting expression remains visually close to the original one but its coverage over the expression space increases. It is also based on the relaxation, adopted by the applicants, that the basis elements need not be strictly orthogonal to each other; they can still span the expression space.
A large number of techniques for synthesizing human expressions have been proposed since the pioneering work of [Parke 1972]. Facial expression can be viewed as resulting from the coordination of (mechanical) components such as the jaw, muscles, and skin.
Various researchers have explored physically based techniques for synthesizing facial expressions [Waters 1987; Terzopoulos and Waters 1990; Terzopoulos and Waters 1993; Lee et al. 1995; Wu et al. 1995; Essa and Pentland 1997; Kahler et al. 2001; Choe et al. 2001; Sifakis et al. 2005]. The present work takes a different approach: expressions are synthesized by taking linear combinations of several key expressions. Thus, instead of looking into the physics of facial components, the applicants utilize facial capture data to obtain realistic results. In this section, the applicants review previous work, with a focus on blendshape techniques and performance-driven facial animation techniques.
The blendshape technique has been widely used for expression synthesis. To generate human expressions in real-time, [Kouadio et al. 1998] used linear combinations of a set of key expressions, where the weight assigned to each expression was determined from live capture data. [Pighin et al. 1998] created a set of photorealistic textured 3D expressions from photographs of a human subject, and used the blendshape technique to create smooth transitions between those expressions. [Blanz and Vetter 1999] introduced a morphable model that could generate a 3D face from a 2D photograph by taking a linear combination of faces in a 3D example database. To increase the covering range of the key expressions, [Choe and Ko 2001] let animators sculpt expressions corresponding to the isolated actuation of individual muscles and then synthesized new expressions by taking linear combinations of them.
A critical determinant of the quality of the expression generated by blendshape-based synthesis is the covering range of the key expressions being used. [Chuang 2002] used a PCA-based procedure to identify a set of key expressions that guarantees a certain coverage. However, the resulting principal components did not correspond to intuitively meaningful human expressions. [Chao et al. 2003] proposed another basis generation technique based on independent component analysis. In the key expression set produced using this approach, the differences among the elements were more recognizable than those generated by [Chuang 2002]; however, the individual elements in the set still did not accurately represent familiar/vivid human expressions. As a result, conventional keyframe control is not easy using this approach. To enable separate modifications of specific parts of the face in a blendshape-based system, [Joshi et al. 2003] proposed automatic segmentation of each key expression into meaningful blend regions.
[Williams 1990] introduced a performance-driven approach to synthesize human expressions. This approach utilizes the human ability to make faces and has been shown to be quite effective for controlling high-DOF facial movements. The uses of this approach for blendshape-based reproduction of facial performances were introduced above. [Noh and Neumann 2001; Pyun et al. 2003; Na and Jung 2004; Wang et al. 2004] proposed techniques to retarget performance data to synthesize the expressions of other characters. Recently, [Vlasic et al. 2005] developed a multilinear model that can transfer expressions/speech of one face to other faces.
Another class of performance-driven facial animation techniques is the speech-driven techniques. [Bregler et al. 1997; Brand 1999; Ezzat et al. 2002; Chao et al. 2004; Chang and Ezzat 2005; Deng et al. 2005] are several representative works exploring this research direction.
Accordingly, a need for a method for generating intuitive quasi-eigen faces has long been present. This invention is directed to solving these problems and satisfying the long-felt need.
The present invention contrives to solve the disadvantages of the prior art.
An object of the invention is to provide a method for generating intuitive quasi-eigen faces to form the expression basis for blendshape-based facial animation systems.
Another object of the invention is to provide a method for generating intuitive quasi-eigen faces, in which the resulting expressions resemble the given expressions.
Still another object of the invention is to provide a method for generating intuitive quasi-eigen faces, which yields significantly reduced reconstruction errors compared to hand-generated bases.
Let v=v(t)=[v1T, . . . , vNT]T represent the dynamic shape of the 3D face model at time t, wherein T in superscript represents a transpose of a vector. It is a triangular mesh consisting of N vertices, where vi represents the 3D position of the i-th vertex. The applicants assume that the geometry v0=[(v10)T, . . . , (vN0)T]T of the neutral face is given. The applicants also assume that motion capture data are given in a 3N×L matrix Ξ=[v(1), . . . , v(L)], where L is the duration of the motion capture in number of frames. The applicants are interested in finding a set of facial expressions, linear combinations of which span Ξ.
Let ÊH={ê1H, . . . , ênH} be the hand-generated expression basis that is given by the animator. Here, n is the number of elements and êiH is the geometry of the i-th element. Let eiH represent the displacement of êiH from the neutral face, i.e., eiH=êiH−v0. In the invention, the applicants also call the set of displacements EH={e1H, . . . , enH} the (hand-generated) expression basis when this does not cause any confusion. When the weights wiH are given, the applicants synthesize the expression v by
v=v0+w1He1H+ . . . +wnHenH.
A potential problem of the hand-generated expression basis EH is that linear combinations of the basis elements may not span Ξ. The goal of the invention is to develop a procedure to convert EH into another basis EQE={e1QE, . . . , enQE}, such that the new basis spans Ξ and each element eiQE visually resembles the corresponding element eiH in EH. The applicants call the elements in the new basis quasi-eigen faces.
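For concreteness, the quantities defined above can be held in a few dense arrays. The following Python/NumPy sketch shows one plausible in-memory layout for the neutral face v0, the motion capture matrix Ξ, and the hand-generated basis; the sizes and all variable names are illustrative assumptions, not part of the disclosure.

```python
import numpy as np

# Illustrative sizes only: N facial vertices, L captured frames,
# n hand-generated basis elements (the experiments use L=35,000 and n=18).
N, L, n = 1500, 35000, 18

v0 = np.zeros(3 * N)               # neutral face geometry, stacked (x, y, z) per vertex
Xi = np.zeros((3 * N, L))          # motion capture matrix: column t holds the expression v(t)
E_hat_hand = np.zeros((n, 3 * N))  # hand-generated key expressions (geometries) ê_i^H
E_hand = E_hat_hand - v0           # displacement basis e_i^H = ê_i^H - v0

def blend(v0, E_hand, w):
    """Blendshape synthesis v = v0 + sum_i w_i^H e_i^H for a weight vector w."""
    return v0 + w @ E_hand
```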
According to the present invention, a method for generating intuitive quasi-eigen faces includes steps of: a) representing a dynamic shape of a three dimensional face model with a vector; b) making a hand-generated expression basis; c) converting the hand-generated expression basis into a new expression basis, wherein the new expression basis is a set of quasi-eigen faces; and d) synthesizing expressions with the quasi-eigen faces.
The new expression basis includes a plurality of quasi-eigen faces, wherein the linear combinations of the quasi-eigen faces cover the motion capture data, and wherein each of the quasi-eigen faces resembles a corresponding element of the hand-generated expression basis.
The dynamic shape of a three dimensional face model is represented by a vector, v=v(t)=[v1T, . . . , vNT]T, where vi represents the 3D position of the i-th vertex and N is the number of vertices.
The vector v is represented by a triangular mesh including N vertices, and the vector forms the facial mesh data.
The expression v is synthesized by
v=v0+w1He1H+ . . . +wnHenH,
where the neutral face, v0=[(v10)T, . . . , (vN0)T]T, and the weights, wiH, are given. The hand-generated expression basis is represented by ÊH={ê1H, . . . , ênH}, where n is the number of elements of the hand-generated expression basis and êiH is the geometry of the i-th element.
The set of displacements is represented by
EH={e1H, . . . , enH},
where eiH represents the displacement of êiH from the neutral face, eiH=êiH−v0.
The step of converting the hand-generated expression basis into a new expression basis includes steps of: a) forming an approximate hyperplane out of the motion capture data or the facial mesh data; and b) identifying the orthogonal axes that span the hyperplane.
The step of identifying the orthogonal axes that span the hyperplane includes a step of using a principal component analysis (PCA).
The motion capture data are given in a 3N×L matrix Ξ=[v(1), . . . , v(L)], where N is the number of vertices of the mesh representing the motion capture data and L is the duration of the motion capture in number of frames.
The hyperplane is formed by the cloud of points obtained by plotting each of the expressions in Ξ in the 3N-dimensional space.
The step of identifying the orthogonal axes that span the hyperplane includes steps of: a) taking the mean of v, μ=[μ1T, . . . , μNT]T, where the summation is taken over the entire motion capture data Ξ; b) obtaining a centered point cloud, {tilde over (D)}=[{tilde over (v)}(1)T, . . . , {tilde over (v)}(L)T]T, where {tilde over (v)}(i)=v(i)−μ; and c) constructing the covariance matrix C using
C=(1/L){tilde over (D)}T{tilde over (D)}.
The C is a symmetric positive-definite matrix with positive eigenvalues λ1, . . . , λ3N in order of magnitude, with λ1 being the largest.
The method may further include a step of obtaining the eigen faces from the m eigenvectors EPCA={e1PCA, . . . , emPCA} corresponding to {λ1, . . . , λm}, the principal axes.
The coverage of the principal axes is given by
(λ1+ . . . +λm)/(λ1+ . . . +λ3N).
The method may further include a step of converting the hand-generated expression basis into the quasi-eigen basis, the set of quasi-eigen faces, with the eigenfaces ready.
The step of converting the hand-generated expression basis into the quasi-eigen basis includes steps of: a) computing
wijPCA-to-QE=ejPCA·(eiH−μ),
where i ranges over all the hand-generated elements, and j ranges over all the principal axes; and b) obtaining the quasi-eigen faces by
eiQE=wi1PCA-to-QEe1PCA+ . . . +wimPCA-to-QEemPCA.
The method may further include a step of synthesizing a general expression by the linear combination
v=v0+w1QEe1QE+ . . . +wnQEenQE.
The weights wiQE take on both positive and negative values when the eigenfaces are used.
The method may further include a step of taking êiH to represent the full actuation of a single expression muscle with other muscles left relaxed for intrinsically ruling out the possibility of two hand-generated elements having almost identical shapes.
The method may further include steps of: a) looking at the matrix WPCA-to-QE=(wijPCA-to-QE); b) determining if ejPCA is missing in the quasi-eigen basis by testing if
is less than a threshold ε; c) augmenting the basis with ejPCA; d) notifying the animator regarding the missing eigenface ejPCA.
The method may further include a step of retargeting the facial expressions by feeding a predetermined expression weight vector to a deformation basis.
The predetermined expression weight vector is obtained by minimizing
Σj=1, . . . , N∥dj*−(w1QEe1jQE+ . . . +wnQEenjQE)∥2,
where dj* and eijQE are the displacements of the j-th vertex of v* and eiQE, respectively, from v0.
The advantages of the present invention are: (1) the method provides intuitive quasi-eigen faces that form the expression basis for blendshape-based facial animation systems; (2) the resulting expressions resemble the given expressions; and (3) the method significantly reduces reconstruction errors compared to hand-generated bases.
Although the present invention is briefly summarized, a fuller understanding of the invention can be obtained from the following drawings, detailed description and appended claims.
These and other features, aspects and advantages of the present invention will become better understood with reference to the accompanying drawings, wherein:
If the facial vertices v1, . . . , vN are allowed to freely move in 3D space, then v will form a 3N-dimensional vector space. Let us call this space the mathematical expression space E. However, normal human expressions involve a narrower range of deformation. If the applicants plot each expression in Ξ as a point in 3N-dimensional space, the point cloud forms an approximate hyperplane. The PCA is designed to identify the orthogonal axes that span the hyperplane.
The analogical situation is shown in
The procedure for obtaining the quasi-eigen faces is based on the principal components. Finding the principal components requires the point cloud to be centered at the origin. Let μ=[μ1T, . . . , μNT]T be the mean of v, where the summation is taken over the entire motion capture data Ξ. Then, the applicants can obtain a centered point cloud, {tilde over (D)}=[{tilde over (v)}(1)T, . . . , {tilde over (v)}(L)T]T, where {tilde over (v)}(i)=v(i)−μ. Now the applicants construct the covariance matrix C using
C=(1/L){tilde over (D)}T{tilde over (D)}.
C is a symmetric positive-definite matrix, and hence has positive eigenvalues. Let λ1, . . . , λ3N be the eigenvalues of C in order of magnitude, with λ1 being the largest. The m eigenvectors EPCA={e1PCA, . . . , emPCA} corresponding to {λ1, . . . , λm} are the principal axes the applicants are looking for, the coverage of which is given by
(λ1+ . . . +λm)/(λ1+ . . . +λ3N).
The facial expressions in EPCA are called the eigenfaces. Since the coverage is usually very close to unity even for small m (e.g., in the case of the motion capture data used in this disclosure, m=18 covers 99.5% of Ξ), the above procedure provides a powerful means of generating an expression basis that covers a given set Ξ of expressions. A problem of this approach is that, even though the eigenfaces have mathematical significance, they do not represent recognizable human expressions.
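The eigenface computation just described can be sketched in a few lines of Python/NumPy. The sketch follows the procedure above (centering by μ, diagonalizing C, and keeping the m dominant axes that reach the desired coverage); using an SVD instead of forming C explicitly, as well as the function and variable names, are implementation assumptions rather than part of the disclosure.

```python
import numpy as np

def compute_eigenfaces(Xi, coverage_target=0.995):
    """PCA sketch.  Xi is the 3N x L matrix whose columns are the captured
    expressions v(1), ..., v(L).  Returns the mean face mu, the m dominant
    eigenfaces (as columns), and their eigenvalues, with m chosen so that
    the coverage reaches coverage_target (0.995 gives m=18 in the text)."""
    L = Xi.shape[1]
    mu = Xi.mean(axis=1, keepdims=True)            # mean expression over all frames
    D = Xi - mu                                    # centered point cloud
    # The eigenvectors of C = (1/L) D D^T are the left singular vectors of D,
    # so an SVD avoids building the 3N x 3N covariance matrix explicitly.
    U, s, _ = np.linalg.svd(D, full_matrices=False)
    eigvals = (s ** 2) / L                         # eigenvalues of C, largest first
    coverage = np.cumsum(eigvals) / eigvals.sum()  # (lambda_1+...+lambda_m) / (sum of all)
    m = int(np.searchsorted(coverage, coverage_target)) + 1
    return mu[:, 0], U[:, :m], eigvals[:m]
```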
With the eigenfaces generated, the applicants can now describe the method to convert the hand-generated expression basis into the quasi-eigen basis (i.e., the set of quasi-eigen faces). This method is based on our observation that the hand-generated elements may lie outside the hyperplane. In the analogical situation drawn in
A simple fix to the above problem would be to project the hand-generated elements onto the hyperplane; the quasi-eigen faces the applicants are looking for in this disclosure are, in fact, the projections of the hand-generated basis elements. To find the projection of a hand-generated element onto each principal axis, the applicants first compute
wijPCA-to-QE=ejPCA·(eiH−μ), (3)
where i ranges over all the hand-generated elements, and j ranges over all the principal axes. Now, the applicants can obtain the quasi-eigen faces by
With the quasi-eigen basis, a general expression is synthesized by the linear combination
v=v0+w1QEe1QE+ . . . +wnQEenQE.
The applicants would note that in most blendshape-based facial animation systems, the weights are always positive and, in some cases, are further constrained to lie within the range [0,1] in order to prevent extrapolation. When the eigenfaces are used, however, the weights wiQE are supposed to take on both positive and negative values. The weights of the quasi-eigen basis should be treated like those of the eigenfaces: even though the quasi-eigen elements are not orthogonal, their ingredients come from an orthogonal basis. Allowing negative weights obviously increases the accuracy of the reproduction of a performance. Although keyframe animators may not be familiar with negative weights, allowing the weights to take on negative values can significantly extend the range of allowed expressions.
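A Python/NumPy sketch of the projection and synthesis steps is given below. The sketch reads Equation 3 as centering each hand-generated expression about the hyperplane before projecting it onto the eigenfaces, and Equation 4 as reassembling the projected components; that reading of the centering, and all function and variable names, are assumptions made for illustration rather than statements of the disclosure.

```python
import numpy as np

def quasi_eigen_faces(E_hand, P, mu, v0):
    """Projection sketch (cf. Equations 3 and 4).
    E_hand : (n, 3N) hand-generated displacements e_i^H from the neutral face.
    P      : (3N, m) eigenfaces (principal axes) as columns.
    mu     : (3N,)   mean expression used to center the PCA point cloud.
    v0     : (3N,)   neutral face geometry.
    Returns the (n, 3N) quasi-eigen displacements e_i^QE."""
    X = (v0 + E_hand) - mu       # center each hand-generated expression about the hyperplane
    W = X @ P                    # w_ij^{PCA-to-QE}: projection weight onto each eigenface
    proj = W @ P.T               # component of each expression inside the hyperplane
    return (mu + proj) - v0      # back to displacements from the neutral face

def synthesize(v0, E_qe, w):
    """Blendshape synthesis v = v0 + sum_i w_i^QE e_i^QE; weights may be negative."""
    return v0 + w @ E_qe
```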
The projection steps of Equations 3 and 4 will modify the hand-generated elements. The applicants need to assess whether the new expressions are visually close to the original ones. If a hand-generated expression lies on the hyperplane (or is contained in the motion capture data), then it will not be modified by the projection process. When a hand-generated expression is out of the hyperplane, however, the projection will introduce a minimal Euclidean modification to it. Although the scale for visual differences is not the same as that of Euclidean distance, small Euclidean distances usually correspond to small visual changes.
Another aspect that must be checked is the coverage of EQE. In the analogical case shown in
Preventive Treatments: The applicants can guide the sculpting work of the animator so as to avoid overlap among the hand-generated expressions. For example, the applicants can take êiH to represent the full actuation of a single expression muscle with other muscles left relaxed, which intrinsically rules out the possibility of two hand-generated elements having almost identical shapes [Choe and Ko 2001]. For this purpose, animators can refer to reference books showing drawings of the expressions corresponding to isolated actuation of individual muscles [Faigin 1990]. The facial action coding system [Ekman and Friesen 1978] can also be of great assistance in constructing non-overlapping hand-generated expression bases.
Post-Treatments: In spite of the above preventive treatments, the quasi-eigen basis may leave out a PCA-axis. Situations of this type can be identified by looking at the matrix WPCA-to-QE=(wijPCA-to-QE). If
is less than a threshold ε, the applicants conclude that ejPCA is missing in the quasi-eigen basis. In such a case, the applicants can simply augment the basis with ejPCA, or, can explicitly notify the animator regarding the missing eigenface ejPCA and let him/her make (minimal) modification to it so that its projection can be added to the keyframing basis as well as the quasi-eigen basis.
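One possible form of this post-treatment is sketched below in Python/NumPy. The disclosure compares a per-axis quantity derived from WPCA-to-QE against a threshold ε; the column-wise sum of absolute weights used here is an assumed stand-in for that quantity, and the function names are illustrative.

```python
import numpy as np

def missing_axes(W, eps=1e-3):
    """W is the n x m matrix W^{PCA-to-QE} of projection weights.
    Returns the indices j of principal axes that the quasi-eigen basis barely
    uses.  The score below (sum of |w_ij| over i for each column j) is an
    assumed criterion; the disclosure only states that a quantity derived
    from the matrix is tested against a threshold eps."""
    score = np.abs(W).sum(axis=0)
    return np.flatnonzero(score < eps)

def augment_with_eigenfaces(E_qe, P, missing):
    """Option of simply augmenting the quasi-eigen basis with the missing
    eigenfaces e_j^PCA (the alternative is to notify the animator)."""
    return np.vstack([E_qe, P[:, missing].T])
```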
To test the proposed method, the applicants obtained a set of facial capture data, and modeled a hand-generated expression basis, based on the actuation of the expression muscles. The applicants followed the procedure described in the previous section and produced the quasi-eigen basis from the hand-generated expression basis.
5.1 Capturing the Facial Model and Performance
The applicants captured the performance of an actress using a Vicon optical system. Eight cameras tracked 66 markers attached to her face, and an additional 7 markers that were attached to her head to track the gross motion, at a rate of 120 frames per second. The total duration of the motion capture was L=35,000 frames. The applicants constructed the 3D facial model using a Cyberware 3D scanner. The applicants established the correspondence between the 3D marker positions and the geometrical model of the face using the technique that was introduced by Pighin et al. [1998].
5.2 Preparing the Training Data
The motion capture data are a sequence of facial geometries. The applicants convert the marker positions obtained in each frame into a facial mesh. For this, the applicants apply an interpolation technique based on radial basis functions. The technique gives the 3D displacements of the vertices that should be applied to the neutral face.
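A sketch of this marker-to-mesh step is shown below. The disclosure only states that an RBF-based interpolation is used; the choice of SciPy's RBFInterpolator, the thin-plate-spline kernel, and the function name are assumptions of this sketch.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def markers_to_mesh(neutral_markers, frame_markers, neutral_vertices):
    """Interpolate per-marker displacements over the whole face mesh with RBFs.
    neutral_markers : (M, 3) marker positions on the neutral face
    frame_markers   : (M, 3) marker positions in the captured frame
    neutral_vertices: (N, 3) vertices v_i^0 of the neutral mesh
    Returns the (N, 3) vertex displacements to add to the neutral face."""
    disp = frame_markers - neutral_markers                 # displacement of each marker
    rbf = RBFInterpolator(neutral_markers, disp,
                          kernel="thin_plate_spline")      # fit one RBF per coordinate
    return rbf(neutral_vertices)                           # evaluate at every mesh vertex
```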
5.3 Preparing the Hand-Generated Expression Basis
The applicants performed PCA on the data obtained in Section 5.2. Covering 99.5% of Ξ corresponded to taking the first m=18 principal components. The applicants asked animators to sculpt a hand-generated expression basis ÊH={ê1H, . . . , ênH} consisting of n=18 elements.
If the elements are clustered in E, then their projections will also be clustered in the hyperplane; this will result in poor coverage, requiring the hand-generation of additional basis elements. To reduce the hand-work of the animators, the applicants guided the sculpting work by considering the size and location of the expression muscles, so that each basis element corresponds to the facial shape when a single expression muscle is fully actuated and all other muscles relaxed.
In our experiment, the applicants made 18 hand-generated expressions. Six elements are for the actuation of muscles in the upper region, 12 are for muscles in the lower region.
5.4 Obtaining the Quasi-Eigen Faces
Starting from the given hand-generated basis, the applicants followed the steps described in Section 4 to obtain the quasi-eigen faces. A selection of the quasi-eigen expressions are shown in
Running the preprocessing steps, which included the PCA on 6,000 frames of training data, took 158 minutes on a PC with an Intel Pentium 4 3.2 GHz CPU and an Nvidia GeForce 6800 GPU. After the training was complete, the applicants could create quasi-eigen faces in real time.
5.5 Analysis
Now, the applicants approximate each frame of Ξ with a linear combination of the quasi-eigen faces. Let
v0+w1QEe1QE+ . . . +wnQEenQE
be the reconstruction of a frame, and let v*=v0+d* be the original expression of Ξ, where d* is the 3N-dimensional displacement vector from the neutral expression. The applicants find the n-dimensional weight vector wQE=(w1QE, . . . , wnQE) by minimizing
Σj=1, . . . , N∥dj*−(w1QEe1jQE+ . . . +wnQEenjQE)∥2  (5)
where dj* and eijQE are the displacements of the j-th vertex of v* and eiQE, respectively, from v0. The applicants solve Equation 5 using quadratic programming, which required about 0.007 seconds per frame. To evaluate the accuracy of the reproduction, the applicants used the following error metric:
For comparison, the above analysis was also performed using the bases EH and EPCA. The α values obtained using the three bases were αQE=0.72%, αH=5.2%, and αPCA=0.62%. The results thus indicate that, in terms of coverage, EQE is slightly inferior to EPCA and far better than EH.
Qualitative comparison of the reconstructions can be made in the accompanying video.
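The per-frame analysis of Section 5.5 can be sketched as follows in Python/NumPy. The disclosure solves Equation 5 with quadratic programming, which also accommodates constrained weights; since the quasi-eigen weights are unconstrained, the sketch below uses ordinary least squares, and the relative error returned by the second function is an assumed stand-in for the error metric α rather than its exact definition.

```python
import numpy as np

def fit_weights(d_star, E_qe):
    """Find the weight vector w minimizing ||d* - sum_i w_i e_i^QE||^2 (cf. Eq. 5).
    d_star : (3N,) displacement of the captured frame from the neutral face
    E_qe   : (n, 3N) quasi-eigen displacements."""
    w, *_ = np.linalg.lstsq(E_qe.T, d_star, rcond=None)
    return w

def reconstruction_error(Xi, v0, E_qe):
    """Relative reconstruction error over all frames of Xi (an assumed metric):
    norm of the residuals divided by the norm of the original displacements."""
    D = Xi - v0[:, None]                     # displacements d* for every frame
    W, *_ = np.linalg.lstsq(E_qe.T, D, rcond=None)
    residual = D - E_qe.T @ W
    return np.linalg.norm(residual) / np.linalg.norm(D)
```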
In the present invention, the applicants have presented a new method for generating expression bases for blendshape-based facial animation systems. Animation studios commonly generate such bases by manually modeling a set of key expressions. However, hand-generated expressions may contain components that are not part of human expressions, and reconstruction/animation by taking linear combinations of these expressions may produce reconstruction errors or unrealistic results. On the other hand, statistically-based techniques can produce high-fidelity expression bases, but the basis elements are not intuitively recognizable. Here the applicants have proposed a method for generating so-called quasi-eigen faces, which have intuitively recognizable shapes but significantly reduced reconstruction errors compared to hand-generated bases.
In the present invention the applicants have focused on the reproduction of captured performances. This approach was taken based on our experience in facial animation that, in most cases, technically critical problems reside in the analysis part rather than in the synthesis part. If the analysis is performed accurately, then expression synthesis, whether it be reproduction or animation of other characters, will be accurate. The experiments performed in the present work showed that the proposed technique produces basis elements that are visually recognizable as typical human expressions and can significantly reduce the reconstruction error. Even though the applicants did not demonstrate it in the disclosure, the proposed technique can be effectively used for synthesizing expressions of characters other than the captured subject.
The proposed technique is an animator-in-the-loop method whose results are sensitive to the hand-generated expressions provided by the animator. If the animator provides inadequate expressions, the projection will not improve the result. The applicants have found that a muscle-based approach to the modeling of the hand-generated expressions, as used in Section 5, effectively extends the coverage of the basis. Application of the proposed projection to hand-generated elements of this type reduces the reconstruction error. The muscle-based approach is not, however, the only way to obtain non-overlapping hand-generated expressions. Better guidance may be developed in the future to help the animator sculpt intuitively meaningful but non-overlapping faces.
According to the present invention, a method for generating intuitive quasi-eigen faces includes steps of: a) representing a dynamic shape of a three dimensional face model with a vector (S1000); b) making a hand-generated expression basis (S2000); c) converting the hand-generated expression basis into a new expression basis (S3000), wherein the new expression basis is a set of quasi-eigen faces; and d) synthesizing expressions with the quasi-eigen faces (S4000) as shown in
The new expression basis includes a plurality of quasi-eigen faces, wherein the linear combinations of the quasi-eigen faces cover the motion capture data, and wherein each of the quasi-eigen faces resembles a corresponding element of the hand-generated expression basis.
The dynamic shape of a three dimensional face model is represented by a vector, v=v(t)=[v1T, . . . , vNT]T, where vi represents the 3D position of the i-th vertex and N is the number of vertices.
The vector v is represented by a triangular mesh including N vertices. The vector forms the facial mesh data.
The expression v is synthesized by
v=v0+w1He1H+ . . . +wnHenH,
where the neutral face, v0=[(v10)T, . . . , (vN0)T]T, and the weights, wiH, are given. The hand-generated expression basis is represented by ÊH={ê1H, . . . , ênH}, where n is the number of elements of the hand-generated expression basis and êiH is the geometry of the i-th element.
The set of displacements is represented by
EH={e1H, . . . , enH},
where eiH represents the displacement of êiH from the neutral face, eiH=êiH−v0.
The step (S3000) of converting the hand-generated expression basis into a new expression basis includes steps of: a) forming an approximate hyperplane out of the motion capture data or the facial mesh data (S3100); and b) identifying the orthogonal axes that span the hyperplane (S3200) as shown in
The step (S3200) of identifying the orthogonal axes that span the hyperplane includes a step (S3210) of using a principal component analysis (PCA) as shown in
The motion capture data are given in a 3N×L matrix Ξ=[v(1), . . . , v(L)], where N is the number of vertices of the mesh representing the motion capture data and L is the duration of the motion capture in number of frames.
The hyperplane is formed by the cloud of points obtained by plotting each of the expressions in Ξ in the 3N-dimensional space.
The step (S3200) of identifying the orthogonal axes that span the hyperplane includes steps of: a) taking the mean of v, μ=[μ1T, . . . , μNT]T, where the summation is taken over the entire motion capture data Ξ (S3220); b) obtaining a centered point cloud, {tilde over (D)}=[{tilde over (v)}(1)T, . . . , {tilde over (v)}(L)T]T, where {tilde over (v)}(i)=v(i)−μ (S3230); and c) constructing the covariance matrix C using
C=(1/L){tilde over (D)}T{tilde over (D)}
(S3240) as shown in
The C is a symmetric positive-definite matrix with positive eigenvalues λ1, . . . , λ3N in order of magnitude, with λ1 being the largest.
The method may further include a step (S3250) of obtaining the eigen faces from m eigenvectors EPCA={e1PCA, . . . , emPCA} corresponding to {λ1, . . . , λm}, the principal axes as shown in
The coverage of the principal axes is given by
(λ1+ . . . +λm)/(λ1+ . . . +λ3N).
The method may further include a step (S3260) of converting the hand-generated expression basis into the quasi-eigen basis, the set of quasi-eigen faces, with the eigenfaces ready as shown in
The step (S3260) of converting the hand-generated expression basis into the quasi-eigen basis includes steps of: a) computing
wijPCA-to-QE=ejPCA·(eiH−μ),
where i ranges over all the hand-generated elements, and j ranges over all the principal axes (S3261); and b) obtaining the quasi-eigen faces by
eiQE=wi1PCA-to-QEe1PCA+ . . . +wimPCA-to-QEemPCA
(S3262) as shown in
The method may further include a step (S3263) of synthesizing a general expression by the linear combination
v=v0+w1QEe1QE+ . . . +wnQEenQE.
The weights wiQE take on both positive and negative values when the eigenfaces are used.
The method may further include a step (S3264) of taking êiH to represent the full actuation of a single expression muscle with other muscles left relaxed for intrinsically ruling out the possibility of two hand-generated elements having almost identical shapes as shown in
The method may further include steps of: a) looking at the matrix WPCA-to-QE=(wijPCA-to-QE) (S3265); b) determining if ejPCA is missing in the quasi-eigen basis by testing if
is less than a threshold ε (S3266); c) augmenting the basis with ejPCA (S3267); d) notifying the animator regarding the missing eigenface ejPCA (S3268) as shown in
The method may further include a step (S3269) of retargeting the facial expressions by feeding a predetermined expression weight vector to a deformation basis as shown in
The predetermined expression weight vector is obtained by minimizing
Σj=1, . . . , N∥dj*−(w1QEe1jQE+ . . . +wnQEenjQE)∥2,
where dj* and eijQE are the displacements of the j-th vertex of v* and eiQE, respectively, from v0.
While the invention has been shown and described with reference to different embodiments thereof, it will be appreciated by those skilled in the art that variations in form, detail, compositions and operation may be made without departing from the spirit and scope of the invention as defined by the accompanying claims.
CHANG, Y.-J., AND EZZAT, T. 2005. Transferable videorealistic speech animation. In Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, 143-151.
PYUN, H., KIM, Y., CHAE, W., KANG, H. W., AND SHIN, S. Y. 2003. An example-based approach for facial expression cloning. In Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, 167-176.
Number | Name | Date | Kind |
---|---|---|---|
5844573 | Poggio et al. | Dec 1998 | A |
5880788 | Bregler | Mar 1999 | A |
6188776 | Covell et al. | Feb 2001 | B1 |
Number | Date | Country | |
---|---|---|---|
20070236501 A1 | Oct 2007 | US |