The present invention relates to animation systems, and especially to an avatar or puppet animation system driven by facial expression or body posture captured with a 3D camera.
In recent decades, avatars (especially faces) animated by facial expressions extracted from real-time input images (captured with a web camera) have been developed and published in the technical literature using various methods. The core technologies used for facial feature extraction are so-called ‘deformable shape extraction’ methods (for example, snake, AAM, CLM, etc.), which track real-time facial expressions to drive ‘avatars’ to act out or mimic the same expression. This type of facial feature extraction is based on data from 2D images and is easily affected by environmental or background noise (even in good lighting conditions), which distorts the extracted facial shape (especially the face border). As a result, the animated ‘avatar’ facial image displayed on the screen may look peculiar or unusual.
Recently, 3D cameras have become commercially available. Although a 3D camera can capture a depth map and a 2D color image in one snapshot, conventional usages mostly focus on the ‘3D’ aspect of the depth map to extract the necessary information. For example, the skeleton of a body (including the joint points of a hand, a leg, etc.) is extracted to drive a full body puppet to dance or to strike a ball with a bat in a sports gaming animation system.
Therefore, the problems described in
The present invention relates generally to an animation system integrating face and body tracking for head-only or full body puppet animation by making full use of the capabilities and benefits of a 3D camera. By using the 3D data in the depth map to confine a head region of a person as captured in the 2D image, together with the rest of the animation system and method of the present invention, the conventional problems as shown in
One aspect of the present invention is directed to a 3D camera human body and facial animation system which includes a 3D camera having an image sensor and a depth sensor with the same fixed focal length and image resolution, an equal field of view (FOV) and an aligned image center. The system software for the 3D camera human body and facial animation system includes a user GUI, an animation module and a tracking module. The system software provides the following functions: on-line tracking, via the user GUI and a command process and via tracking and animation integration; and off-line learning, via building an avatar (face, character) model and via tracking parameters learning.
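Because the image sensor and the depth sensor share the same resolution, field of view and image center, a pixel at a given coordinate in the depth map and the pixel at the same coordinate in the color image can be treated as viewing the same scene point. The following is only a minimal sketch of how a depth range might be used to confine a head region in the 2D image; the bounding-box approach, the function names `head_region_box` and `crop`, and the parameters `near_cm` and `far_cm` are illustrative assumptions and are not taken from the specification.

```python
from typing import List, Optional, Tuple

DepthMap = List[List[float]]      # depth per pixel (assumed to be in cm)
Box = Tuple[int, int, int, int]   # (left, top, right, bottom), inclusive


def head_region_box(depth_map: DepthMap,
                    near_cm: float,
                    far_cm: float) -> Optional[Box]:
    """Bounding box of all pixels whose depth lies in [near_cm, far_cm].

    Hypothetical helper: the depth range is assumed to isolate the head.
    Returns None when no pixel falls inside the range.
    """
    cols: List[int] = []
    rows: List[int] = []
    for v, row in enumerate(depth_map):
        for u, depth in enumerate(row):
            if near_cm <= depth <= far_cm:
                cols.append(u)
                rows.append(v)
    if not cols:
        return None
    return (min(cols), min(rows), max(cols), max(rows))


def crop(image: List[list], box: Box) -> List[list]:
    """Cut the same box out of the pixel-aligned color image."""
    left, top, right, bottom = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```

A face tracker could then be run on the cropped image only, so that background clutter outside the box does not influence the extracted face border.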
Another aspect of the present invention is directed to an algorithm of object detection for the on-line tracking function of the aforementioned system software for the 3D camera human body and facial animation system which includes the following steps: (1) detecting and assessing a distance of an object in a depth map from a 3D camera; (2) if the object is located near a predefined distance (see
Another aspect of the present invention is directed to another embodiment of a human body and facial animation system with one or more 3D cameras having one or more zoom lenses, which includes an image sensor with an adjustable focal length f′ and a depth sensor with a fixed focal length f.
Another aspect of the present invention is directed to yet another embodiment of a human body and facial animation system with a plurality of 3D cameras, which includes an image sensor with a fixed focal length f′ and another image sensor with a fixed focal length f. The two different focal lengths f and f′ are predesigned so that the system can operate at an extended distance while capturing both full body images and detailed facial expression images.
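When two sensors view the same scene with focal lengths f and f′, a point found in one image has to be located in the other. Under an ideal pinhole model, and assuming the two sensors share an optical axis, have the same pixel pitch, and have negligible baseline and lens distortion (none of which is stated in the specification), the offset of a point from the image center scales by f′/f. The sketch below illustrates that relation; the function name and its parameters are hypothetical.

```python
from typing import Tuple

Point = Tuple[float, float]


def map_point_between_focal_lengths(point: Point,
                                    center_f: Point,
                                    center_f_prime: Point,
                                    f: float,
                                    f_prime: float) -> Point:
    """Map a pixel from the focal-length-f image to the f' image.

    Assumes an ideal pinhole model with a shared optical axis and equal
    pixel pitch, so offsets from the image center scale by f_prime / f.
    """
    if f <= 0 or f_prime <= 0:
        raise ValueError("focal lengths must be positive")
    scale = f_prime / f
    u, v = point
    cx, cy = center_f
    cx2, cy2 = center_f_prime
    return (cx2 + scale * (u - cx), cy2 + scale * (v - cy))
```

For example, with f′ = 2f a head located 20 pixels right of center in the f image would appear 40 pixels right of center in the f′ image, which is why the longer focal length yields a more detailed face at the same distance.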
These and other features of the present invention will become readily apparent upon further review of the following specification and drawings.
The components in the drawings are not necessarily drawn to scale, the emphasis instead being placed upon clearly illustrating the principles of the present invention. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views.
FIGS. 1a˜1b show an example of a conventional 2D image face tracking algorithm having distorted facial features when being extracted from the facial image result of a person.
FIGS. 2a˜2b show an embodiment of a 3D camera animation system with a fixed focal length according to the present invention.
FIGS. 3a˜3b show an example of facial animation according to an embodiment of the present invention.
FIGS. 4a˜4b show an example of body animation according to an embodiment of the present invention.
a shows a zoomed face image of a person with the image sensor configured at focal length f′ according to a simulation for another embodiment of the present invention.
b shows a depth map of an avatar being overlaid on the depth map of
One embodiment of a 3D camera animation system 100 with a fixed focal length according to the present invention is shown in
On-line tracking via the following:
(1) the user GUI 40 and a command process, and
(2) tracking and animation integration.
Off-line learning via the following:
(1) an avatar (face, character) model building, and
(2) tracking parameters learning.
FIGS. 3a˜3b show an example of facial animation according to an embodiment of the present invention. In
FIGS. 4a˜4b show an example of body animation according to an embodiment of the present invention. Referring to
Referring to
Animation Step (a): At a Distance 1 of 60 cm˜100 cm as measured from the 3D camera to a User 1, a facial animation on the User 1 is performed.
Animation Step (b): At a Distance 2 of 200 cm˜300 cm as measured from the 3D camera to a User 2, a body animation on the User 2 is performed.
Animation Step (c): At another Distance m located between Distance 1 and Distance 2, a facial or hand gesture animation is performed on a User m.
An algorithm using data from the depth map can calculate a target object distance, such as Distance 1 for User 1, Distance 2 for User 2, or Distance m for User m, and automatically determine which of the animation steps (a), (b), and (c) mentioned above should be selected.
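The selection among animation steps (a), (b), and (c) can be sketched as a simple range test on the measured distance. This is only an illustrative sketch: the range values are those stated in steps (a) and (b), while the names `AnimationMode` and `select_animation_mode`, the treatment of the exact boundary values, and the choice to return `None` outside all ranges are assumptions.

```python
from enum import Enum
from typing import Optional


class AnimationMode(Enum):
    FACE = "a"           # Animation Step (a): facial animation
    BODY = "b"           # Animation Step (b): body animation
    FACE_OR_HAND = "c"   # Animation Step (c): facial or hand gesture animation


# Ranges in centimetres, as given in Animation Steps (a) and (b).
DISTANCE_1_CM = (60.0, 100.0)
DISTANCE_2_CM = (200.0, 300.0)


def select_animation_mode(distance_cm: float) -> Optional[AnimationMode]:
    """Pick an animation step from the camera-to-object distance."""
    near_1, far_1 = DISTANCE_1_CM
    near_2, far_2 = DISTANCE_2_CM
    if near_1 <= distance_cm <= far_1:
        return AnimationMode.FACE
    if near_2 <= distance_cm <= far_2:
        return AnimationMode.BODY
    if far_1 < distance_cm < near_2:
        return AnimationMode.FACE_OR_HAND
    # Assumption: closer than Distance 1 or farther than Distance 2
    # selects no animation step.
    return None
```

Under these assumptions, a user standing 80 cm from the camera would be handled by step (a), one at 150 cm by step (c), and one at 250 cm by step (b).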
A plurality of resource files, built during off-line learning via avatar (face, character) model building and tracking parameters learning, are loaded in step (S4).
One color image (Img) and one depth map (Dm) are respectively captured by the image sensor and the depth sensor of the 3D camera of the 3D camera human body and facial animation system in step (S6).
One object is detected in the depth map captured by the 3D camera, and the distance from the 3D camera to the object is determined in step (S10).
If the distance from the 3D camera to the object is assessed to be about Distance 1 and the object is accompanied by a very deep background scene, the object is recognized and identified as a face, and a face tracking procedure is performed in step (S20) to obtain a face shape for facial animation of the avatar in step (S25).
If the distance from the 3D camera to the object is assessed to be about Distance 2 and the object is assessed to resemble a person (human being), the object is recognized and identified as a body, and a body tracking procedure is performed in step (S30) to obtain the body shape for body animation of the avatar in step (S35).
If the distance from the 3D camera to the object is assessed to be between Distance 1 and Distance 2, a face and hand gesture detection procedure is performed in step (S40) to obtain both the face shape and the hand shape features for facial/gesture animation of the avatar in step (S45).
Upon successive iterations of the object detection algorithm for the on-line tracking function for the 3D camera human body and facial animation system, a user can choose to terminate the algorithm based upon personal preference and needs in the step (S60).
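The specification does not state how the object distance in step (S10) or the "very deep background" condition in step (S20) is computed. One possible sketch, under stated assumptions, takes the nearest valid depth reading as belonging to the object, uses the median of readings within a small band of it as the object distance, and compares that with the median of the remaining readings. The zero-means-invalid convention, the parameters `band_cm` and `min_gap_cm`, and both function names are hypothetical.

```python
from statistics import median
from typing import List, Optional, Tuple

DepthMap = List[List[float]]  # depth per pixel in cm; 0 = no reading (assumed)


def object_and_background_distance(
        depth_map: DepthMap,
        band_cm: float = 30.0) -> Optional[Tuple[float, float]]:
    """Return (object_cm, background_cm), or None if no valid pixel exists.

    Pixels within band_cm of the nearest valid reading are treated as the
    object; all other valid pixels are treated as background.
    """
    valid = [d for row in depth_map for d in row if d > 0]
    if not valid:
        return None
    limit = min(valid) + band_cm
    object_pixels = [d for d in valid if d <= limit]
    background_pixels = [d for d in valid if d > limit]
    # With no background readings, the background is treated as infinitely far.
    background = median(background_pixels) if background_pixels else float("inf")
    return median(object_pixels), background


def has_deep_background(object_cm: float,
                        background_cm: float,
                        min_gap_cm: float = 100.0) -> bool:
    """True when the background lies at least min_gap_cm behind the object."""
    return background_cm - object_cm >= min_gap_cm
```

Using medians rather than single readings makes the estimate less sensitive to isolated noisy depth pixels, which is one reason this variant was chosen for the sketch.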
Moreover, according to another embodiment of a 3D camera human body and facial animation system, the 3D animation system includes a zoom lens 3D camera. The zoom lens 3D camera includes an image sensor with an adjustable focal length and a depth sensor with a fixed focal length. This embodiment provides a strategy in which the distance (D) of the object (O), located far away from the zoom lens 3D camera, is kept constant so that full body tracking and detailed face tracking are obtained simultaneously. Referring to
According to yet another embodiment of the present invention, a 3D human body and facial animation system is provided that includes a 3D camera having two image sensors, each with a different fixed focal length: one image sensor has a fixed focal length f, and the other image sensor has a fixed focal length f′. Referring to
The advantages and benefits of the 3D camera human body and facial animation system, its system software, and the object detection algorithm for the on-line tracking function of that system software, according to the embodiments of the present invention, can be seen in a simulation example shown in
Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the present invention is not limited to those precise embodiments, and that various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the scope or spirit of the present invention. All such changes and modifications are intended to be included within the scope of the present invention as set forth in the appended claims.
Number | Date | Country
---|---|---
61550928 | Oct 2011 | US