This non-provisional application claims priority under 35 U.S.C. § 119(a) on Patent Application No(s). 202210638227.3 filed in China on Jun. 7, 2022, the entire contents of which are hereby incorporated by reference.
This disclosure relates to transition motions, and more particularly to a method of automatically classifying transition motions.
Simulated virtual characters have been widely incorporated in many industries such as robotics, movies, and games. Although each of these industries may require different properties of the simulated character, one that remains crucial for all is the character's ability to perform many motions.
A versatile character is expected to transition from one motion to another, e.g., from canter to jump. In addition, it is expected to perform many different “styles” of transition. For example, when tackling an obstacle, the character might need to jump over the obstacle to reach its goal. This single example already introduces many problem variations, since the character may need to jump over a tall hurdle or a wide gap. Here, a generic “Jump” solution would be insufficient since the two problems need to be solved differently.
However, generating transition motions for a virtual character is not an easy task. The traditional method is manual keyframe generation followed by motion interpolation. In media production, a key frame, or keyframe, is a location on a timeline that marks the beginning or end of a transition. It holds special information that defines where the transition should start or stop. The intermediate frames are interpolated over time between those definitions to create the illusion of motion. To generate different “styles” of transition motions, different sets of keyframes need to be generated, which increases both labor cost and time cost.
Accordingly, the present disclosure proposes a control mechanism to integrate user-preferences seamlessly and robustly over the transition motion of virtual characters. Such a reliable control allows the virtual character to efficiently tackle many kinds of obstacles.
According to an embodiment of the present disclosure, a method of automatically classifying transition motions includes the following steps performed by a computing device: obtaining a plurality of transition motions, wherein each of the plurality of transition motions is associated with a source motion, a destination motion, and a transition mechanism converting the source motion into the destination motion; extracting a property vector from each of the plurality of transition motions and thereby generating a plurality of property vectors, wherein each of the plurality of property vectors comprises a plurality of transition properties; and performing a classifying algorithm according to the plurality of property vectors to generate a plurality of transition types.
In view of the above, the present disclosure has the following contributions or effects: (1) the user is allowed to select different styles of transition motions, and (2) the different styles of transition motions are generated automatically.
The present disclosure will become more fully understood from the detailed description given hereinbelow and the accompanying drawings which are given by way of illustration only and thus are not limitative of the present disclosure and wherein:
In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the disclosed embodiments. According to the description, claims and the drawings disclosed in the specification, one skilled in the art may easily understand the concepts and features of the present invention. The following embodiments further illustrate various aspects of the present invention, but are not meant to limit the scope of the present invention.
In an embodiment of step P1, the transition motion is generated by a complex motion controller. This complex motion controller is configured to control a robotic quadruped animal in the real world or a virtual quadruped animal in a virtual world. The transition mechanism is also called a “transition motion tensor” (TMT). The following paragraphs supplement the establishment of the complex motion controller, as well as the details of the TMT.
In another embodiment of step P1, the transition motion is a plurality of sets of keyframes. Each set of keyframes corresponds to a transition style. Each keyframe is equivalent to a pose feature. For example, the information contained in each keyframe may be described by a COM (center of mass) height of the character, a forward acceleration, a foot contact pattern, or a combination of the above. The COM height is a three-dimensional vector, such as [1.2 m, 0.9 m, 3.0 m]. The forward acceleration is also a three-dimensional vector, such as [121 cm/s², −89 cm/s², 78 cm/s²]. The foot contact pattern may be, for example, a four-dimensional vector where each element is a Boolean denoting the contact state of one leg at a given frame. For instance, (0, 0, 1, 1) means that two legs of the quadruped character touch the ground at that frame.
In step P2, the computing device 50 extracts a property vector from each transition motion, thereby generating a plurality of property vectors. In an embodiment, the computing device 50 may collect data of property vectors through multiple sensors disposed on a real robotic animal. Each property vector includes a plurality of transition properties. In an embodiment, these transition properties include the COM height of the character, the forward acceleration, and the foot contact pattern. The property vector is a high-dimensional vector. Based on the type of transition motion, the user can specify which transition properties the property vector should use. For example, the transition motion “canter-jump” is suitably described by the COM height, while the transition motion “canter-stand” is suitably described by the forward acceleration.
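As a non-limiting illustration, the extraction of step P2 from keyframe-based transition motions may be sketched as follows. The class name `PoseFeature`, its field names, the function `extract_property_vector`, and the second keyframe's values are assumptions introduced only for this sketch; the disclosure does not prescribe a data layout.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class PoseFeature:
    """Pose feature of one keyframe (field names are illustrative assumptions)."""
    com_height: Tuple[float, float, float]     # metres
    forward_accel: Tuple[float, float, float]  # cm/s^2
    foot_contact: Tuple[int, int, int, int]    # 1 = this leg touches the ground


def extract_property_vector(frames: Sequence[PoseFeature],
                            properties: Sequence[str]) -> List[float]:
    """Concatenate the user-selected properties of every keyframe into one flat vector."""
    vector: List[float] = []
    for frame in frames:
        for name in properties:
            vector.extend(float(x) for x in getattr(frame, name))
    return vector


# Two keyframes; the first reuses the example values from the description,
# the second is made up for illustration.
frames = [
    PoseFeature((1.2, 0.9, 3.0), (121.0, -89.0, 78.0), (0, 0, 1, 1)),
    PoseFeature((1.3, 1.1, 3.1), (100.0, -50.0, 60.0), (1, 1, 0, 0)),
]

# "canter-jump": the user selects only the COM height.
canter_jump_vector = extract_property_vector(frames, ["com_height"])

# All three properties: (3 + 3 + 4) values per keyframe.
full_vector = extract_property_vector(
    frames, ["com_height", "forward_accel", "foot_contact"])
```

Selecting the properties by name mirrors the user's ability to specify which transition properties the property vector uses for a given type of transition motion.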
In step P3, the computing device 50 performs a classifying algorithm to generate a plurality of transition types. In an embodiment, the classifying algorithm is hierarchical clustering. In another embodiment, the classifying algorithm is the K-means algorithm, and the value of K is determined by the elbow method. The value of K represents the number of types into which the transition motions are classified.
In the embodiment using the K-means algorithm, the details are described as follows. The first step is to randomly select K data points as center points from the data points corresponding to the plurality of transition properties (such as the COM height). The second step is to compute the distance between every data point and every center point, and to assign each data point to one of the K classes according to the nearest center point. For every class, the center point is then re-determined by computing the average of all data points in that class. The second step and the re-determination of the center points are repeated for an assigned number of iterations, or until the center points no longer change significantly between iterations.
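As a non-limiting illustration, the K-means steps described above may be sketched in plain Python as follows. The function names, the tolerance `tol`, the fixed random seed, and the sample heights are assumptions made for this sketch.

```python
import random
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points: Sequence[Sequence[float]], k: int, max_iters: int = 100,
            tol: float = 1e-9, seed: int = 0) -> Tuple[List[List[float]], List[int]]:
    rng = random.Random(seed)
    # Step 1: randomly select K data points as the initial center points.
    centers = [list(p) for p in rng.sample(list(points), k)]
    labels = [0] * len(points)
    for _ in range(max_iters):
        # Step 2: assign every data point to the class of its nearest center.
        labels = [min(range(k), key=lambda c: squared_distance(p, centers[c]))
                  for p in points]
        # Step 3: re-determine each center as the average of its class members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                dim = len(members[0])
                new_centers.append(
                    [sum(m[d] for m in members) / len(members) for d in range(dim)])
            else:
                new_centers.append(centers[c])  # keep the center of an empty class
        shift = max(squared_distance(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # the centers no longer change significantly
            break
    return centers, labels


# Illustrative one-dimensional property vectors (e.g. a COM height in metres).
heights = [[0.50], [0.52], [0.55], [1.10], [1.15], [1.20]]
centers, labels = k_means(heights, k=2)
```

With these sample values, the low heights and the high heights end up in two separate classes, which would correspond to two transition types.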
The following paragraphs further describe the establishment of the complex motion controller and detailed content of TMT.
The complex motion controller described in the following embodiment may be used to control a virtual character in movies or games, or a robot in the real world. Please note that the term “controller” throughout the present disclosure is better understood as “a control module” or “a control unit” that performs data processing, executes commands, executes algorithms, etc., although the present disclosure does not specifically exclude the term representing a physical element.
Specifically, regarding “obtaining a source controller and a destination controller” in step S1, the source controller generates a source motion according to a current state of a character and a control objective, and the destination controller generates a destination motion according to another current state of the character and another control objective. For example, the source motion may be “walk”, the destination motion may be “run”, and the complex motion may be a transition from walk to run. The input of the control objective comprises at least one physical control parameter, such as a running speed in meters per second or a gravitational acceleration when falling from higher ground.
The source controller and the destination controller are both template controllers, and details of the template controller are described as follows.
To enable a character to perform a wide array of motions in a simulated physical environment, it is common to train a physics-based controller to accommodate a large motion vocabulary (various types of motions). However, using a single controller to learn the entire vocabulary may require substantial computation as the learning process gets unbearably intricate with many motions. To avoid such a problem, the present disclosure adopts the explicit controller assignment strategy, where each motion is assigned to a single physics-based controller, further referred to as the template controller. This strategy allows for confining the training complexity within each controller, thereby making the process more tractable and independent.
Before training the template controllers, the present disclosure collects reference motion clips corresponding to each motion using a kinematic controller. Within each motion clip, the kinematic character performs the respective motion repeatedly with slight variations over the speed, heading, and height control on each repetition. Then, in order to produce life-like movements in a dynamic environment, the present disclosure trains the template controllers using deep reinforcement learning (DRL), where at a given time step t, the template controller π(at|st, ct) outputs the actions at∈A, given the current state of the character st∈S, and control objective ct; A and S denote the domain of action and current state respectively. The current state st stores the character's position, rotation, velocity, and angular velocity. The high-level control objective includes ct=(σ, θ, ĥ), where σ, θ, and ĥ denote the target movement speed in meters per second, target heading in radians, and target center-of-mass (COM) height in meters, respectively.
One or more embodiments of the present disclosure initialize the template controller by performing imitation learning over the assigned reference motion clip. In the process, the controller's goal is to match the joint positions of the kinematic character and simulated character, using two consecutive frames of the kinematic character as the low-level control objective. Once converged, the present disclosure further fine-tunes the template controller to follow high-level motion control directives, including speed, heading, and COM height. Note that since the values of the control objectives come from the reference motion clip, it is not required to specify all values all the time. For instance, the target COM height can be left unchanged when performing motions such as “Trot” or “Canter,” but is required to control the COM height of the “Jump” motion.
To ensure the robustness of each template controller, the present disclosure introduces external perturbations during the training process, such as throwing various-sized boxes at the character from random directions as shown in
The present disclosure trains the template controllers using proximal policy optimization (PPO), generalized advantage estimator GAE(λ), and multi-step returns TD(λ). To increase sampling efficiency and prevent the controller from getting stuck in bad local optima, the present disclosure adopts the early termination and reference state initialization as proposed by “Xue Bin Peng, Pieter Abbeel, Sergey Levine, and Michiel van de Panne. 2018. DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills. ACM Transactions on Graphics (TOG) 37, 4 (2018), 143.”.
The present disclosure represents each template controller as a hierarchical policy with lower-level controllers called the primitives. Each of the template controllers uses four primitives except for the Jump motion, which requires eight primitives to account for its additional motion complexity.
The template controllers only allow the character to perform a specific motion, limiting the character's ability to solve complex tasks that may require multiple motions to work coherently. For instance, to jump over a wide hole and quickly reach the destination, the character needs to run fast enough, followed by jumping and running towards the destination. However, knowing when to perform the transition between running and jumping is not a trivial task since the character's state directly affects the transition outcome. As a result, naively switching between the controllers may yield awkward or even unsuccessful transitions. Therefore, the present disclosure proposes a data-driven transition tensor in step S2 that guides the character in successfully transitioning from one motion to another by carefully examining the critical timing of the transitions.
Regarding “determining a transition tensor between the source controller and the destination controller” in step S2, the transition tensor (hereinafter referred to as the tensor) has a plurality of indices, and one of these indices corresponds to a plurality of phases of the source motion. For example, the source motion is “raising the left hand”, which includes multiple phases such as the rotation of the elbow joint at 0 degrees, 1 degree, 2 degrees, 3 degrees . . . up to the upper limit of the rotatable angle.
Given that the character is in a particular state with the source controller, as the controller is switched from the source to the destination controller, the destination controller may have never seen this state. While it tries to recover from this unseen state, it consequently generates a new transition motion, which neither exists in the source controller nor the destination controller. That is, the transitions are generated by switching between the pair of controllers.
However, naively switching the controllers yields unstable transitions since the motions vary in difficulty. It is possible to improve the switching process by assigning control objectives compatible with both the source and the destination motions, such as interpolating the movement speed to transition between different locomotion gaits. However, this strategy is ineffective for motions that require more delicate and precise foot-fall timing, which is best described with the character's phase label. For instance, the success of a transition between “Canter” and “Jump” relies heavily on the character's foot touching the ground. Therefore, transitioning from “Jump” to “Canter” when the character is mid-air may cause difficulties for the destination controller, leading to a longer time to stabilize, exerting too much force, deviating from the control objectives, or even worse, causing the character to fall.
To describe the likelihood of a successful transition between source and destination motions, the present disclosure formulates a 4-dimensional tensor T to record the outcomes of the transitions, as shown in Equation 1.
T_{m,ϕ,n,ω} = (η, Δt, e, α) (Equation 1)
The four indices of the tensor T include the source motion m∈V and the destination motion n∈V, wherein V denotes the motion vocabulary, as well as the source phase ϕ∈[0, 1) and the destination phase ω∈[0, 1). Note that each of the components (η, Δt, e, α) depends on w=(m, ϕ, n, ω), e.g., η≡ηw.
Each element Tw of the tensor is a 4-dimensional vector representing the outcome of a transition at w. The first element η of the vector records the alive state after the transition, in which η=1 denotes a successful transition where the character's head, torso, and back do not touch the ground, and η=0 denotes that the character falls. The second element Δt denotes the duration of the transition, which begins with the switching process and ends when the destination controller stabilizes. The third element e represents the effort of the transition, for example, a summation of the torques of all joints during the transition, as shown in Equation 2.
The PD-controller's torque of joint j∈J at a given time t is denoted as τtj, wherein J denotes the number of joints of the character, and j denotes the joint label. To measure how well the character follows the control objective, the present disclosure defines the speed, heading, and height rewards respectively as Equations 3, 4, and 5.
Here, ∥.∥ denotes the L2-norm, vc and u=(cos(θ), −sin(θ)) respectively denote the character's COM velocity and the target heading projected onto the 2-dimensional plane of motion, and h, ĥ respectively denote the character's COM heights of the source motion and destination motion. The present disclosure then defines the control reward as Equation 6.
Finally, the fourth element α of the tensor T denotes the control accuracy of the character, measured by the sum of control rewards between the two stable states of the destination controller. The present disclosure measures the control accuracy post-transition under the assumption that there are no data for controlling the accuracy during the transition. The control accuracy is defined as Equation 7.
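As a non-limiting illustration, one possible in-memory layout of the tensor T of Equation 1 is sketched below. The discretization of the phases into ten bins, the three-motion vocabulary, the helper names, and the recorded numbers are assumptions made for this sketch and are not taken from the disclosure.

```python
import numpy as np

VOCABULARY = ["Trot", "Canter", "Jump"]  # illustrative motion vocabulary V
NUM_PHASE_BINS = 10                      # assumed discretisation of the phase range [0, 1)
ALIVE, DURATION, EFFORT, ACCURACY = range(4)  # positions of (eta, delta t, e, alpha)


def phase_to_bin(phase: float, num_bins: int = NUM_PHASE_BINS) -> int:
    """Map a phase in [0, 1) to the index of its bin."""
    return min(int(phase * num_bins), num_bins - 1)


M = len(VOCABULARY)
# Four indices (m, phi, n, omega) plus one trailing axis for the 4-vector outcome.
# Cells that have not been sampled yet hold NaN.
tensor = np.full((M, NUM_PHASE_BINS, M, NUM_PHASE_BINS, 4), np.nan)


def record_outcome(tensor, m, phi, n, omega, alive, duration, effort, accuracy):
    """Store one transition outcome at index w = (m, phi, n, omega)."""
    w = (VOCABULARY.index(m), phase_to_bin(phi),
         VOCABULARY.index(n), phase_to_bin(omega))
    tensor[w] = (float(alive), duration, effort, accuracy)
    return w


# Made-up outcome of one "Canter" -> "Jump" transition.
w = record_outcome(tensor, "Canter", 0.25, "Jump", 0.85,
                   alive=1, duration=0.6, effort=42.0, accuracy=0.9)
```

Keeping the four outcome components on a trailing axis lets each component be read as a separate 4-dimensional array indexed by w, e.g. `tensor[..., ALIVE]`.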
Before computing the four outcomes of each transition tensor in step S3, the present disclosure evaluates the template controller through robustness tests, which involve, for example, throwing boxes with varying sizes and densities from a random direction every 0.1 seconds. The template controller passes the robustness test when the character survives for at least 10 seconds.
Regarding “calculating a plurality of transition outcomes and recording these transition outcomes according to the indices” in step S3, the computing device uses the Monte Carlo method to calculate a plurality of outcomes, wherein each outcome comprises the alive state η, the transition duration Δt, the effort e, and the control accuracy α.
To calculate the likelihood of the transitions, the present disclosure populates the tensor by recording millions of transition samples using the Monte Carlo method in a physics-enabled environment. Each pair-wise transition is sampled uniformly across phases of the source motion and destination motion.
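As a non-limiting illustration, the uniform Monte Carlo sampling described above may be sketched as follows. The function `simulate_transition` is a hypothetical stand-in that returns random numbers in place of a physics rollout, and the bin count, the seed, and the small sample count are assumptions made for this sketch.

```python
import random
from collections import defaultdict


def simulate_transition(m, phi, n, omega, rng):
    """Hypothetical stand-in for a physics rollout of one controller switch.

    A real implementation would run the simulator and measure the outcome;
    here random placeholders are returned as (alive, duration, effort, accuracy).
    """
    alive = 1 if rng.random() < 0.8 else 0
    return (alive, rng.uniform(0.2, 2.0), rng.uniform(10.0, 100.0), rng.random())


def populate(vocabulary, num_samples, num_bins=10, seed=0):
    """Sample every ordered pair of distinct motions uniformly over both phases."""
    rng = random.Random(seed)
    samples = defaultdict(list)
    for m in vocabulary:
        for n in vocabulary:
            if m == n:
                continue
            for _ in range(num_samples):
                phi, omega = rng.random(), rng.random()  # uniform over [0, 1)
                w = (m, min(int(phi * num_bins), num_bins - 1),
                     n, min(int(omega * num_bins), num_bins - 1))
                samples[w].append(simulate_transition(m, phi, n, omega, rng))
    return samples


# 50 samples per motion pair keeps the sketch fast; the description above
# refers to millions of transition samples.
samples = populate(["Canter", "Jump"], num_samples=50)
```

Grouping the samples by index w keeps all rollouts that fall into the same phase bins together, so that per-index statistics can be derived afterwards.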
Regarding “calculating a plurality of transition qualities according to the transition outcomes” in step S4, each transition quality comprises a stability and an outcome value, and the outcome value is calculated by the computing device according to the alive state, transition duration, effort, and control accuracy.
As shown in Equation 1, with the 4-dimensional tensor describing the likelihood of transitions between source and destination controllers, the present disclosure unifies the template controllers, allowing the character to perform all motions in the vocabulary V. Users can utilize the unified controller to steer the character when solving more complex tasks. To achieve this, the present disclosure starts with consolidating the four transition outcomes into a single number as Equation 8, wherein Γw denotes the outcome over the index w.
To measure the transition stability, the present disclosure wishes to further ensure consistency of outcomes and alive probability at neighboring samples. For this purpose, the present disclosure first defines a local neighborhood Γw(δ) which is a 2-dimensional sub-tensor of Γ near w, and w∈{m, ϕ±δ, n, ω±δ}, wherein ϕ±δ denotes a plurality of neighboring reference phases of the source motion m over phase ϕ, and ω±δ denotes a plurality of neighboring reference phases of the destination motion n over phase ω.
Then, the present disclosure can calculate the consistency of the transition outcome ζw(δ) as the variance of all samples in Γw(δ).
Similarly, the present disclosure computes the alive probability of a transition ηw(δ) as the proportion of samples within Tw(δ) having η=1.
The final form of the transition's stability is shown as Equation 9, wherein β=0.015.
ψ_w(δ) = η_w(δ) × exp(−β ζ_w(δ)) (Equation 9)
Combining the transition stability and the outcome values, the quality of a transition at w is shown as Equation 10.
Q_w = ψ_w(δ) × Γ_w (Equation 10)
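As a non-limiting illustration, Equations 9 and 10 may be evaluated for one pair of source and destination motions as sketched below. The sketch assumes that the outcome values Γ and the alive states η are stored as two-dimensional grids with one value per pair of phase bins, that δ is counted in bins, that the phases wrap around cyclically, and that the variance is the population variance; none of these choices is prescribed by the disclosure.

```python
import numpy as np

BETA = 0.015  # value of beta stated for Equation 9


def neighbourhood(grid: np.ndarray, i: int, j: int, delta: int) -> np.ndarray:
    """Window of (2*delta+1) x (2*delta+1) bins around (i, j), wrapping cyclically."""
    rows = np.arange(i - delta, i + delta + 1) % grid.shape[0]
    cols = np.arange(j - delta, j + delta + 1) % grid.shape[1]
    return grid[np.ix_(rows, cols)]


def transition_quality(outcome: np.ndarray, alive: np.ndarray,
                       i: int, j: int, delta: int = 1, beta: float = BETA) -> float:
    zeta = neighbourhood(outcome, i, j, delta).var()  # consistency: variance of outcomes
    eta = neighbourhood(alive, i, j, delta).mean()    # alive probability: share with eta = 1
    psi = eta * np.exp(-beta * zeta)                  # Equation 9: stability
    return float(psi * outcome[i, j])                 # Equation 10: quality


# Illustrative grids over (source phase bin, destination phase bin).
outcome = np.full((4, 4), 2.0)  # identical outcome values, so the variance is 0
alive = np.ones((4, 4))
alive[0, 0] = 0.0               # one failed transition

q_near_failure = transition_quality(outcome, alive, 1, 1)  # window contains the failure
q_far_from_failure = transition_quality(outcome, alive, 2, 2)
```

In this toy example the transition next to the failed sample receives a lower quality (8/9 of the outcome value) than the transition whose whole neighborhood is alive, which is the intended effect of the stability term.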
Regarding “searching for an optimal transition quality in the plurality of transition qualities for establishing a complex motion controller” in step S5, the complex motion controller is used to generate a complex motion corresponding to one of the plurality of phases of the source motion.
To generate the transition from the source motion to the destination motion, the computing device needs to navigate through the tensor and search for the best transition. Given the destination motion label n, and the information regarding the source motion m and phase ϕ, the computing device can find the best transition by looking at the sub-tensor Qm,ϕ±ϵ,n, where ϵ is an adjustable parameter for the search space, and locate the destination phase with the highest quality value, as shown in
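As a non-limiting illustration, the search over the sub-tensor Qm,ϕ±ϵ,n described above may be sketched as follows. The sketch assumes that the qualities are stored in an array indexed by (source motion, source phase bin, destination motion, destination phase bin), that ϵ is counted in bins, and that the source phase wraps around cyclically; the function name and the sample values are hypothetical.

```python
import numpy as np


def best_transition(quality: np.ndarray, m: int, phi_bin: int, n: int, epsilon: int):
    """Return (source phase bin, destination phase bin, quality) of the best transition."""
    num_bins = quality.shape[1]
    # Candidate source phases: phi - epsilon ... phi + epsilon (cyclic).
    source_bins = np.arange(phi_bin - epsilon, phi_bin + epsilon + 1) % num_bins
    # Sub-tensor of shape (2*epsilon+1, number of destination phase bins).
    sub = quality[m, source_bins, n, :]
    r, c = np.unravel_index(np.argmax(sub), sub.shape)
    return int(source_bins[r]), int(c), float(sub[r, c])


# Illustrative quality array for two motions and eight phase bins.
quality = np.zeros((2, 8, 2, 8))
quality[0, 3, 1, 5] = 0.9  # inside the search window used below
quality[0, 6, 1, 2] = 5.0  # higher quality, but outside the search window

# Source motion 0 at phase bin 2, destination motion 1, epsilon = 1 bin.
result = best_transition(quality, m=0, phi_bin=2, n=1, epsilon=1)
```

A larger ϵ widens the search window and may find a higher-quality transition, at the cost of a switch that happens further away from the current source phase.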
In general, please refer to
The above embodiments describe the method for establishing a complex motion controller. This method enables the character to acquire new motions efficiently and robustly without modifying existing motions. Given several physics-based controllers specializing in different motions, the tensor proposed by an embodiment of the present disclosure serves as a guideline for switching between controllers. By querying the tensor for the best transitions, the present disclosure can create a unified controller capable of producing novel transitions with various behaviors, such as slowing down before higher jumps or jumping immediately for better responsiveness. The present disclosure can be applied to both quadrupeds and bipeds, supports quantitative and qualitative evaluations of transition quality, and is capable of tackling complex motion planning problems while following user control directives.
In view of the above, the present disclosure has the following contributions or effects: (1) the user is allowed to select different styles of transition motions, and (2) the different styles of transition motions are generated automatically.
Number | Date | Country | Kind |
---|---|---|---|
202210638227.3 | Jun 2022 | CN | national |