SYSTEMS AND METHODS FOR CONTROLLING AN UNDERACTUATED MECHANICAL SYSTEM WITH MULTIPLE DEGREES OF FREEDOM

Information

  • Patent Application
  • 20240393751
  • Publication Number
    20240393751
  • Date Filed
    July 13, 2024
    5 months ago
  • Date Published
    November 28, 2024
    a month ago
Abstract
A method for controlling a mechanical system utilizes an energy-based inverse dynamics model trained to map dynamic states of the mechanical system to corresponding torques for a plurality of actuators of the mechanical system. The method comprises collecting a feedback signal including current states of dynamics of the mechanical system. The method further comprises processing the current states of dynamics with the energy-based inverse dynamics model to produce values of the torques for the plurality of actuators and values of the potential and kinetic energy of the mechanical system. The method further comprises controlling the mechanical system based on the produced values of the torques for the plurality of actuators of the mechanical system and the values of the potential and kinetic energy of the mechanical system.
Description
TECHNICAL FIELD

The present disclosure relates generally to control systems for mechanical manipulators, and more particularly to a system and a method for controlling an underactuated mechanical system having multiple degrees of freedom and different kinds of actuators.


BACKGROUND

A mechanical system, e.g., a robotic manipulator, is configured to track a reference trajectory for performing a task. The task, for example, corresponds to moving an object to a target position, or an assembly operation. To control the mechanical system to track the reference trajectory, a model of dynamics of the mechanical system is utilized. The model of dynamics is a mathematical model that includes equations which define dynamics of the mechanical system. Some approaches use a model of inverse dynamics of the mechanical system to control the mechanical system as the model of inverse dynamics enables controlling of the mechanical system. The model of inverse dynamics expresses joint torques as a function of joint positions, velocities and accelerations. It is desired to formulate an accurate model of the inverse dynamics for controlling the mechanical system.


Under-actuated robots (UR) are an important class of mechanical systems. UR are mechanical systems characterized by fewer control inputs than degrees of freedom (DOF). Such systems are ubiquitous in robotics: examples are manipulators with passive joints, autonomous bicycles and motorcycles, bipedal robots, and most of the aerospace and marine vehicles. For instance, several Reinforcement Learning algorithms have been applied to the UR control problem. These algorithms aim at automatically learning a control law. However, such methods typically require a huge amount of interactions with the system and do not provide any performance guarantees. Even control strategies which combine model learning with classic model-based control methods rely on the inverse dynamics model, which relates torques to the robot trajectories. However, learning the inverse dynamics model is a challenging and to a great extent unexplored task for control of UR systems.


SUMMARY

Some example embodiments are based on the realization that for effective control of mechanical systems such as robots, an accurate model of the dynamics of such a mechanical system is required. The model of dynamics is a mathematical model including equations which define dynamics of the mechanical system. The model of dynamics may be a model of forward dynamics that connects control commands and current states of the different actuators to transitioned states of the different actuators achieved by executing the control commands. Alternately, the model of dynamics may be a model of inverse dynamics that maps the current states and the transitioned states of the different actuators to corresponding torques for the different actuators. For efficient control of such systems, one approach is to use the model of inverse dynamics of the mechanical system since the model of inverse dynamics enables controlling of the mechanical system. The model of inverse dynamics expresses joint torques as a function of joint positions, velocities and accelerations. It is an object of some embodiments to control a mechanical system using a model of inverse dynamics. In this regard, it is also an objective of some embodiments to derive an accurate inverse dynamics model of the mechanical system. The mechanical system may be a robotic system having different actuators and multiple degrees of freedom to track a reference trajectory for performing a task. For example, the robotic system may be a manipulator configured to track the reference trajectory for performing a task of moving an object to a target location. The robotic manipulator includes joints and links. Each joint is actuated by an actuator such as an electric motor. A motion given by the actuator makes the link attached to the joint move.


Generally, an accurate physical model of the inverse dynamics of such mechanical systems is difficult and time-consuming to generate. Conventional model-based approaches which derive parametric models directly from first principles of physics are often limited in performance by both the presence of parametric uncertainty and the inability to describe certain complex dynamics typical of real systems, such as motor friction or joint elasticity. Accordingly, it is advantageous to utilize machine learning-based approaches for deriving inverse dynamics models of such systems. Some machine learning-based approaches in this regard are mainly based on deep neural networks (NN) and Gaussian Process Regression (GPR). Some embodiments realized that in this context, both gray-box and black-box approaches may be suitable. Within gray-box techniques, a model-based component encoding the known dynamics may be combined with a data-driven one, which can compensate for modeling errors and unknown dynamical effects.


However, it is a realization of some embodiments that the performance of these methods strongly depends on the effectiveness of the model-based component, so they still require to derive sufficiently accurate physical models, which might be particularly time-consuming and complex if some parameters are unknown or not known precisely. In contrast, pure black-box methods learn inverse dynamics models directly from experimental data, without requiring deep knowledge of the underlying physical system. Despite their ability to approximate even complex non-linear dynamics, pure black-box methods typically suffer from low data efficiency and poor generalization properties: learned models require a large amount of samples to be trained and extrapolate only within a neighborhood of the training trajectories.


Some embodiments realized that to overcome the aforementioned limitations of black-box techniques in the context of NN and for the GPR framework, a promising class of solutions is represented by Physics Informed Learning (PIL), which proposes to embed insights from physics as a prior in black-box models. Instead of learning the inverse dynamics in an unstructured manner, which makes the problem unnecessarily complex, some embodiments embed physical properties in the model to improve generalization and data efficiency. Accordingly, some embodiments provide a PIL model for inverse dynamics identification of mechanical systems based on GPR. A standard approach for applying GPR to the inverse dynamics identification involves modeling each joint torque directly with a distinct Gaussian Process (GP), assuming the GPs independent of one another given the current joint position, velocity, and acceleration. However, some embodiments realized that such single-output approaches ignore the correlations between the different joint torques imposed by the Lagrangian equations, which in turn limits generalization and data efficiency.


Inspired from these realizations, some embodiments provide a multi-output GPR estimator based on a novel kernel function referred to as a Lagrangian Inspired Polynomial kernel (LIP), which exploits Lagrangian mechanics to model the correlations between the different joint torques, instead of modelling each joint torque with a distinct Gaussian Process (GP), assuming the GPs independent of one another given the current joint position, velocity, and acceleration. Some embodiments exploit the fact that the dynamics equations are linear with respect to the Lagrangian, to obtain the Gaussian Processes (GPS) of the torques by applying a set of linear operators to the GPs of the potential and kinetic energy of the mechanical system. Some embodiments recognize that the kinetic and potential energy are polynomial functions in a suitable input space and derive a polynomial kernel that encodes this property.


Accordingly, some example embodiments derive the LIP estimator as a black-box multi-output GPR model which encodes the symmetries typical of Lagrangian systems. The LIP model estimates the kinetic and potential energy in a principled way, allowing its integration with energy-based control strategies. Since the LIP estimator encodes physical properties in the model, the LIP estimator outperforms state-of-the-art black-box GP estimators as well as NN-based solutions, obtaining better data efficiency and generalization performance.


Some embodiments are also based on the realization that training the model of inverse dynamics is difficult because of lack of structure of the model of inverse dynamics, which fail to describe physical principles that govern dynamics of the multiple degrees of freedom of the mechanical system. For example, the robotic manipulator may have joints that define the multiple degrees of freedom, and dynamics of the joints are correlated to each other. Therefore, there exists a complex and potentially non-linear correlation between torques for the joints needed to track the reference trajectory. Such a correlation is challenging to learn through training. Also, for some special types of mechanical systems such as underactuated robots which have fewer actuators than degrees of freedom, learning inverse dynamics models for the control of such robots is particularly challenging because under actuation further exacerbates the aforementioned problems. For example, torques of the underactuated dimensions are constant signals equal to zero, leading to an ill-posed estimation problem.


Towards this end, some embodiments are directed towards physics-informed model-based solutions for controlling such systems. Some embodiments are based on the recognition that Gaussian Processes Regression (GPR) can be used to learn the correlation between torques of the joints. For example, several Gaussian processes-based solutions use GPR to model n torque components; one for each degree of freedom with n independent Gaussian Process (GP) and include a model-based component on a mean function or a covariance. As a result, a covariance matrix for such a GPR model is either diagonal or block diagonal to represent that the correlation between torques of the different actuators is not captured.


Some embodiments are based on the realization that such a deficiency is caused, at least in part, by an attempt to model the torque itself as a Gaussian process. However, some embodiments are based on the realization that the Gaussian process can be designed to model kinetic and potential energy of the mechanical system. In contrast with modeling individual torques, modeling the energy captures mutual effects of the torques of the different actuators on each other, which in turn allows to learn the correlation among the torques of the different actuators. As a result, the covariance matrix capturing correlations between the torques of the different actuators is a full matrix with non-zero elements inside and outside of the diagonal.


Towards this end, some embodiments provide an inverse dynamics model that models energy of the mechanical system with a GPR having a full prior and posterior covariance matrix that captures the correlations between the torques of the different actuators. The inverse dynamics model is trained with machine learning to map the dynamic states of the different actuators and joints to corresponding torques for the different actuators. According to an embodiment, dynamic states of the different actuators and joints are processed with the inverse dynamics model to produce values of the torques for the different actuators of the mechanical system and estimate values of the potential and kinetic energy of the mechanical system. Further, the mechanical system is controlled based on the values of the torques and the values of the potential and kinetic energy of the mechanical system. For instance, control commands are determined based on the values of the torques for the different actuators and the values of the potential and kinetic energy of the mechanical system. Further, the determined control commands are applied to the different actuators to control the mechanical system.


Some example embodiments are directed towards model and control of underactuated robots with energy-based techniques. In this regard, some example embodiments provide techniques for modelling and synthesis of energy-based controllers and control methods for UR systems. Some embodiments provide a Lagrangian Inspired Polynomial (LIP) estimator as a black-box estimator based on Gaussian Process Regression. The LIP estimator relies on a multidimensional, multi-output kernel that embeds the structure of the Euler-Lagrange equation. According to some embodiments, the LIP estimator learns various components of the inverse dynamics map, as well as the kinetic and potential energies of the UR system. Some embodiments utilize the LIP estimator to estimate values of kinetic and potential energies of the UR system, as well as the inertial, Coriolis, and gravity components directly from the overall torque measures. Some embodiments further utilize these properties to derive an energy-based controller for the stabilization and control of complex robots such as UR. The energy-based controller performs a partial feedback linearization on the actuated system and a regulation of the energy to steer the non-actuated system to a trajectory passing through the unstable equilibrium. Once the system is sufficiently close to the target, the control is switched to a Linear Quadratic Regulator (LQR) controller. The LIP model is suitable to implement this kind of controller since it returns the inertia matrix, the Coriolis and gravity torques, energy estimates, and the linearization of the system dynamics required by the LQR.


In order to realize the aforementioned objectives and advantages, various embodiments of this disclosure provide feedback controllers, feedback control methods and systems for controlling a mechanical system such as an underactuated robot in accordance with an inverse dynamics model of the system that is learned through machine learning techniques involving GPR.


According to some embodiments, a feedback controller for controlling a mechanical system to perform a task or to track a reference trajectory for performing a task is provided. The mechanical system has multiple degrees of freedom and comprises a plurality of actuators. The feedback controller comprises a memory configured to store an energy-based inverse dynamics model and computer program instructions and a processor configured to execute the instructions for controlling the mechanical system. The energy-based inverse dynamics model is trained with machine learning to map dynamic states of the mechanical system to corresponding torques for the plurality of actuators. In this regard, the energy-based inverse dynamics model is configured to model potential and kinetic energy of the mechanical system as Gaussian Processes of the dynamic states and derive Gaussian Processes for the torques from the Gaussian Processes of the dynamic states of the mechanical system based on physics of relationship between the torques and the potential and kinetic energy of the mechanical system. The processor executes the instructions to collect a feedback signal of an operation of the mechanical system, the feedback signal including current states of dynamics of the mechanical system indicative of a position, a velocity, and an acceleration of each joint of the plurality of joints of the mechanical system. The processor is further configured to process the current states of dynamics with the energy-based inverse dynamics model to produce values of the torques for the plurality of actuators and values of the potential and kinetic energy of the mechanical system. The processor controls the mechanical system based on the produced values of the torques for the plurality of actuators of the mechanical system and the values of the potential and kinetic energy of the mechanical system.


According to some other embodiments, a method for controlling a mechanical system to track a reference trajectory for performing a task is provided. The mechanical system has multiple degrees of freedom and comprises a plurality of actuators. The method utilizes an energy-based inverse dynamics model trained with machine learning to map dynamic states of the mechanical system to corresponding torques for the plurality of actuators. In this regard, the energy-based inverse dynamics model is configured to model potential and kinetic energy of the mechanical system as Gaussian Processes of the dynamic states and derive Gaussian Processes for the torques from the Gaussian Processes of the dynamic states of the mechanical system based on physics of relationship between the torques and the potential and kinetic energy of the mechanical system. The method comprises collecting a feedback signal of an operation of the mechanical system, the feedback signal including current states of dynamics of the mechanical system indicative of a position, a velocity, and an acceleration of each joint of the plurality of joints of the mechanical system. The method further comprises processing the current states of dynamics with the energy-based inverse dynamics model to produce values of the torques for the plurality of actuators and values of the potential and kinetic energy of the mechanical system. The method further comprises controlling the mechanical system based on the produced values of the torques for the plurality of actuators of the mechanical system and the values of the potential and kinetic energy of the mechanical system.





BRIEF DESCRIPTION OF THE DRAWINGS

The presently disclosed embodiments will be further explained with reference to the following drawings. The drawings shown are not necessarily to scale, with emphasis instead generally being placed upon illustrating the principles of the presently disclosed embodiments.



FIG. 1A illustrates a block diagram of a controller for controlling a mechanical system, according to some embodiments;



FIG. 1B illustrates a method for controlling the mechanical system of FIG. 1A, according to some embodiments;



FIG. 2A illustrates a robotic manipulator, according to some embodiments;



FIG. 2B illustrates one example of a full covariance matrix for an underactuated mechanical system, according to some embodiments;



FIG. 3 illustrates a block diagram of a method for formulating an inverse dynamics model, according to some embodiments;



FIG. 4 illustrates a block diagram showing training of hyperparameters of a Lagrangian polynomial kernel, according to an embodiment;



FIG. 5A illustrates a flow diagram of a method for estimation of kinetic energy of the mechanical system, according to some embodiments;



FIG. 5B illustrates a flow diagram of a method for estimation of potential energy of the mechanical system, according to some embodiments;



FIG. 6 illustrates a block diagram of a system for controlling an underactuated mechanical system, according to some embodiments;



FIG. 7 illustrates a flow diagram of a method for anomaly detection based on the estimated kinetic energy and the potential energy of the underactuated mechanical system of FIG. 6, according to some embodiments;



FIG. 8 illustrates a framework for generating a motion plan that consumes a minimum amount of energy for performing a task, according to some embodiments; and



FIG. 9 is a schematic illustrating a computing device for implementing various hardware components, according to some embodiments.





While the above-identified drawings set forth presently disclosed embodiments, other embodiments are also contemplated, as noted in the discussion. This disclosure presents illustrative embodiments by way of representation and not limitation. Numerous other modifications and embodiments can be devised by those skilled in the art which fall within the scope and spirit of the principles of the presently disclosed embodiments.


DETAILED DESCRIPTION

The following description provides exemplary embodiments only, and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, the following description of the exemplary embodiments will provide those skilled in the art with an enabling description for implementing one or more exemplary embodiments. Contemplated are various changes that may be made in the function and arrangement of elements without departing from the spirit and scope of the subject matter disclosed as set forth in the appended claims.


Specific details are given in the following description to provide a thorough understanding of the embodiments. However, understood by one of ordinary skill in the art can be that the embodiments may be practiced without these specific details. For example, systems, processes, and other elements in the subject matter disclosed may be shown as components in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, well-known processes, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments. Further, like-reference numbers and designations in the various drawings may indicate like elements.


Also, individual embodiments may be described as a process which is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process may be terminated when its operations are completed but may have additional steps not discussed or included in a figure. Furthermore, not all operations in any particularly described process may occur in all embodiments. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, the function's termination can correspond to a return of the function to the calling function or the main function.


Furthermore, embodiments of the subject matter disclosed may be implemented, at least in part, either manually or automatically. Manual or automatic implementations may be executed, or at least assisted, through the use of machines, hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the necessary tasks may be stored in a machine-readable medium. A processor(s) may perform the necessary tasks.



FIG. 1A illustrates a block diagram 100 of a controller 101 for controlling a mechanical system 109, according to some embodiments of the present disclosure. The controller 101 is communicatively coupled to the mechanical system 109. The controller 101 includes a processor 103 and a memory 105. The processor 103 may be a single core processor, a multi-core processor, a computing cluster, or any number of other configurations. The memory 105 may include random access memory (RAM), read only memory (ROM), flash memory, or any other suitable memory systems. Additionally, in some embodiments, the memory 105 may be implemented using a hard drive, an optical drive, a thumb drive, an array of drives, or any combinations thereof.


According to some embodiments, the controller 101 controls the mechanical system 109 to track a reference trajectory for performing a task. The mechanical system 109 has multiple degrees of freedom (DoF) and comprises a plurality of joints that can are movable. In this regard, the mechanical system 109 comprises a plurality of actuators and one or more joints of the system 109 are movable by one or more of the actuators. According to some embodiments, the mechanical system 109 may be an underactuated system such as an underactuated robot where at least one joint of the underactuated system is not directly movable by any actuator. “Underactuated” in the present disclosure may correspond to a jointed mechanism in which not all of the joints of the mechanism are actuated, i.e., some of the joints are passive or unactuated. An under-actuated robot means a robot whose number of degrees of freedom of motion, which can be directly controlled by the mounted actuator, is less than the degree of freedom of motion of the robot. According to some embodiments, the mechanical system 109 may have all joints directly movable by at least one actuator. However, in some special applications, the mechanical system 109 may be converted into an underactuated system. For example, when at least one actuator associated with at least one joint does not move the corresponding joint, the system 109 may operate as an underactuated system. Such scenarios may arise when one or more actuators are deliberately not fired but the corresponding joint is intended to be moved for performing a task. Alternately or additionally, such scenarios may also arise when one or more actuators malfunction due to operational reasons or due to breakdown in the underlying control system or communication system of such an actuator.


The controller 101 may also be coupled to suitable interfaces to collect a feedback signal for an operation of the mechanical system 109. The feedback signal may include current state of dynamics of the mechanical system 109. The current state of dynamics of the system 109 may be indicative of a position, a velocity, and an acceleration of each joint of the plurality of joints of the mechanical system 109. In this regard, the controller 101 may interface with suitable sensing circuitry that provides measures of the current state of dynamics of the system 109. In some embodiments, the controller 101 may compute the current state of dynamics of the system 109 from one or more observations pertaining to the operation of the mechanical system 109.


The energy based inverse dynamics model 107 defines a mathematical model of the inverse dynamics of the mechanical system 109. The model 107 maps the current states and the transitioned states of the different actuators to corresponding torques for the different actuators. The model 107 expresses joint torques as a function of joint positions, velocities and accelerations. Some embodiments utilize machine learning-based approaches for deriving the energy-based inverse dynamics model 107 of the system 109. The model 107 embeds physical properties between torques of the actuators in the model to improve generalization and data efficiency. In this regard, some embodiments provide the model 107 as a physics-informed-learned model for inverse dynamics identification of the mechanical system 109 based on Gaussian Process Regression (GPR). To model the correlations between the different joint torques, some embodiments realize the energy based inverse dynamics model 107 as a multi-output GPR estimator based on a novel kernel function referred to as a Lagrangian Inspired Polynomial kernel (LIP) which exploits Lagrangian mechanics to model the correlations between the different joint torques, in addition to modelling each joint torque with a distinct Gaussian Process (GP), assuming the GPs independent of one another given the current joint position, velocity, and acceleration. Details of the structure and training of the model 107 are described later in the disclosure.


The controller 101 processes the feedback signal to produce values of torques for the different actuators of the system 107. In some embodiments, the produced values of the torques may be translated into control commands specifying values of currents, voltages or other physical parameters of the actuators. The control commands may be applied to the different actuators. The control commands cause the different actuators to change their states to track the reference trajectory. The changed states of the different actuators may be input to a feedback controller 111. Based on the changed states of the different actuators, the feedback controller 111 provides a correction to the values of the torques to compensate for errors in the positions and the velocities of the joints to accurately track the reference trajectory.



FIG. 1B illustrates a method 150 for controlling the mechanical system 109 of FIG. 1A by the controller 101, according to some embodiments. FIG. 1B will be described with reference to FIG. 1A. The current states of the dynamics of the mechanical system are obtained 152 by the controller 101. The current state of dynamics of the system 109 may be indicative of a position, a velocity, and an acceleration of each joint of the plurality of joints of the mechanical system 109. The current state of the dynamics of the system 109 is processed 154 with the trained energy-based inverse dynamics model 107. For example, the states of the different actuators and/or joints may be applied as input to the inverse dynamics model 107 to produce values of the torques for the actuators of the mechanical system 109 and values of the potential and kinetic energy of the mechanical system 109. Towards this end, the energy-based inverse dynamics model 107 may be trained with machine learning to map dynamic states of the mechanical system 109 to corresponding torques for the plurality of actuators. Thus, the energy-based inverse dynamics model 107 models the potential and kinetic energy of the mechanical system 109 as Gaussian Processes of the dynamic states. The model 107 derives Gaussian Processes for the torques from the Gaussian Processes of the dynamic states of the mechanical system 109 based on physics of relationship between the torques and the potential and kinetic energy of the mechanical system 109.


Using the values of the torques and the values of the kinetic energy and potential energy of the system 109, control commands may be generated to control 156 the mechanical system 109. For example, the determined control commands may be applied to the different actuators. The control commands cause the different actuators to change their states to track the reference trajectory. The changed states of the different actuators may be input to the feedback controller 111. Based on the changed states of the different actuators, the feedback controller 111 may be configured to provide a correction to the values of the torques to compensate for errors in the positions and the velocities of the joints to accurately track the reference trajectory.


In an embodiment, the mechanical system 109 corresponds to a robotic manipulator having different actuators and multiple degrees of freedom to track a reference trajectory for performing a task. An example robotic manipulator is described next with reference to FIG. 2A. For example, the robotic manipulator 200 is configured to track a reference trajectory 205 for performing a task of moving an object 207 to a target location 209. In another example, the robotic manipulator 200 is configured to track a reference trajectory for performing an assembly operation. The robotic manipulator 200 includes joints such as joints 201a, 201b, and 201c. The robotic manipulator 200 further includes links, such as a link 203a, a link 203b, and a link 203c. Each joint is actuated by an actuator such as an electric motor. An action given by the actuator makes the link attached to the joint move. The movement can be rotational or translational according to a type of the joint. The type of the joint may be a revolute joint or a prismatic joint. The actuator takes as input a desired position of the joint and outputs an action that can be current, torque or a quantity that can be transformed to torque which causes the joint to move to the desired position.


Further, the different actuators are equipped with position sensors (such as encoders) that can measure current positions of the joints. In some embodiments, states of the different actuators are defined as positions of the joints. In some other embodiments, the states of the different actuators are defined as a combination of the positions of the joints, velocities of the joints, and/or accelerations of the joints.


According to some alternate embodiments, the manipulator 200 may be an underactuated manipulator where at least one joint of the manipulator 200 is only indirectly movable by the actuators of the manipulator. Such a joint may be referred to as an unactuated joint. For example, the unactuated joint may not be associated with a corresponding actuator(s). Alternately, the unactuated joint may have a corresponding actuator that has malfunctioned due to operational reasons. The absence of an actuator for one or more joints or presence of an unactuated joint in the robotic manipulator operationally makes it an underactuated mechanical system that is characterized by fewer control inputs than degrees of freedom. Such systems are ubiquitous in robotics: examples are manipulators with passive joints, autonomous bicycles and motorcycles, bipedal robots, and most of the aerospace and marine vehicles.


For such underactuated robotic systems, learning inverse dynamics models for the control of such robots is particularly challenging. For example, torques of the underactuated dimensions are constant signals equal to zero, leading to an ill-posed estimation problem. It is a realization of several embodiments that Gaussian Processes Regression (GPR) may be used to learn the correlation between torques of the joints. In this regard, some embodiments realize that the Gaussian process can be designed to model energy of the mechanical system. In contrast with modeling individual torques, modeling the energy captures mutual effects of the torques of the different actuators on each other, which in turn allows to learn the correlation among the torques of the different actuators. As a result, the covariance matrix capturing correlations between the torques of the different actuators is a full matrix with non-zero elements inside and outside of the diagonal.



FIG. 2B illustrates one example of a full covariance matrix 211, according to some embodiments of the present disclosure. Diagonal elements 213, i.e., a1, b2, and c3, and non-diagonal elements 215, i.e., a2, a3, b1, b3, c1, and c2, of the full covariance matrix 211 are non-zero elements. Since the elements of the full covariance matrix 211 are non-zero elements, the full covariance matrix 211 captures the correlations between the torques of the different actuators.



FIG. 3 illustrates a flow diagram of a method 300 for formulating the energy-based inverse dynamics model 107 of FIG. 1A, according to some embodiments. In general, GPR can use different types of kernels to learn the inverse dynamics of the mechanical system 109 by modeling its energy. Some embodiments use physics informed energy kernels that define the energy of the mechanical system 109. The energy kernels increase an accuracy of the inverse dynamics model 107. For example, the energy kernels define the energy using a function of kinetic energy and a function of potential energy. Such kernels advantageously consider laws of preservation of energy of the mechanical system 109.


To that end, at block 301 of the method 300, a Lagrangian function that describes the mechanical system 109 in terms of energies of the mechanical system 109, i.e., kinetic energy and potential energy, is defined. For instance, the Lagrangian function, such as custom-character(q, {dot over (q)}), is defined as a difference between the kinetic energy and the potential energy of the mechanical system 109









(

q
,

q
.


)

=


𝒯

(

q
,

q
.


)

-

𝒱

(
q
)








    • where custom-character(q, {dot over (q)}) and custom-character(q) are, respectively, the kinetic energy and the potential energy of the mechanical system 109 with n degrees of freedom.





At block 303 of the method 300, the kinetic energy custom-character(q, {dot over (q)}) and the potential energy custom-character(q) are defined as two independent zero-mean Gaussian Processes (GPS) with covariance determined by kernel functions custom-character(x, x′) and custom-character(x, x′)







~

GP

(

0
,


k

(

x
,

x



)


)


,


~


GP

(

0
,


k

(

x
,

x



)


)

.








    • where x=[q, {dot over (q)}, {umlaut over (q)}] with q being the joint positions, {dot over (q)} being the joint velocities and {umlaut over (q)} being the joint accelerations. A GP prior on custom-character(q, {dot over (q)}) and custom-character(q) cannot be used directly in GPR to compute posterior distributions since the kinetic energy and the potential energy are not measured. However, starting from the GP prior on the kinetic energy and the potential energy, some embodiments derive a GP prior for the torques of the inverse dynamics by relying on Lagrangian mechanics. Lagrangian mechanics states that inverse dynamics equations













B

(
q
)



q
¨


+

c

(

q
,

q
.


)

+

g

(
q
)

+

τ
~


=
τ

,






    • also named Lagrange's equations, where B(q) is an inertia matrix, c(q, {dot over (q)}) and g (q) account, respectively, for contributions of fictitious forces and gravity, and {tilde over (τ)} are torques due to friction and unknown dynamical effects, are solutions of a set of differential equations of the Lagrangian function custom-character(q, {dot over (q)}). The differential equation of the i-th row is

















dt



(









q
.

i



)


-








q
i




=

τ
i


,






    • where qi, {dot over (q)}i and τi are, respectively, i-th component of q, {dot over (q)} and τ. The Lagrangian equations can be rewritten avoiding explicit differentiation with respect to time using a chain rule, which leads to following linear partial differential equation of custom-character.

















j
=
1

n



(






2








q
.

i







q
.

j







q
¨

j


+





2








q
.

i






q
j







q
.

j



)


-








q
i




=


τ
i

.





At block 305 of the method 300, a Lagrangian operator Gi that maps the Lagrangian function custom-character of the mechanical system 109 to the torques τi of the different actuators, is defined by a set of partial differential equations as










i



=









j
=
1

n



(






2








q
.

i







q
.

j







q
¨

j


+





2








q
.

i






q
j







q
.

j



)


-








q
i




=

τ
i



,

and




τ
=




=



[





1










n






]

T

.







Some embodiments define the model of custom-character as a GP since custom-character and custom-character are two independent GPs. This is because the sum of two independent GPs is a GP, and its kernel is a sum of kernels, namely,








~

GP

(

0
,


k


(

x
,

x



)


)


,




k


(

x
,

x



)

=



k
𝒯

(

x
,

x



)

+



k
𝒱

(

x
,

x



)

.







Further, applying property of GPs and linear operators to τ=custom-charactercustom-character(x), the inverse dynamics is modeled as τ˜GP(0, kτ(x, x′)) a zero mean GP, with covariance function kτ(x, x′)) named Lagrangian polynomial kernel.


To this end, at block 307 of the method 300, the Lagrangian polynomial kernel that defines the inverse dynamics model 107 is computed based on the Lagrangian operator, the kernel function custom-character of the kinetic energy and the kernel function custom-character of the potential energy as








k
τ

(

x
,

x



)

=

[





1


1




k


(

x
,

x



)









1


n




k


(

x
,

x



)



















n


1




k


(

x
,

x



)









n


n




k


(

x
,

x



)





]





Therefore, some embodiments of the present disclosure model the inverse dynamics function as an unknown multi-input multi-output function f(x): R3n→Rn.


Some embodiments are based on the realization that the kernel functions custom-character and custom-character used to define the prior on the potential energy and on the kinetic energy can be formulated as polynomial functions in a space defined by a trigonometric transformation of the state of the mechanical system 109. Such a formulation gives a physics informed structure to the kernel functions with a compact number of parameters to be learned.


Therefore, at block 309, custom-character of the kinetic energy and the kernel function custom-character of the potential energy are characterized as polynomial functions. The kernel functions custom-character and custom-character are defined based on two propositions that characterize custom-character and custom-character as polynomial functions in a space defined by a trigonometric transformation of the state of the mechanical system 109. The state of the mechanical system 109 may include the positions and velocities of the joints of the mechanical system 109. The trigonometric transformation is defined as follows.


Let qi, {dot over (q)}i be vectors including the positions and the velocities of the joints up to index i, respectively:








q
i

=



[


q
1

,


,

q
i


]

T



i



,




q
.

i

=



[



q
.

1

,


,


q
.

i


]

T




i

.







Nr and Np are, respectively, a number of revolute and prismatic joints, with Nr+Np=n. Sets Ir={r1, . . . , rNr} and Ip={p1, . . . , PNp} include the revolute and prismatic joints indexes, respectively. Further, following vectors are defined:












q
c

=




[


cos

(

q

r
1


)

,


,

cos

(

q

r

N
r



)


]

T




,








q
s

=




[


sin

(

q

r
1


)

,


,

sin

(

q

r

N
r



)


]

T




,







q
p

=




[


q

p
1


,


,

q

p

N
p




]

T







.




qcb, qsb and qp b denote b-th element of qc, qs and qp, respectively.


Next, let Iri (resp. Ipi) be a subset of Ir (resp. Ip) composed by indexes lower or equal to i and the vectors qci, qsi, (resp. qp) are defined as restriction of qc, qs (resp. qp) to Iri (resp. Ipi). For the sake of clarity, consider the following example. Let index i be such that rj≤i<rj+1 for some 1≤j<rNr. Then Iri={r1, . . . , rj}∈custom-character, qci=[cos(qr1), . . . , cos(qrj)]Tcustom-character and qsi=[sin(qr1), . . . , sin(qrj)]Tcustom-character. To conclude let qcsb be the vector concatenating the b-th elements of qc and qs, that is, qcsb=[qcb, qsb]T.


According to an embodiment, the kernel function of the potential energy custom-character is a polynomial function in a space defined by a trigonometric transformation custom-character of the state of the mechanical system 109. The following proposition establishes that the potential energy is polynomial with respect to a set of variables custom-character=(qc, qs, qp) that are functions of the joint positions vector q. custom-character is the trigonometric transformation of the state of the mechanical system 109 that is the space over which the potential energy is a polynomial function.


Proposition-1: Consider a manipulator with n+1 links and n joints. Total potential energy custom-character(q) belongs to space custom-character(n)(qc(1), qs(1), qp(1)), namely it is a polynomial function in (qc, qs, qp) of degree not greater than n, such that each element of qc, qs and qp appears with degree not greater than 1. Moreover, for any monomial of the aforementioned polynomial, a sum of degrees of qcb and qsb is equal or lower than 1, namely, it holds deg(qcb)+deg(qsb)≤1.


To comply with constraints on the maximum degree of each term, the kernel function custom-character(x, x′) is defined as a product of Nr+Np inhomogeneous polynomial kernels where Nr kernels have p=1 and each of them is defined on the 2-dimensional input space given by qcsb, b∈Ir, and Np kernels have p=1 and each of them is defined on the 1-dimensional input qpb, b∈Ip. Resulting kernel function for the potential energy is then given by











k
v

(

x
,

x



)

=




b


I
r






k
pk

(
1
)


(


q

cs
b


,

q

cs
b




)






b


I
p






k
pk

(
1
)


(


q
p
b

,

q
p



b



)

.








(
A
)







Each of n kernels accounts for contribution of a distinct joint, and the potential energy kernel custom-character(x, x′) spans custom-character(n)(qc(1), qs(1), qp(1)). Further, since all the Nr kernels defined on qcsb, b∈Ir, have p=1, constraint deg(qcb)+deg(qsb)≤1 is satisfied.


Further, some embodiments are based on the observation that total kinetic energy is a sum of the kinetic energies relative to each link, that is,








𝒯

(

q
,

q
˙


)

=






i
=
1




n




𝒯
i

(

q
,

q
˙


)



,






    • where custom-character(q, {dot over (q)}) is the kinetic energy of link i. The kinetic energy custom-character is a polynomial function with respect to a set of variables custom-character=(qci, qsi, qpi, {dot over (q)}i), which are functions of the joints positions and velocities vectors qi and {dot over (q)}i. Therefore, the following proposition is established for the kinetic energy.





Proposition-2: Consider a manipulator with n+1 links and n joints. The kinetic energy custom-character(q, {dot over (q)}) of link i belongs to custom-character(2i+2)(qc(2)i, qs(2)i, qp(2)i {dot over (q)}(2)i), namely it is a polynomial function in (qci, qsi, qpi, {dot over (q)}i) of degree not greater than 2i+2, such that:

    • (i) each element of qci, qsi, qpi and {dot over (q)}i appears with degree not greater than 2;
      • each monomial has inside a term of the type {dot over (q)}i{dot over (q)}j for 1≤i≤n and i≤j≤n; and
      • in any monomial a sum of degrees of qcb and qsb is equal or lower than 2, namely deg(qcb)+deg(qsb)≤2.


To comply with constraints and properties stated in the above proposition, a kernel function given by a product of i inhomogeneous polynomial kernels and 1 homogeneous kernel is adopted, where

    • |Iri| inhomogeneous kernels have p=2 and each of them is defined on the 2-dimensional input space given by qcsb, b∈Ir;
    • |Ipi| inhomogeneous kernels have p=2 and each of them is defined on the 1-dimensional input qpb, b∈Ip;
    • 1 homogeneous kernel have p=2 and is defined on the i-dimensional input {dot over (q)}i.


Resulting kernel function for the kinetic energy is then given by











k
i
𝒯

(

x
,

x



)

=



k
hpk

(
2
)


(



q
˙

i

,


q
˙




i



)






b


I
r
i






k
pk

(
2
)


(


q

cs
b


,

q

cs
b




)

·




b


I
p
i





k
pk

(
2
)


(


q
p
b

,

q
p



b



)









(
B
)









    • where |Iri|+|Ipi|=i. As all the kernels have p=2, properties (i) and (iii) of the proposition-2 are satisfied. Further, using a homogeneous kernel defined on {dot over (q)}i with p=2 guarantees validity of property (ii). Finally, the kernel function of the kinetic energy, custom-character is defined as











k
𝒯

(

x
,

x



)

=






i
=
1




n





k
i
𝒯

(

x
,

x



)

.






At block 311 of the method 300, the Lagrangian polynomial kernel, kτ, that defines the inverse dynamics model 107 is computed based on the Lagrangian operator custom-character, the kernel function of the potential energy (eqn. A) and the kernel function of the kinetic energy (eqn. B) characterized as the polynomial functions








k
τ

(

x
,

x



)

=


[





𝒢
1



𝒢
1





k


(

x
,

x



)









𝒢
1



𝒢
n




k




(

x
,

x



)



















𝒢
n



𝒢
1




k




(

x
,

x



)









𝒢
n



𝒢
n




k




(

x
,

x



)





]

.







    • where











k


(

x
,

x



)

=




k
𝒯

(

x
,

x



)

+


k
𝒱

(

x
,

x



)


=




k
hp

(
2
)


(



q
˙

i

,


q
˙




i



)






b


I
r
i






k
p

(
2
)


(


q

cs
b


,

q

cs
b




)






b


I
p
i





k
p

(
2
)


(


q
p
b

,

q
p



b



)





+




b


I
r
i






k
p

(
1
)


(


q

cs
b


,

q

cs
b




)






b


I
p
i





k
p

(
1
)


(


q
p
b

,

q
p



b



)












    • adopting for custom-character and custom-character the polynomial functions defined above and custom-characteri, custom-character are elements of the Lagrangian operator custom-character. Proposition-1 characterizes the potential energy custom-character of the whole mechanical system 109, while proposition 2 focuses on the kinetic energy of each link. In general, characterizing the energies for each link and designing a tailored kernel to be combined as executed for custom-character allows for higher flexibility in terms of regularization and for more accurate predictions.






FIG. 4 illustrates training of the one or more hyperparameters, indicated with the symbol θ, of the Lagrangian polynomial kernel, according to an embodiment. Training input data, which includes a number of degrees of freedom nDoF 401, a type of the joint (i.e., revolute joint or prismatic joint) 403, data q, {dot over (q)}, {umlaut over (q)} 405 that are the joint positions, joint velocities and joint accelerations, respectively (i.e. states of a system), and data τ 406, that are the torques, is obtained by executing one or more trajectories in the mechanical system 109, are used to train the one or more hyperparameters of the Lagrangian polynomial kernel that defines the inverse dynamics model 107. The number of degrees of freedom nDoF 401, and type of the joint 403 are inputs that a user specifies, and are the only knowledge required for the mechanical system 109. The data q, {dot over (q)}, {umlaut over (q)} 405 and data τ 406 are obtained by executing trajectories on the mechanical system 109, that may be a sequence of positions if the mechanical system 109 is controlled in position, or a sequence of velocities if the mechanical system 109 is controlled in velocity or a sequence of torques if 109 is controlled in torque. In some embodiments, for example, these trajectories may be the sum of sinusoids in the joint positions q.


In an embodiment, input of the energy-based inverse dynamics model 107 during training is the data q, {dot over (q)}, {umlaut over (q)} 405 and labels are the torques data 406. The one or more hyperparameters can be trained based on any machine learning algorithm. In some embodiments, the one or more hyperparameters are learned with a machine learning algorithm using maximization of marginal likelihood. The one or more hyperparameters are the parameters that define the kernel function of the Gaussian process. For example, in a standard squared exponential kernel the one or more hyperparameters may be scaling factors and length scales of the squared exponential function. In some embodiments, the one or more hyperparameters may be coefficients of the polynomial functions that define the Lagrangian kernel custom-character(x, x′).


According to some embodiments, the inverse dynamics model 107 is a multi-input-multi-output (MIMO) torque estimator model that produces the torques for the different actuators based on the positions of the joints of the mechanical system 109, and velocity and acceleration of the joints providing multiple degrees of freedom. In other words, the positions of the joints, and velocity and acceleration of the multiple degrees of freedom are applied as input to the MIMO torque estimator, and the MIMO torque estimator outputs the torques for the different actuators.


Additionally, or alternatively, in some embodiments, the inverse dynamics model 107 is used to estimate the kinetic energy and the potential energy from torque measurements. For instance, the processor 103 may be configured to process the states of the different actuators with the trained inverse dynamics model 107 to estimate the kinetic energy of the mechanical system 109 and the potential energy of the mechanical system 109. According to some embodiments, after the training of the inverse dynamics 107, the estimate 409 of kinetic energy and the estimate 410 of potential energy may be computed without requiring further training for such computation of the energies. That is, the training of the model 107 does not require labeled data for the kinetic and/or potential energy. Indeed, the hyperparameters of the torque kernel kθτ are the hyperparameters of the Lagrangian Kernel custom-character(x, x′), that is the sum of the potential energy kernel and of the kinetic kernel as described previously in this disclosure. According to some embodiments, the potential energy can be computed by combining the quantity 407 obtained during the estimation of the inverse dynamics of the system, the potential kernel custom-character in 408 and the linear operator that maps the Lagrangian function to torques determined at step 305 of FIG. 3. The quantities 407 and the potential kernel in 408 are evaluated at the data points 405, 406, therefore the data 405, 406 are also necessary to estimate the potential energy. According to some embodiments, the kinetic energy can be computed by combining the quantity 407, obtained during the estimation of the inverse dynamics of the system, the kinetic kernel custom-character in 408 and the linear operator that maps the Lagrangian function to torques determined at step 305 of FIG. 3. The quantities 407 and the kinetic kernel in 408 are evaluated at the data points 405, 406, and therefore the data 405, 406 are also necessary in estimating the kinetic energy. The estimation of the kinetic energy is described in detail later in FIG. 5A and the estimation of the potential energy is explained in detail in FIG. 5B, and mathematically in the remainder of this disclosure.


Inverse Dynamics Model of Underactuated Systems Under the Rigid Body Assumption

Consider a mechanical system with n-DOF and let q∈custom-character be the vector of generalized coordinates. The system may have m control inputs (where m<n), each of which actuates a single DOF. The vector q∈custom-character may be partitioned as qT=[q1T, q2T], where q1custom-character and q2custom-character refer respectively to the actuated and the non-actuated DOFs. Under the rigid body assumption, the inverse dynamics of the underactuated system can be derived from the Euler-Lagrange equations as:









(
1
)














[





M

1

1


(
q
)





M

1

2


(
q
)







M

2

1


(
q
)





M

2

2


(
q
)




]




m

(

q
,

q
¨


)






[





q
¨

1







q
¨

2




]




c

(

q
,

q
.


)



+




[





c
1

(

q
,

q
.


)







c
2



(

q
,

q
.


)





]




g

(
q
)






[





g
1

(
q
)







g
2

(
q
)




]



τ



=

[




τ
1





0



]





(
1
)








where






M

(
q
)

=

[





M

1

1


(
q
)





M

1

2


(
q
)







M

2

1


(
q
)





M

2

2


(
q
)




]







    • is the symmetric, positive definite inertia matrix, m(q, {umlaut over (q)}) represents the inertial torque, c(q, {dot over (q)}) accounts for the Coriolis and centripetal torques, g(q) represents the gravity contribution while τ1custom-character is the vector of generalized torques produced by the m actuators.





The inverse dynamics identification problem consists of estimating the map in eq. (1) that relates {tilde over (x)}=(q, {dot over (q)}, {umlaut over (q)}) and the torques τ from a set of noisy measures. Black-box solutions treat the inverse dynamics as an unknown function and, generally, rely on universal approximators to estimate the function from experimental data. According to some embodiments, GPR, which is a framework for Bayesian inference widely used in machine learning and robotics, may be adopted in this regard.


The standard approach when using GPR consists of considering the different torque components independently and solving n independent regression problems, one for each generalized coordinate. However, with underactuated systems, torques of the under-actuated dimensions are constant signals equal to zero, leading to an ill-posed estimation problem. This black-box setup prevents the possibility of deriving any inverse dynamics model useful for model-based control strategies.


Model-Based Balancing Control of Underactuated Systems

A particular class of robots described by eq. (1), are known as balancing systems. Common examples of balancing systems are the Cartpole, the Furuta Pendulum, the Acrobot, and the Pendubot. Within such systems, the typical control challenge requires swinging up and balancing the robot in the unstable equilibrium point, hereafter denoted by x=[qT, {dot over (q)}T]T with {dot over (q)}=0.


Energy-Based Swing-Up Controller

The first step consists of a partial feedback linearization. From eq. (1) the dynamics of the actuated and non-actuated subsystems are isolated respectively as












M

1

1





q
¨

1


+


M

1

2





q
¨

2


+

c
1

+

g
1


=

τ
1





(
2
)















M

2

1





q
¨

1


+


M

2

2





q
¨

2


+

c
2

+

g
2


=
0




(
3
)







Eq. (3) may be solved for {umlaut over (q)}2 as {umlaut over (q)}2=−M22−1(M21{umlaut over (q)}1+c2+g2) since M22 is invertible (given that M>0). Substituting the resulting expression into eq. (2) leads to













M
¯


1

1





q
¨

1


+


c
¯

1

+


g
¯

1


=

τ
1





(
4
)








with








M
¯


1

1


=


M

1

1


-


M

1

2




M

2

2


-
1




M

2

1





,








c
¯

1

=


c
1

-


M

1

2




M

2

2


-
1




c
2








and







g
¯

1

=


g
1

-


M

1

2




M

2

2


-
1





g
2

.







A feedback linearizing controller for eq. (4) can be defined as










τ
1

=




M
¯


1

1



u

+


c
¯

1

+


g
¯

1






(
5
)









    • where u is a design parameter. Choosing τ1 as in eq. (5) leads to the following closed-loop system:














q
¨

1

=
u




(
6
)















M

2

2





q
¨

2


+

c
2

+

g
2


=


-

M

2

1




u





(
7
)







A linear second-order dynamics for the actuated subsystem may be obtained. Selecting u according to










u
=



-

k
1




q
1


-


k
2




q
˙

1


+

u
¯



,




(
8
)










k
1

,


k
2

>
0







    • makes the linear subsystem in eq. (6) asymptotically stable for ū=0. The remaining design problem is the choice of ū, which can be used to stabilize the non-actuated dynamics in eq. (7). According to some embodiments, ū may be designed based on energy concepts, penalizing the mismatch with respect to the system energy at the desired configuration. This leads to control laws of the form













u
¯

=


f
e

(

𝒯
,
𝒱

)





(
9
)









    • where










𝒯

(

q
,

q
˙


)

=


1
2




q
˙

T



M

(
q
)



q
˙






is the kinetic energy, while custom-character(q) denotes the potential energy. However, the choice of fe depends on the system of interest.


It may be noted that the controller presented in this section is not stabilizing the system to a fixed point but only to a manifold. For this reason, in applications such as the swing up of balancing robots, the control must switch to another controller achieving local asymptotic stability to the equilibrium.


Linear Quadratic Regulator

To stabilize the system at the equilibrium x we resort to a Linear Quadratic Regulator (LQR). First, a state space description for the system may be provided with dynamics expressed in eq. (1). Let the system state x be such that xT=[qT, {dot over (q)}T]. The state evolution can be derived from eq. (1) as










x
˙

=



[




q
.






-



M

-
1


(
q
)

[


c

(

q
,

q
.


)

+

g

(
q
)


]





]

+


[




0
n







M

-
1


(
q
)




]



τ


=


f

(
x
)

+

g

(

x
,
τ

)







(
10
)







Then the non-linear system in eq. (10) may be linearized around xT. Moreover, let τ be the reference input at the equilibrium. Applying a first-order Taylor expansion, the system dynamics around (x, τ) can be approximated as










x
˙

=


A

(

x
-

x



)

+

B

(

τ
-

τ



)






(
11
)







Recalling that at the equilibrium {dot over (q)}=0 matrices A and B are









A
=






f
T




x






x


,

τ





=





[




0


I






-

M

-
1





(
q
)






g

(
q
)




q








M

-
1


(
q
)



C

(

q
,

q
.


)





]





x


,

τ





=

[




0


I






-

M

-
1





(

q


)






g

(

q


)




q





0



]








(
12
)








and








B
=






g
T




τ






x


,

τ





=

[



0






M

-
1


(

q


)




]







(
13
)










    • where C(q, {dot over (q)})∈custom-character is the skew-symmetric Coriolis matrix, such that c(q, {dot over (q)})=C(q, {dot over (q)}){dot over (q)}.





Then, the infinite horizon control problem on the linearized system is, namely,









arg

min

τ






0








(

x
-

x



)

T



Q

(

x
-

x



)




+



(

τ
-

τ



)

T




R

(

τ
-

τ



)

T


dt







    • which leads to the control input τ=−Kx, with K being the optimal control gain, which can be computed by solving the continuous time algebraic Riccati equation.





Lagrangian Inspired Polynomial Estimator for Modeling and Control of Under-Actuated Systems

The LIP estimator is based on GP, which is a framework for Bayesian inference widely used in machine learning and robotics applications. Generally, GPR solutions for inverse dynamics identification model each torque component τi({tilde over (x)}) as a Gaussian Process (GP) by assuming τi({tilde over (x)})s are independent given {tilde over (x)}, and then apply standard GPR inference. As discussed in the previous section, this approach is not effective for underactuated systems. The LIP estimator follows an alternative strategy and defines the kinetic and potential energies as two independent GPs. Then, it derives a multi-output kernel of the torques by exploiting EL equations. In this way, the inverse dynamics problem is well defined also in the underactuated setup.


Some embodiments model custom-character and custom-character as independent GPs, namely custom-character˜custom-character(0, custom-character(⋅,⋅)) and custom-character˜custom-character(0, custom-character(⋅,⋅)), where custom-character and custom-character are the kernels functions that defines the covariance of custom-character and custom-character. For instance, let {tilde over (x)} and {tilde over (x)}′ be two input locations, then the covariance between the values of custom-character at {tilde over (x)} and {tilde over (x)}′ is E[custom-character({tilde over (x)}), custom-character({tilde over (x)}′)]=custom-character({tilde over (x)}, {tilde over (x)}′). For convenience, custom-character, that is the matrix that collects custom-character evaluated at X={{tilde over (x)}1, . . . , {tilde over (x)}N}, X′={x′1, . . . , x′M} is given as:







K

XX





𝒯

=


[





k
𝒯

(



x
~

1

,


x
~

1



)








k
𝒯



(



x
~

1

,


x
~

M



)



















k
𝒯



(



x
~

N

,


x
~

1



)









k
𝒯



(



x
~

N

,


x
~

M



)





]

.





Similarly,







K

XX





𝒱

=

[





k
𝒱

(



x
~

1

,


x
~

1



)








k
𝒱

(



x
~

1

,


x
~

M



)


















k
𝒱

(



x
~

N

,


x
~

1



)








k
𝒱

(



x
~

N

,


x
~

M



)




]





The LIP estimator defines custom-character and custom-character relying on a polynomial formulation as described previously with reference to FIG. 3.


It may be noted that, (i) since custom-character and custom-character are defined as zero-mean GPS, for the properties of GPs also the Lagrangian function custom-character=custom-charactercustom-character is a zero-mean GP with kernel custom-character(⋅,⋅)=custom-character(⋅,⋅)+custom-character(⋅,⋅), namely custom-character˜custom-character(0, custom-character(⋅,⋅)). Furthermore, (ii) under rigid body assumptions, each τi({tilde over (x)}) is described by a linear differential equation of custom-character, namely,







τ
i

=



d
dt



(









q
˙

i



)


-









q
i



.






Expanding explicit derivations with respect to time, gives:








τ
i

=






j
=
1

n


(






2








q
˙

i







q
˙

j







q
¨

j


+





2








q
˙

i






q
j







q
˙

j



)


-








q
i




=

:


𝒢
i






,






    • where custom-characteri is the linear operator that maps custom-character in the linear differential equation of τi. Finally, (iii) GPs are closed with respect to linear operators, namely, if f is a zero-mean GP with kernel kf(⋅,⋅) and g({tilde over (x)})=custom-characterf({tilde over (x)}), then also g({tilde over (x)}) is a zero-mean GP with kernel kg({tilde over (x)}, {tilde over (x)}′)=custom-characterkf({tilde over (x)}, {tilde over (x)}′). The last expression means that kg({tilde over (x)}, {tilde over (x)}′) is obtained by applying two times the operator custom-character to kf({tilde over (x)}, {tilde over (x)}′), first with respect to the input {tilde over (x)} then w.r.t. {tilde over (x)}′.





Based on (i), (ii), and (iii), some embodiments realize that torques are a zero-mean GP, with covariance defined by a multi-output kernel kτ({tilde over (x)}, {tilde over (x)}′)∈custom-character that encodes the EL equations, and is expressed as











k
τ

(


x
~

,


x
~




)

=

[





𝒢
1



𝒢
1





k


(


x
~

,


x
~




)









𝒢
1



𝒢
n




k




(


x
~

,


x
~




)



















𝒢
n



𝒢
1




k




(


x
~

,


x
~




)









𝒢
n



𝒢
n




k




(


x
~

,


x
~




)





]





(
14
)







To derive (14) the multi-output version of property (iii) may be applied. This is also described previously with reference to FIG. 3.


Once kτ is defined, torque estimates may be computed following standard GPR. Let X be a set of N training input locations, and y=[y1T, . . . , yNT]T the respective torque measurements, with yicustom-character equal to the torque measures at input {tilde over (x)}i. The LIP torque estimate in a general input location {tilde over (x)} is











τ
^

(

x
˜

)

=




K


x
~


X

τ

(


K
XX
τ

+




e



)


-
1



y





(
15
)









    • where Σe is a regularization parameter that accounts for the additive Gaussian noise modeled by GPR. For each dimension, independent and identically distributed noise may be assumed, thus obtaining a block diagonal matrix with equal diagonal blocks, namely Σe=diag(Σe1, . . . , ΣeN), with Σe1e2= . . . =ΣeN=diag(σe12, . . . , σen2) where σei2 is the variance of the τi measures.





LIP for Control

Next the estimation of the kinetic and potential energies, the inertial, Coriolis, and gravity vector as well as δg/δq required by the control laws presented with reference to eq. 9 is described.


Kinetic and Potential Energies


FIG. 5A illustrates a block diagram of a method 500A for estimation of the kinetic energy, according to some embodiments. FIG. 5B illustrates a block diagram of a method 500B for estimation of the potential energy, according to some embodiments.


The kinetic energy custom-character and potential energy custom-character are required to implement the energy based control law described by eq. (9). The LIP model provides a principled way to estimate them from the torque measurements y. Indeed, within the LIP framework, custom-character, custom-character, and τ are jointly Gaussian distributed, since the prior of τ is derived by applying the linear operator custom-character to the kinetic and potential GPs custom-character and custom-character. The covariances between custom-character and τ and between custom-character and τ at general input locations {tilde over (x)} and {tilde over (x)}′ are











Cov
[


𝒯

(

x
˜

)

,

τ

(


x
˜



)


]

=


Cov
[


𝒯

(

x
˜

)

,

𝒢




(


x
˜



)



]

=

:



k

𝒯
τ


(


x
˜

,


x
˜




)




,




(

16

a

)














Cov
[


𝒱



(

x
˜

)


,

τ

(


x
˜



)


]

=


Cov
[


𝒱



(

x
˜

)


,

𝒢




(


x
˜



)



]

=

:



k

𝒱
τ


(


x
˜

,


x
˜




)




,




(

16

b

)







Recalling that custom-character and custom-character are modelled as independent GPs, and in view of the properties of GPs under linear operators,











k

𝒯
τ


(


x
˜

,


x
˜




)

=


Cov
[


𝒯

(

x
˜

)

,

𝒢𝒯

(


x
~



)


]

=



[


𝒢





k
𝒯

(


x
˜

,


x
˜




)


]

T

=

[



𝒢
1





k
𝒯

(


x
˜

,


x
˜




)


,


,


𝒢
n





k
𝒯

(


x
˜

,


x
˜




)



]







(

17

a

)














k

𝒱
τ


(


x
˜

,


x
˜




)

=


Cov
[


𝒱

(

x
˜

)

,

𝒢𝒱

(


x
~



)


]

=



[


𝒢





k
𝒱

(


x
˜

,


x
˜




)


]

T

=

[



𝒢
1





k
𝒱

(


x
˜

,


x
˜




)


,


,


𝒢
n





k
𝒱

(


x
˜

,


x
˜




)



]







(

17

b

)









    • may be obtained.





The Gaussian properties make the posterior distributions of custom-character and custom-character given y known analytically. At any general input location x, these posteriors are Gaussians distributions, therefore exactly defined by mean and variance. The means are computed as











E
[


𝒯

(
x
)

|
𝒟

]

=




K
xX
𝒯τ

(


K
XX

+




e



)


-
1



y


,




(

18

a

)














E
[


𝒱

(
x
)

|
𝒟

]

=




K
xX
𝒱τ

(


K
XX

+




e



)


-
1



y


,




(

18

b

)









    • for custom-character and custom-character, respectively. The variances are computed as












a
)











𝕍
[

𝒯

(
x
)

]

=



k
𝒯

(

x
,
x

)

-




K
xX
𝒯τ

(


K
XX

+




e



)


-
1





(

K
xX
𝒯τ

)

T




,




(

19

a

)












b
)











𝕍
[

𝒱

(
x
)

]

=



k
𝒱

(

x
,
x

)

-




K
xX
𝒱τ

(


K
XX

+




e



)


-
1





(

K
xX
𝒱τ

)

T




,




(

19

b

)









    • for custom-character and custom-character, respectively.





From the posterior distributions of custom-character and custom-character given y an estimate of the energies at arbitrary input location {tilde over (x)} may be obtained as












𝒯
^

(

x
˜

)

=




K


x
~


X

𝒯τ

(


K
XX
τ

+




e



)


-
1



y


,




(

20

a

)















𝒱
^

(

x
˜

)

=




K


x
~


X

𝒱τ

(


K
XX
τ

+




e



)


-
1



y


,




(

20

b

)









    • where the covariance matrices custom-charactercustom-character and Kcustom-charactercustom-character are obtained as










K


x
~


X

𝒯τ

=

[



k

𝒯

τ


(


x
˜

,

x
1


)

,


,


k

𝒯

τ


(


x
˜

,

x
N


)


]






and






K


x
~


X

𝒱τ

=


[



k

𝒱

τ


(


x
˜

,

x
1


)

,


,


k

𝒱

τ


(


x
˜

,

x
N


)


]

.





Referring to FIG. 5A, at 501 of the method 500A, the covariance between the kinetic energy custom-character and the torques τ is computed as per eq. 17a above. At 503, a probabilistic posterior distribution of the kinetic energy is computed as per eq. 18a and 19a above. At block 505, the kinetic energy is estimated from the probabilistic posterior distribution of the kinetic energy based on the covariance between the kinetic energy custom-character and the torques τ, as







E
[



(
x
)


|
𝒟

]

=



(


K
XX

+





e


)


-
1




y
.






Referring to FIG. 5B, at 507 of the method 500B, the covariance between the potential energy custom-character and the torques τ is computed as per eq. 17b above. At 509, a probabilistic posterior distribution of the potential energy is computed as per eq. 18b and 19b above. At block 511, the potential energy is estimated from the probabilistic posterior distribution of the kinetic energy based on the covariance between the kinetic energy custom-character and the torques τ, as







E
[



(
x
)


|
𝒟

]

=



(


K
XX

+





e


)


-
1




y
.






Torque Components

The energy-based control law in eq. (5) requires estimating m, c and g, while the LQR described previously requires the inverse of the inertia matrix M as well as the term δg/δq. These quantities are derived similarly to how the energies are computed as described in the previous section.


First, the inertia matrix is estimated component wise. The element in position ij of M is









M
ij

(
q
)

=






z




(

q
,
q

)







q
.

i







q
.

j




=

:


M
ij



(

x
~

)




,






    • where we introduced the linear operator custom-characterMij. The covariance between Mij and τ at general input locations {tilde over (x)} and {tilde over (x)}′ can be computed as










Cov
[



M
ij

(
x
)

,

τ

(


x
~



)


]

=


Cov
[




M
ij



(

x
~

)


,


(


x
~



)



]

=




M
ij



(


x
~

,


x
~




)


=

:




k


M
ij


τ


(


x
~

,


x
~




)

.








Accordingly, Mij may be estimated at any input location {tilde over (x)} as








M
ij

(

x
~

)

=




K

M

ij
τ



(


K
XX
τ

+





e


)


-
1



y





with







K


M
ij


τ


=


[



k

M

ij
τ



(


x
~

,


x
~

1


)

,


,


k

M

ij
τ



(


x
~

,


x
~

N


)


]

.





Then, from the estimate of the inertia matrix an estimate of the inertial torque component may be derived as {circumflex over (m)}({tilde over (x)})={circumflex over (M)}({tilde over (x)}){umlaut over (q)}.


Next, the gravity contribution g may be estimated. Recall that the i-th component of the vector g is defined as








g
i

(
q
)

=








(
q
)





q
i



.





The covariance between gi and τ at general input locations {tilde over (x)} and {tilde over (x)}′ is







Cov
[



g
i

(
x
)

,

τ

(


x
~



)


]

=


Cov
[








(
x
)





q
i



,


(


x
~



)



]

=









k
v

(


x
~

,


x
~




)





q
i



=

:



k


g
i


τ


(


x
~

,


x
~




)

.








Accordingly gi can be estimated at any input location {tilde over (x)} as









g
^

(

x
~

)

=


K


x
~


X



g
i


τ


(



(


K
XX
τ

+





e


)


-
1



y

)


,




with







K


x
~


X



g
i


τ


=


[



k


g
i


τ


(


x
~

,


x
~

1


)

,


,


k


g
i


τ


(


x
~

,


x
~

N


)


]

.





Given the estimates of m and g, it is possible to obtain an estimate of the Coriolis and centripetal contribution c as ĉ({tilde over (x)})={circumflex over (τ)}({tilde over (x)})−m({tilde over (x)})−g({tilde over (x)}).


Finally, the matrix









g



q






n
×
n






required in the computation of matrix A in eq. (12), is estimated following the same procedure adopted for the inertia matrix. Its element in position ij is given by.








[




g

(
q
)




q


]

ij

=





2


v

(
q
)






q
i






q
j




=


:


G
ij



(
q
)


=

:



G
ij

(

x
~

)

.








Then, the covariance between Gij({tilde over (x)}) and the torque vector τ at general input locations {tilde over (x)} and {tilde over (x)}′ is







Cov
[



G
ij

(

x
~

)

,

τ

(


x
~



)


]

=


Cov
[




G
ij



(
q
)


,


(


x
~



)



]

=




G
ij



(


x
~

,


x
~




)


=

:



k


G
ij


τ


(


x
~

,


x
~




)

.








Finally, the estimate at a general input location {tilde over (x)} is computed as










G
^

ij

(

x
~

)

=




K


x
~


X


G

ij
τ



(


K
XX
τ

+





e


)


-
1



y


,




with







K


x
~


X


G

ij
τ



=


[



k

G

ij
τ



(


x
~

,


x
~

1


)

,


,


k

G

ij
τ



(


x
~

,


x
~

N


)


]

.






FIG. 6 illustrates a block diagram of a system 600 for controlling an underactuated mechanical system 602, according to some embodiments. The system 600 comprises an energy-based controller 604 that generates control commands 601 to control the underactuated mechanical system 606 to track a trajectory 603. In this regard, the controller 604 interfaces with a trained energy-based inverse dynamics model 606 such as the model 107 of FIG. 1A and 4. The model 606 takes the trajectory 603 and current states 609 as input to produce values of energies 605 and dynamic components 607 of the underactuated mechanical system 602. According to some embodiments, the model 606 outputs values of the kinetic and potential energies of the system 602 and one or more dynamic components such as the inertia, the inertial torques, Coriolis and centripetal torques, gravity contributions etc.


Some embodiments are based on the realization that the estimated kinetic energy and the potential energy can be used to detect anomaly of the underactuated mechanical system 602 during operation of the mechanical system 602 (e.g., while performing the task). The anomaly detection based on the estimated kinetic energy and the potential energy of the system 602 is explained next.



FIG. 7 illustrates a flow diagram of a method 700 for an anomaly detection based on the estimated kinetic energy and the estimated potential energy of the underactuated mechanical system 602, according to some embodiments. At block 701, the kinetic energy and the potential energy of the underactuated mechanical system 602, are estimated for example in accordance with the framework described above with reference to FIG. 5A and FIG. 5B.


Further, the method 700 comprises comparing 703 the estimated kinetic energy and the estimated potential energy with a respective threshold. Towards this end, the estimated kinetic energy may be compared with a first threshold and the estimated potential energy may be compared with a second threshold. Based on such comparison, the anomaly is detected 705. For instance, at block 703, it may be checked if the estimated kinetic energy and the estimated potential energy are greater than the respective threshold. If the estimated kinetic energy and the estimated potential energy are greater than the first threshold and the second threshold, respectively, then, at block 705, it is inferred that the anomaly is detected. If the estimated kinetic energy and the estimated potential energy are not greater than the first threshold and the second threshold, respectively, then, at block 707, it is inferred that no anomaly is detected.


Alternatively, in some embodiments, it may be checked if the estimated kinetic energy and the estimated potential energy are less than the first threshold and the second threshold, respectively. If the estimated kinetic energy and the estimated potential energy are less than the first threshold and the second threshold, respectively, then it is inferred that the anomaly is detected. If the estimated kinetic energy and the estimated potential energy are not less than the first threshold and the second threshold, respectively, then it is inferred that no anomaly is detected. If the estimated kinetic energy is greater than the first threshold and the estimated potential energy is less than the second threshold, then a fault detection is inferred as a special case of anomaly detection depending only on potential energy faults. If the estimated potential energy is greater than the second threshold and the estimated kinetic energy is less than the first threshold, then a fault detection is inferred as a special case of anomaly detection depending only on kinetic energy faults.


Additionally, in some embodiments, the estimated potential energy and the estimated kinetic energy may be used to adjust the control commands for the underactuated mechanical system 602. For instance, based on the estimated potential energy and the estimated kinetic energy, a passivity controller may be designed to control the underactuated mechanical system 602. The passivity controllers are effective to control mechanical systems but require precise models of the energies to define Hamiltonian or Lagrangian of the system. Some embodiments can estimate accurate models of the energies that result in accurate passivity controllers.


Additionally, in an embodiment, a processor may be configured to determine, based on the estimated kinetic energy and the estimated potential energy, a motion plan that consumes a minimum amount of energy for performing the task. The motion plan may include a trajectory for performing the task. Such an embodiment is described below in FIG. 8.



FIG. 8 illustrates a motion plan 809 that consumes a minimum amount of energy for performing the task by the robotic manipulator 800, according to some embodiments. The motion plan 809 is a trajectory to be tracked by the robotic manipulator 800 for performing the task. It is an object of some embodiments to configure the robotic manipulator 800 to perform a task of inserting an object 801 in a hole 803. A motion plan may be determined for the robotic manipulator 800 to perform the task. Different motion plans, such as a motion plan 805 and a motion plan 807, may be determined for performing the task. However, it is desired to determine a motion plan that consumes the minimum amount of energy for performing the task. The processor 103 is configured to determine, based on the estimated kinetic energy and the estimated potential energy, the motion plan 809 that consumes the minimum amount of energy for performing the task. Further, the processor 103 is configured to control the robotic manipulator 800 according to the motion plan 809 to perform the task by consuming the minimum amount of energy. For example, based on the motion plan 809, the processor 103 is configured to determine control commands for the different actuators of the robotic manipulator 800. The control commands are applied to the different actuators of the robotic manipulator 800. The control commands cause the robotic manipulator 800 to track the motion plan 809 for performing the task.



FIG. 9 is a schematic illustrating a computing device 900 for implementing the controller 101 and methods of the present disclosure. The computing device 900 includes a power source 901, a processor 903, a memory 905, a storage device 907, all connected to a bus 909. Further, a high-speed interface 911, a low-speed interface 913, high-speed expansion ports 915 and low speed connection ports 917, can be connected to the bus 909. In addition, a low-speed expansion port 919 is in connection with the bus 909. Further, an input interface 921 can be connected via the bus 909 to an external receiver 923 and an output interface 925. A receiver 927 can be connected to an external transmitter 929 and a transmitter 931 via the bus 909. Also connected to the bus 909 can be an external memory 933, external sensors 935, machine(s) 937, and an environment 939. Further, one or more external input/output devices 941 can be connected to the bus 909. A network interface controller (NIC) 943 can be adapted to connect through the bus 909 to a network 945, wherein data or other data, among other things, can be rendered on a third-party display device, third party imaging device, and/or third-party printing device outside of the computer device 900.


The memory 905 can store instructions that are executable by the computer device 900 and any data that can be utilized by the methods and systems of the present disclosure. The memory 905 can include random access memory (RAM), read only memory (ROM), flash memory, or any other suitable memory systems. The memory 905 can be a volatile memory unit or units, and/or a non-volatile memory unit or units. The memory 905 may also be another form of computer-readable medium, such as a magnetic or optical disk.


The storage device 907 can be adapted to store supplementary data and/or software modules used by the computer device 900. The storage device 907 can include a hard drive, an optical drive, a thumb-drive, an array of drives, or any combinations thereof. Further, the storage device 907 can contain a computer-readable medium, such as a floppy disk device, a hard disk device, an optical disk device, or a tape device, a flash memory or other similar solid-state memory device, or an array of devices, including devices in a storage area network or other configurations. Instructions can be stored in an information carrier. The instructions, when executed by one or more processing devices (for example, the processor 903), perform one or more methods, such as those described above.


The computing device 900 can be linked through the bus 909, optionally, to a display interface or user Interface (HMI) 947 adapted to connect the computing device 900 to a display device 949 and a keyboard 951, wherein the display device 949 can include a computer monitor, camera, television, projector, or mobile device, among others. In some implementations, the computer device 900 may include a printer interface to connect to a printing device, wherein the printing device can include a liquid inkjet printer, solid ink printer, large-scale commercial printer, thermal printer, UV printer, or dye-sublimation printer, among others.


The high-speed interface 911 manages bandwidth-intensive operations for the computing device 900, while the low-speed interface 913 manages lower bandwidth-intensive operations. Such an allocation of functions is only an example. In some implementations, the high-speed interface 911 can be coupled to the memory 905, the user interface (HMI) 947, and to the keyboard 951 and the display 949 (e.g., through a graphics processor or accelerator), and to the high-speed expansion ports 915, which may accept various expansion cards via the bus 909. In an implementation, the low-speed interface 913 is coupled to the storage device 907 and the low-speed expansion ports 917, via the bus 909. The low-speed expansion ports 917, which may include various communication ports (e.g., USB, Bluetooth, Ethernet, wireless Ethernet) may be coupled to the one or more input/output devices 941. The computing device 900 may be connected to a server 953 and a rack server 955. The computing device 900 may be implemented in several different forms. For example, the computing device 900 may be implemented as part of the rack server 955.


In accordance with several embodiments, the controller 101 is configured to control the mechanical system 109 using the inverse dynamics model 107. The inverse dynamics model 107 models the energy of the mechanical system 109. Modeling the energy captures mutual effects of the torques of the different actuators on each other. Thereby, the inverse dynamics model 107 enables accurate controlling of the mechanical system 101. Additionally, the formulation of the inverse dynamics model 107 requires minimum physical information about the mechanical system 109. To that end, the formulation of the inverse dynamics model 107 is computationally inexpensive. Additionally, or alternatively, the inverse dynamics model 107 can be used to estimate the kinetic energy of the mechanical system 109 and the potential energy of the mechanical system 109.


The above description provides exemplary embodiments only, and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, the following description of the exemplary embodiments will provide those skilled in the art with an enabling description for implementing one or more exemplary embodiments. Contemplated are various changes that may be made in the function and arrangement of elements without departing from the spirit and scope of the subject matter disclosed as set forth in the appended claims.


Specific details are given in the following description to provide a thorough understanding of the embodiments. However, understood by one of ordinary skill in the art can be that the embodiments may be practiced without these specific details. For example, systems, processes, and other elements in the subject matter disclosed may be shown as components in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, well-known processes, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments. Further, like reference numbers and designations in the various drawings indicated like elements. Also, individual embodiments may be described as a process which is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process may be terminated when its operations are completed but may have additional steps not discussed or included in a figure. Furthermore, not all operations in any particularly described process may occur in all embodiments. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, the function's termination can correspond to a return of the function to the calling function or the main function.


Furthermore, embodiments of the subject matter disclosed may be implemented, at least in part, either manually or automatically. Manual or automatic implementations may be executed, or at least assisted, through the use of machines, hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the necessary tasks may be stored in a machine-readable medium. A processor(s) may perform the necessary tasks. Various methods or processes outlined herein may be coded as software that is executable on one or more processors that employ any one of a variety of operating systems or platforms. Additionally, such software may be written using any of a number of suitable programming languages and/or programming or scripting tools, and also may be compiled as executable machine language code or intermediate code that is executed on a framework or virtual machine. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments.


Embodiments of the present disclosure may be embodied as a method, of which an example has been provided. The acts performed as part of the method may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different than illustrated, which may include performing some acts concurrently, even though shown as sequential acts in illustrative embodiments. Further, use of ordinal terms such as “first,” “second,” in the claims to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another or the temporal order in which acts of a method are performed, but are used merely as labels to distinguish one claim element having a certain name from another element having a same name (but for use of the ordinal term) to distinguish the claim elements. Although the present disclosure has been described with reference to certain preferred embodiments, it is to be understood that various other adaptations and modifications can be made within the spirit and scope of the present disclosure. Therefore, it is the aspect of the appended claims to cover all such variations and modifications as come within the true spirit and scope of the present disclosure.

Claims
  • 1. A feedback controller for controlling a mechanical system, the mechanical system having multiple degrees of freedom and comprising a plurality of actuators and a plurality of joints, the feedback controller comprising: a memory configured to store an energy-based inverse dynamics model trained with machine learning to map dynamic states of the mechanical system to corresponding torques for the plurality of actuators, wherein the energy-based inverse dynamics model is configured to model potential and kinetic energy of the mechanical system as Gaussian Processes of the dynamic states and derive Gaussian Processes for the torques from the Gaussian Processes of the dynamic states based on physics of relationship between the torques and the potential and kinetic energy of the mechanical system; anda processor configured to: collect a feedback signal of an operation of the mechanical system, the feedback signal including current states of dynamics of the mechanical system indicative of a position, a velocity, and an acceleration of each joint of the plurality of joints of the mechanical system;process the current states of dynamics with the energy-based inverse dynamics model to produce values of the torques for the plurality of actuators and values of the potential and kinetic energy of the mechanical system; andcontrol the mechanical system based on the produced values of the torques for the plurality of actuators of the mechanical system and the values of the potential and kinetic energy of the mechanical system.
  • 2. The feedback controller of claim 1, wherein the energy-based inverse dynamics model is configured to model the potential energy and the kinetic energy of the mechanical system as independent Gaussian Processes of the dynamic states of the mechanical system,wherein each of the Gaussian Processes for the potential energy and the kinetic energy of the mechanical system is defined based on a different kernel function, andwherein the kernel function for the potential energy of the mechanical system combined with a physics based first operator define a covariance of the potential energy of the mechanical system with respect to the torques of the actuators and the kernel function for the kinetic energy of the mechanical system combined with a physics based second operator define a covariance of the kinetic energy of the mechanical system with respect to the torques of the actuators.
  • 3. The feedback controller of claim 2, wherein the physics based first operator and the physics based second operator are given by a respective set of linear differential equations that relate a Lagrangian function of the mechanical system to the torques accordingly to first principles.
  • 4. The feedback controller of claim 1, wherein each of the Gaussian Processes for the potential energy and kinetic energy of the mechanical system is a zero-mean Gaussian Process.
  • 5. The feedback controller of claim 1, wherein the mechanical system is configured to perform a task or track a reference trajectory for performing the task,wherein the reference trajectory defines positions of joints of the mechanical system as a function of time, andwherein the processor is further configured to: produce values of the potential energy of the mechanical system at each position of each joint of the plurality of joints as a first function of a covariance between the potential energy of the mechanical system and the produced value of torque of a corresponding actuator of the plurality of actuators of the mechanical system at respective positions of the reference trajectory; andproduce values of the kinetic energy of the mechanical system at each position of each joint of the plurality of joints as a second function of a covariance between the kinetic energy and the produced value of torque of a corresponding actuator of the plurality of actuators of the mechanical system at respective positions of the reference trajectory.
  • 6. The feedback controller of claim 1, wherein the Gaussian Processes of the torques has a covariance matrix capturing correlations between the torques of the plurality of actuators, and wherein the covariance matrix is a full matrix including non-zero elements.
  • 7. The feedback controller of claim 1, wherein to control the mechanical system, the processor is configured to: compute separately an inertia matrix, an inertial torque component and one or more remaining torque components of the mechanical system from the estimated kinetic and potential energy of the mechanical system; anddetermine control commands to the actuators of the mechanical system based on the produced values of the torques and the inertia matrix, the inertial torque component and the one or more remaining torque components of the mechanical system.
  • 8. The feedback controller of claim 7, wherein the inertia matrix, the inertial torque component and the one or more remaining torque components of the mechanical system are composed in a classical feedback linearization controller.
  • 9. The feedback controller of claim 1, wherein the mechanical system is underactuated having more degrees of freedom than a number of the plurality of actuators, and wherein to control the mechanical system, the processor is configured to: compute an inertia matrix, an inertial torque component, a Coriolis torque component, a gravitational torque component, a derivative of the gravitational torque component with respect to the joint positions, a feedback linearizing controller offset, and matrices of a linear quadratic regulator of the mechanical system from the estimated kinetic and potential energy of the mechanical system; anddetermine control commands to the actuators of the mechanical system based on the produced values of the torques and one or more of the inertia matrix, the inertial torque component, the Coriolis torque component, the gravitational torque component, the derivative of the gravitational torque component with respect to the joint positions, the feedback linearizing controller offset, and the matrices of the linear quadratic regulator of the mechanical system.
  • 10. The feedback controller of claim 1, wherein the Gaussian Process for the torques is trained using supervised learning, wherein the torque measurements and the dynamic states measurements are measured with one more sensors during training phase.
  • 11. The feedback controller of claim 10, wherein the sensors comprise one or more of an encoder, a camera, a laser, a velocimeter, an accelerometer, an inertia measurement unit to measure the dynamic state and a force torque sensor, a string gauge, or a torque sensor.
  • 12. The feedback controller of claim 11, wherein the training of the Gaussian Processes utilizes random movements of the mechanical system, or an ad-hoc trajectory.
  • 13. The feedback controller of claim 1, wherein the energy based inverse dynamics model is defined by a Lagrangian polynomial kernel that is based on a Lagrangian operator mapping a Lagrangian function of the mechanical system to the torques of the plurality of actuators.
  • 14. The feedback controller of claim 13, wherein the Lagrangian function is defined based on a difference between the kinetic energy of the mechanical system and the potential energy of the mechanical system.
  • 15. The feedback controller of claim 13, wherein one or more hyperparameters of the Lagrangian polynomial kernel are learned based on a machine learning algorithm, the machine learning algorithm using maximization of marginal likelihood.
  • 16. The feedback controller of claim 1, wherein the processor is further configured to detect an anomaly of the mechanical system based on a comparison of the produced values of kinetic energy with a first threshold and the produced values of potential energy with a second threshold.
  • 17. The feedback controller of claim 1, wherein the processor is further configured to determine a motion plan that consumes a minimum amount of energy for performing the task, based on the produced values of kinetic energy and the produced values of potential energy.
  • 18. A computer-implemented method for controlling a mechanical system, the mechanical system having multiple degrees of freedom and comprising a plurality of actuators, the method comprising: accessing a memory to retrieve an energy-based inverse dynamics model trained with machine learning to map dynamic states of the mechanical system to corresponding torques for the plurality of actuators, wherein the energy-based inverse dynamics model is configured to model potential and kinetic energy of the mechanical system as Gaussian Processes of the dynamic states and derive Gaussian Processes for the torques from the Gaussian Processes of the dynamic states of the mechanical system based on physics of relationship between the torques and the potential and kinetic energy of the mechanical system;collecting a feedback signal of an operation of the mechanical system, the feedback signal including current states of dynamics of the mechanical system indicative of a position, a velocity, and an acceleration of each joint of the plurality of joints of the mechanical system;processing the current states of dynamics with the energy-based inverse dynamics model to produce values of the torques for the plurality of actuators and values of the potential and kinetic energy of the mechanical system; andcontrolling the mechanical system based on the produced values of the torques for the plurality of actuators of the mechanical system and the values of the potential and kinetic energy of the mechanical system.
CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation-in-part of U.S. patent application Ser. No. 18/222,540 filed Jul. 17, 2023, which claims benefit of priority from provisional application Ser. No. 63/469,002 filed May 25, 2023, the contents of which are incorporated by reference herein.

Provisional Applications (1)
Number Date Country
63469002 May 2023 US
Continuation in Parts (1)
Number Date Country
Parent 18222540 Jul 2023 US
Child 18772131 US