The disclosure relates to an imitation learning method, as well as to a computer program for implementing such a method, a computer device programmed so as to implement this method, and a robotic system comprising such a computer device and a multi-axis manipulator.
In the present context, “imitation learning”, also known as “learning by demonstration” or “programming by demonstration”, refers to methods allowing a robotic system to learn a set of actions by having them performed by an operator, so as to replicate them. Such imitation learning methods may be applied in a large variety of fields including, for instance, industrial or medical robotics. They may be used not only to program a robotic system for later replication of the actions of the human operator, but also for remote operation, where one or several remote multi-axis manipulators replicate the actions of the human operator in real time.
Imitation learning methods facilitate the programming of a robotic system, in particular of a robotic system comprising at least one multi-axis manipulator, even by operators without particular programming skills. Instead, the manual dexterity of the operator becomes crucial in ensuring a smooth, efficient motion to be replicated by the robotic system.
Nevertheless, even the most skilled human operator may be unable to achieve the smoothness and accuracy that can be achieved by a robotic system. Exact replication of the actions of a human operator will thus limit the potential of the robotic system to improve on the dexterity of the human operator.
Consequently, a first object of the present disclosure is that of providing an imitation learning method whereby a robotic system can learn to perform a set of operations with even higher accuracy and efficiency than a human user whose operations are to be replicated.
Accordingly, in at least one illustrative embodiment, this imitation learning method may comprise at least the steps of:
The capture step provides the input of spatial data corresponding to the operation of the training tool by the user. However, thanks to the subsequent waypoint selection step, it is possible to filter, from the teach-in trajectory, small user hesitations and deviations, thus resulting in a smoother set trajectory on whose basis the motion commands for the individual joints of the multi-axis manipulator will then be obtained. Said motion commands may be transmitted to a multi-axis manipulator in real time, for the remote operation of said multi-axis manipulator through the user-operated training tool. Alternatively or complementarily to this transmission, however, these motion commands may be stored for subsequent input to a multi-axis manipulator.
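The filtering effect of waypoint selection can be illustrated with a minimal sketch. The selection criterion is not given in this passage, so the minimum-spacing rule below, the function name `select_waypoints` and the parameter `min_step` are assumptions made purely for illustration: a captured sample is kept only if it lies at least `min_step` away from the last kept waypoint, which discards small hesitations around a point.

```python
import math

def select_waypoints(samples, min_step):
    """Keep a sample only if it is at least min_step from the last kept waypoint.

    Hypothetical selection rule: the source does not specify the criterion.
    """
    if not samples:
        return []
    waypoints = [samples[0]]
    for point in samples[1:]:
        # Small hesitations stay within min_step of the last waypoint and are dropped.
        if math.dist(point, waypoints[-1]) >= min_step:
            waypoints.append(point)
    return waypoints

# Teach-in positions in metres, with millimetre-scale jitter around two points.
teach_in = [
    (0.0, 0.0, 0.0), (0.001, 0.0, 0.0), (0.002, 0.001, 0.0),
    (0.05, 0.0, 0.0), (0.051, 0.001, 0.0), (0.10, 0.0, 0.0),
]
set_trajectory = select_waypoints(teach_in, min_step=0.02)
```

With these sample values, only the three well-separated positions would remain as waypoints. A practical implementation would also have to handle orientation and would likely interpolate between the selected waypoints.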
If the multi-axis manipulator is not infinitely redundant in the Cartesian space of the set trajectory, the conversion of the set trajectory into motion commands in a joint space of the multi-axis manipulator may be performed using an inverse kinematic model of the multi-axis manipulator.
However, said multi-axis manipulator may alternatively be infinitely redundant in said Cartesian space, and said conversion step may then comprise the calculation of an optimal path of redundant joint positions maximizing Yoshikawa index values for the multi-axis manipulator along the set trajectory. In this context, “redundant joint position” is understood as meaning a positional value in the joint space axis corresponding to a redundant joint. If the redundant joint is a rotating joint, this redundant joint position will have an angular value. By determining a position for each redundant joint, it is possible to solve the positions of the remaining joints. To each position vector of the multi-axis manipulator in joint space corresponds a Jacobian matrix, which is the transformation matrix from the joint speed vector to the speed vector of an end-effector of the multi-axis manipulator in Cartesian space. The Yoshikawa index is a manipulability index defined as the square root of the determinant of the product of this Jacobian matrix and its transpose. Maximizing the Yoshikawa index increases the accuracy of the multi-axis manipulator while reducing the joint speeds during its motion.
The calculation of said redundant joint trajectory may in particular comprise the steps of:
In order to ensure the quality of the optimal path, it may be subsequently validated using an accuracy index corresponding to a ratio of Cartesian space to joint space variation along said optimal path and/or an energy index corresponding to joint speeds in joint space along said optimal path.
The abovementioned spatial data comprising position and orientation of the training tool in the Cartesian space may be captured through an optical sensor and in particular a stereoscopic sensor, although other optical sensors suitable for capturing three-dimensional positional data, such as for instance time-of-flight sensors, may alternatively be used.
In order to identify both position and orientation of the training tool with such an optical sensor, said user-operated training tool may carry at least a first marker and two additional markers spaced along different axes from said first marker. To ensure redundancy, so that both position and orientation of the training tool can be identified even in low visibility conditions, a set of markers may be used comprising four markers of which no more than two are collinear.
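The reason three non-collinear markers suffice for position and orientation can be sketched as follows. The marker layout is an assumption for illustration: the first marker is taken as the tool origin, the second is assumed to lie along the tool's x axis, and the third somewhere in the tool's x-y plane. The function name `tool_pose` and its helpers are hypothetical.

```python
import math

def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    """Scale v to unit length; fails for a zero vector (collinear markers)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("markers are coincident or collinear")
    return tuple(x / norm for x in v)

def tool_pose(m0, m1, m2):
    """Return (origin, (x_axis, y_axis, z_axis)) from three marker positions."""
    x_axis = normalize(sub(m1, m0))
    # The normal of the marker plane gives the z axis; it vanishes if collinear.
    z_axis = normalize(cross(x_axis, sub(m2, m0)))
    y_axis = cross(z_axis, x_axis)  # completes a right-handed orthonormal frame
    return m0, (x_axis, y_axis, z_axis)

origin, axes = tool_pose((1.0, 2.0, 3.0), (2.0, 2.0, 3.0), (1.0, 4.0, 3.0))
```

If the three markers were collinear, the cross product would vanish and no orientation could be recovered, which is why the markers are spaced along different axes. A fourth marker would let any visible triple be used when one marker is occluded.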
Alternatively to the use of an optical sensor, however, said user-operated training tool may be carried by a multi-axis manipulator, a manual operation of said user-operated training tool being servo-assisted by the multi-axis manipulator carrying the user-operated training tool, and said spatial data being captured through joint position sensors of the multi-axis manipulator carrying the user-operated training tool. For said servo-assistance, user force inputs may for instance be sensed by force sensors at the training tool and converted into joint actuator commands for the multi-axis manipulator carrying the user-operated training tool.
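How sensed user forces might be turned into joint-level quantities can be sketched with the textbook Jacobian-transpose relation between an end-effector force and joint torques. The source does not state which conversion is used, so this mapping, the function name `joint_torques_from_force` and the numerical values are assumptions for illustration only. A real servo-assistance loop would typically add an admittance or impedance law on top.

```python
def joint_torques_from_force(jacobian, force):
    """Map an end-effector force F to joint torques tau = J^T * F.

    jacobian: list of rows (one row per Cartesian axis, one column per joint).
    force: sensed force components, one per Jacobian row.
    """
    n_joints = len(jacobian[0])
    return [
        sum(jacobian[row][joint] * force[row] for row in range(len(force)))
        for joint in range(n_joints)
    ]

# Hypothetical 2x3 Jacobian of a planar three-joint arm and a 10 N push along y.
J = [[-2.0, -2.0, -1.0],
     [1.0, 0.0, 0.0]]
torques = joint_torques_from_force(J, [0.0, 10.0])
```

In this configuration only the first joint would need to contribute torque, since the other two columns of the Jacobian have no component along the pushed axis.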
The disclosed imitation-learning method may in particular be computer-implemented. Consequently, the present disclosure also relates to a computer program for implementing such an imitation learning method, to a computer-readable data storage medium containing an instruction set for implementing such an imitation learning method, to a computing device programmed with an instruction set for carrying out such an imitation learning method, and to a robotic system comprising such a computing device and a multi-axis manipulator connected to it for its control.
The above summary of some example embodiments is not intended to describe each disclosed embodiment or every implementation of the invention. In particular, selected features of any illustrative embodiment within this specification may be incorporated into an additional embodiment unless clearly stated to the contrary.
The invention may be more completely understood in consideration of the following detailed description of various embodiments in connection with the accompanying drawings, in which:
While the invention is amenable to various modifications and alternative forms, specifics thereof have been shown by way of example in the drawings and will be described in detail. It should be understood, however, that the intention is not to limit aspects of the invention to the particular embodiments described. On the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the scope of the invention.
For the following defined terms, these definitions shall be applied, unless a different definition is given in the claims or elsewhere in this specification.
As used in this specification and the appended claims, the singular forms “a”, “an”, and “the” include plural referents unless the content clearly dictates otherwise. As used in this specification and the appended claims, the term “or” is generally employed in its sense including “and/or” unless the content clearly dictates otherwise.
The following detailed description should be read with reference to the drawings in which similar elements in different drawings are numbered the same. The detailed description and the drawings, which are not necessarily to scale, depict illustrative embodiments and are not intended to limit the scope of the invention. The illustrative embodiments depicted are intended only as exemplary. Selected features of any illustrative embodiment may be incorporated into an additional embodiment unless clearly stated to the contrary.
Imitation learning is known to be a useful and particularly user-friendly technique for programming complex operations in multi-axis manipulators.
Sensor 5 may in particular be an optical sensor, and more specifically a stereoscopic sensor, generating two laterally offset images whose parallax can then be used to infer depth data. However, various other types of sensors suitable for providing three-dimensional position data may be considered, such as for instance so-called time-of-flight sensors.
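The way parallax yields depth can be sketched under the usual rectified pinhole stereo assumption, where depth equals focal length times baseline divided by disparity. The camera parameters below (`focal_px`, `baseline_m`, `cx`, `cy`) and the function names are hypothetical and not taken from the source.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth along the optical axis for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px

def triangulate(u_left, v, u_right, focal_px, baseline_m, cx, cy):
    """Return (x, y, z) in the left camera frame from matched pixel coordinates."""
    z = depth_from_disparity(focal_px, baseline_m, u_left - u_right)
    x = (u_left - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)

# A marker seen at column 420 in the left image and 380 in the right image.
point = triangulate(u_left=420.0, v=300.0, u_right=380.0,
                    focal_px=800.0, baseline_m=0.1, cx=400.0, cy=300.0)
```

Because depth is inversely proportional to disparity, depth resolution degrades for distant markers, which is one reason a time-of-flight sensor may be preferred in some setups.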
As shown in
While in the first embodiment illustrated in
While an optical sensor 5 is used in both the first and second illustrated embodiments, alternative arrangements may also be used to capture the position and orientation of a user-operated training tool 3. In the third embodiment illustrated in
In each embodiment, the computing device may be a conventional programmable computer running a computer program implementing these methods. This computer program may be in the form of a set of instructions stored in a memory carrier. In the present context, “memory carrier” and “data storage medium” should be understood as meaning any physical medium capable of containing data readable by a reading device for at least a certain period of time. Examples of such memory carriers are magnetic tapes and discs, optical discs (read-only as well as recordable or re-writable), logical circuit memories, such as read-only memory chips, random-access memory chips and flash memory chips, and even more exotic data storage media, such as chemical, biochemical or mechanical memories.
Even a highly-skilled, highly-dexterous human operator will be unable to suppress some tremor and hesitation during his operation.
In a three-dimensional Cartesian space, a six-axis manipulator, such as those illustrated in FIGS. 1A, 1B, 3 and 4, is finitely redundant, that is, offers only a finite number of solutions in joint space for a given end-effector position and orientation in the Cartesian space. Consequently, the step of converting a set trajectory for the end-effector in Cartesian space into motion commands in joint space can be carried out using an inverse kinematic model of the six-axis manipulator and well-known singularity avoidance algorithms, relying for instance on the Yoshikawa index, on singularity avoidance by angular velocity inputs, or on the damped least-squares method. With at least one additional joint, however, like the seven-axis manipulator 7′ illustrated in
A suitable indicator of the manipulability of a multi-axis manipulator is the Yoshikawa index μ, defined by the equation:
$$\mu = \sqrt{\det\left(J \cdot J^{T}\right)}$$
wherein J is the Jacobian matrix of the multi-axis manipulator, that is, the matrix determining the relationship between end-effector velocities $\dot{X}$ in the Cartesian space and joint velocities $\dot{q}$ in joint space, according to the equation:
$$\dot{X} = J \cdot \dot{q}$$
For example, with a seven-axis manipulator with seven serially arranged rotational joints, this equation can be expressed as:
$$\begin{bmatrix} \dot{x} & \dot{y} & \dot{z} & \dot{\alpha} & \dot{\beta} & \dot{\gamma} \end{bmatrix}^{T} = J \cdot \begin{bmatrix} \dot{\theta}_1 & \dot{\theta}_2 & \cdots & \dot{\theta}_7 \end{bmatrix}^{T}$$
wherein $\dot{x}$, $\dot{y}$ and $\dot{z}$ are linear speeds of the end-effector in three orthogonal axes in the Cartesian space, $\dot{\alpha}$, $\dot{\beta}$ and $\dot{\gamma}$ are angular speeds of the end-effector around three orthogonal axes in the Cartesian space, and $\dot{\theta}_1$ to $\dot{\theta}_7$ are angular speeds of each one of the seven rotational joints around their respective rotation axes.
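The Yoshikawa index can be computed directly from its definition. The sketch below is an illustration, not the disclosed implementation: to stay short it uses a planar three-joint arm positioned in two dimensions (one redundant joint) instead of the seven-axis manipulator, and the helper names and unit link lengths are assumptions. The function `yoshikawa_index` itself accepts any Jacobian with at least as many columns as rows.

```python
import math

def gram_matrix(J):
    """Return J * J^T for a Jacobian given as a list of rows."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in J] for row_i in J]

def determinant(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [list(row) for row in M]
    n = len(A)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            return 0.0  # numerically singular
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            det = -det  # a row swap flips the sign
        det *= A[col][col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
    return det

def yoshikawa_index(J):
    """mu = sqrt(det(J * J^T)); clamped at zero against rounding noise."""
    return math.sqrt(max(determinant(gram_matrix(J)), 0.0))

def planar_3r_jacobian(q, lengths=(1.0, 1.0, 1.0)):
    """Position Jacobian (2x3) of a planar arm with three rotational joints."""
    l1, l2, l3 = lengths
    a1, a2, a3 = q[0], q[0] + q[1], q[0] + q[1] + q[2]
    row_x = [-l1 * math.sin(a1) - l2 * math.sin(a2) - l3 * math.sin(a3),
             -l2 * math.sin(a2) - l3 * math.sin(a3),
             -l3 * math.sin(a3)]
    row_y = [l1 * math.cos(a1) + l2 * math.cos(a2) + l3 * math.cos(a3),
             l2 * math.cos(a2) + l3 * math.cos(a3),
             l3 * math.cos(a3)]
    return [row_x, row_y]

mu_stretched = yoshikawa_index(planar_3r_jacobian((0.0, 0.0, 0.0)))      # singular pose
mu_bent = yoshikawa_index(planar_3r_jacobian((0.0, math.pi / 2, 0.0)))   # about 2.236
```

The fully stretched arm is at a singularity, so its index is zero, while the bent configuration gives a clearly positive value. This is the behaviour a path optimization would exploit when choosing redundant joint positions.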
In a second, alternative approach, the optimal path 15 is extracted by using an optimization algorithm to optimize the coefficients of a linearized polynomial redundant joint trajectory maximizing the value of the Yoshikawa index μ. In particular, a derivative-free optimization algorithm such as the Nelder-Mead algorithm may be used, although other optimization approaches, for example a genetic algorithm or a neural network such as a multilayer perceptron, may also be considered.
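One way to write down this optimization is given below as a sketch only. The source does not give the parameterization, so the path parameter $s$, the polynomial degree $K$, the coefficients $a_k$, the sample points $s_j$ and the choice of a summed objective are all assumptions introduced for illustration.

$$\theta_r(s) = \sum_{k=0}^{K} a_k \, s^{k}, \qquad s \in [0, 1]$$

$$a^{*} = \arg\max_{a_0, \dots, a_K} \; \sum_{j=1}^{N} \mu\big(q(s_j; a)\big)$$

Here $\theta_r(s)$ would be the redundant joint position along the set trajectory, and $q(s_j; a)$ the joint position vector obtained by fixing the redundant joint at $\theta_r(s_j)$ and solving the remaining joints for the set trajectory at $s_j$. Since the Nelder-Mead algorithm is conventionally stated as a minimizer, it would be applied to the negative of this sum.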
The resulting optimal path 15 for the redundant joint in joint space may then be validated in step S805 using an accuracy and/or an energy index calculated over the whole path. For each position, the accuracy index $C_{\mathrm{accuracy}}$ corresponds to a relationship between positional change of the manipulator end-effector in Cartesian space and corresponding changes of the joint positions in joint space. The direct kinematic model of a seven-axis manipulator with seven serial rotational joints can be expressed as a matrix $T_{1,7}$ fulfilling the equation:
wherein x, y and z are the positions of the manipulator end-effector in the three orthogonal axes of the Cartesian space, α, β and γ are orientation angles of the manipulator end-effector around respective orthogonal axes of the Cartesian space and $\theta_1$ to $\theta_7$ are angular positions of each one of the seven rotational joints around their respective rotation axes. Using this direct kinematic model $T_{1,7}$ it is also possible to determine the effect on the position and orientation of the end-effector of small variations in the joint angles. Thus, for a position in joint space, with given joint angles $\theta_1$ to $\theta_7$, it is possible to calculate an error vector ΔX according to the following equation:
wherein $\Delta\theta_{i,j}$ correspond to small variations in the respective joint angle $\theta_i$. For instance, for each joint $i$, three different variations may be chosen, $\Delta\theta_{i,1} = -0.1$ rad, $\Delta\theta_{i,2} = 0.0$ rad, and $\Delta\theta_{i,3} = +0.1$ rad. A scalar value can be calculated for the accuracy index $C_{\mathrm{accuracy}}$ on the basis of this error vector ΔX, according to the following equation:
$$C_{\mathrm{accuracy}} = \sqrt{\Delta x^{2} + \Delta y^{2} + \Delta z^{2}} + \Delta\alpha + \Delta\beta + \Delta\gamma$$
Consequently, this accuracy index $C_{\mathrm{accuracy}}$ decreases with increasing accuracy of the manipulator, that is, decreasing positional sensitivity of the end-effector to changes in the joint positions.
The energy index $C_{\mathrm{energy}}$ is based on the instantaneous joint speeds for all manipulator axes along said optimal path. For an infinitely redundant multi-axis manipulator with n rotational axes in series, it can be calculated as the average of the absolute values of the angular speeds $\dot{\theta}_i$ of the axes $i = 1$ to $n$, according to the following equation:
$$C_{\mathrm{energy}} = \frac{1}{n} \sum_{i=1}^{n} \left|\dot{\theta}_i\right|$$
Consequently, this energy index $C_{\mathrm{energy}}$ reflects the speed of the joints at each point in the optimal path. Both the accuracy index $C_{\mathrm{accuracy}}$ and the energy index $C_{\mathrm{energy}}$ will spike near a singularity in joint space. Therefore, both these indexes, or either one of them, may be used to validate said optimal path, for instance by setting maximum thresholds for each index, or a single threshold for a sum of both indexes.
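The threshold-based validation described above can be sketched as follows. The accuracy index is computed from a given error vector ΔX and the energy index as the average absolute joint speed. The function names, threshold values and sample data are assumptions for illustration, and the error vectors are taken as already computed.

```python
import math

def accuracy_index(delta_x):
    """C_accuracy = sqrt(dx^2 + dy^2 + dz^2) + d_alpha + d_beta + d_gamma."""
    dx, dy, dz, d_alpha, d_beta, d_gamma = delta_x
    return math.sqrt(dx * dx + dy * dy + dz * dz) + d_alpha + d_beta + d_gamma

def energy_index(joint_speeds):
    """C_energy = average of the absolute joint speeds at one path point."""
    return sum(abs(speed) for speed in joint_speeds) / len(joint_speeds)

def validate_path(error_vectors, joint_speed_samples, max_accuracy, max_energy):
    """Accept the path only if both indexes stay below their thresholds everywhere."""
    for delta_x, speeds in zip(error_vectors, joint_speed_samples):
        if accuracy_index(delta_x) > max_accuracy:
            return False
        if energy_index(speeds) > max_energy:
            return False
    return True

# Hypothetical values at two points of a path (metres, radians, rad/s).
error_vectors = [(0.003, 0.004, 0.0, 0.01, 0.02, 0.03),
                 (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)]
joint_speeds = [[0.1, -0.2, 0.3, -0.4],
                [0.0, 0.0, 0.0, 0.0]]

path_ok = validate_path(error_vectors, joint_speeds, max_accuracy=0.1, max_energy=0.5)
path_rejected = validate_path(error_vectors, joint_speeds, max_accuracy=0.05, max_energy=0.5)
```

The variant with a single threshold on the sum of both indexes would only change the comparison inside the loop.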
Those skilled in the art will recognize that the present invention may be manifested in a variety of forms other than the specific embodiments described and contemplated herein. Accordingly, departure in form and detail may be made without departing from the scope of the present invention as described in the appended claims.
| Number | Date | Country | Kind |
|---|---|---|---|
| 12166888.3 | May 2012 | EP | regional |

| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/EP2013/059298 | 5/3/2013 | WO | 00 |