The invention relates to a method for determining optimized program parameters for a robot program, the robot program being used to control a robot having a manipulator, preferably in a robot cell.
Methods and systems for determining program parameters for a robot program have been known in practice for some years. These refer to the programming of a robot, wherein suitable program parameters usually need to be selected manually for the corresponding robot program.
In manufacturing industry, industrial robots are used in particular for accomplishing complex manipulation and assembly tasks as well as for surface treatment, if the workpieces to be processed or the application tasks to be carried out have a degree of variability. The ability of industrial robot arms to access almost any tool or workpiece position and orientation within their working space, in combination with suitable end effectors, enables different application tasks to be accomplished or different workpiece variants to be processed within a robot cell.
Production cells with industrial robots are traditionally programmed using text, wherein for the initial parameterization, poses or partial movements are taught via teach-in procedures using so-called teach-pendants. Numerous manufacturer-specific and cross-manufacturer commercial products facilitate the offline programming of robot cells by the automatic generation of robot code and semi-automatic path generation based on CAD models of the robot cell and the workpieces to be processed (“CAD to Path”). Component-based programming systems or programming software, such as the ArtiMinds Robot Programming Suite (RPS), RoboDK or drag&bot, simplify robot programming by encapsulating atomic motion primitives into abstract program components that can be combined into complex manipulation sequences.
Symbolic, parameterizable program representations are an established practice in service-based and industrial robotics. Task models usually consist of atomic, parameterizable action primitives, which can be combined by control-flow and logic primitives into complex action sequences and translated into sequences of specific robot movements. Generalized manipulation strategies and their implementation, such as the ArtiMinds Task Model, represent action primitives as groups of possibly learned constraints in the joint-angle or Cartesian space, from which movements are generated that satisfy these constraints. In this context, reference is made to German patent DE 10 2015 204 641 A1.
Other approaches generate abstract task plans from ontology-based knowledge databases or use explicit domain-specific languages (DSLs) to specify the problem to be solved and derive actions of the robot.
In industry, the optimization of program parameters is a predominantly manual process that requires expert knowledge. Various commercial products exist for the visual and quantitative support of this process; after aggregation and statistical evaluation of the data of robots as well as external sensors and actuators, they calculate process parameters and display the suitably processed data. Examples are the commercial software solutions ArtiMinds Learning & Analytics for Robots (LAR), KUKA Connect, Siemens MindSphere, Bosch Nexeed or IXON. For example, with the Teach-Point Optimization (TPO) feature, ArtiMinds LAR enables the automatic adjustment of individual robot program parameters based on statistics derived from past program executions. Most robots in complex production plants are operated in external automatic mode and are automatically parameterized at runtime by programmable logic controllers, wherein the parameter sets are usually fixed per batch. Some platforms such as MindSphere or Nexeed allow the optimization and adaptation of certain parameters of the process controller in order to improve indicators such as throughput or cycle time, but operate at the macro level, so that, for example, fine-tuning of program parts of a robot program is not possible.
Since the behavior of a robot is specified in software, the development and maintenance effort of robot cells is relocated from the hardware into the software. A robust solution to complex manipulation tasks with industrial robots depends to a large extent on task-specific program parameters such as speeds, accelerations, force specifications or target points, which must be precisely matched to the task to be solved, the geometry and physical properties of the robot cell as well as the workpieces to be processed. Especially when commissioning new robot cells, fine-tuning of the program parameters is very time-consuming, requires highly specialized expert knowledge and delays the productive operation of the robot cell.
The object of the present invention is therefore to design and further develop a method and a system for determining optimized program parameters for a robot program of the type mentioned above, in such a way that the process of finding optimized program parameters for the robot program is simplified or improved.
The above object is achieved according to the invention by means of the features of claim 1. According to the claim, a method for determining optimized program parameters for a robot program is specified, wherein the robot program is used to control a robot having a manipulator, preferably in a robot cell, the method comprising the following steps:
The above object is additionally achieved by the features of claim 17. According to the claim, a system for determining optimized program parameters for a robot program is specified, the robot program being used to control a robot having a manipulator, preferably in a robot cell. This system comprises:
According to the invention, it has first been recognized that it is quite a considerable advantage if program parameters that are optimized for a robot program or as optimal as possible for the respective application can be found in a maximally automated manner. In a further aspect of the invention, it has been recognized that a fine-tuning or fine adjustment of critical program parts of a robot program and their optimization with respect to application-specific target functions promises significant efficiency increases with regard to the programming, commissioning and/or maintenance phase of a robot. To define a program structure, a robot program is first created using a component-based graphical programming system based on user inputs. The robot program is formed from program components, wherein the program components can be parameterized via program parameters. The robot program therefore represents a semi-symbolic robot program. In addition, initial and thus preliminary program parameters for the program components of the robot program are generated or defined.
According to the invention, one or more critical program components can then be selected using a provided interface, wherein optimizable program parameters for the critical program components can be defined. In the course of an exploration phase, an automatic and stochastic exploration of a parameter space is carried out with regard to the optimizable program parameters. For this purpose, the robot program is executed multiple times or repeatedly, wherein an automatic sampling of the parameter space is carried out for the critical program components and resulting trajectories of the robot are recorded. Thus, training data can be collected for the critical program components at each execution of the robot program.
In the course of a subsequent learning phase, component representatives for the critical program components of the robot program are generated using the training data collected during the exploration phase. A component representative is a system model that, in the form of a differentiable function, maps a specified state of the robot measured or ascertained during the exploration phase and specified program parameters to a predicted, i.e. expected, trajectory.
Finally, optimized program parameters for the critical program components of the robot program are determined in an inference phase. For this purpose, by means of a gradient-based optimization procedure using the previously generated component representatives, the optimizable program parameters of the component representatives are iteratively optimized with respect to a specified target function. For example, this results in an optimal parameter vector for each critical component. The optimized program parameters can be automatically transferred to the robot program. Thus, a robot program with optimum program parameters with respect to a specified target function can be achieved.
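Purely by way of illustration, and not as a description of the claimed implementation, the inference phase can be pictured as a projected gradient descent over a differentiable surrogate. In the following minimal sketch, the linear model `predict_force`, the quadratic target, the step size and the domain bounds are all assumptions chosen for the example; a real component representative would be a trained neural network rather than a closed-form function.

```python
# Toy stand-in for a trained component representative (hypothetical):
# maps one optimizable parameter, an approach velocity in m/s, to a
# predicted contact force in N.
def predict_force(velocity: float) -> float:
    return 2.0 + 40.0 * velocity


def predict_force_grad(velocity: float) -> float:
    # Derivative of the surrogate with respect to the parameter.
    return 40.0


# Target function: squared deviation from a desired contact force.
def target(force: float, desired: float = 6.0) -> float:
    return (force - desired) ** 2


def target_grad(force: float, desired: float = 6.0) -> float:
    return 2.0 * (force - desired)


def optimize(v0: float, domain: tuple, lr: float = 1e-4, steps: int = 500) -> float:
    """Projected gradient descent over one parameter and its domain."""
    lo, hi = domain
    v = v0
    for _ in range(steps):
        force = predict_force(v)
        # Chain rule: d target / d parameter.
        grad = target_grad(force) * predict_force_grad(v)
        # Gradient step, then projection back into the parameter domain.
        v = min(max(v - lr * grad, lo), hi)
    return v


best = optimize(v0=0.2, domain=(0.01, 0.25))
print(round(best, 6), round(target(predict_force(best)), 6))
```

The projection step keeps every iterate inside the permissible value range, so a parameter whose unconstrained optimum lies outside its domain ends up on the domain boundary.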
Consequently, using the method according to the invention for determining optimized program parameters for a robot program and using the system according to the invention, a simplified and improved process of finding optimized program parameters is possible.
A “component-based graphical programming system” can be understood—in particular in the context of the claims and preferably in the context of the description—as a programming system or a programming software that allows an encapsulation of atomic motion primitives into abstract program components, wherein the program components can be combined to form complex manipulation sequences. The ArtiMinds Robot Programming Suite (RPS), RoboDK or drag&bot are just some examples of possible component-based programming systems.
A “semi-symbolic robot program” can be understood (in particular in the context of the claims and preferably in the context of the description) as a robot program that has a symbolic structure, being composed of individual program components, but whose program components are variable in their behavior, because the exact behavior of each component also depends on its parameterization. Program components that are discrete but can each be parameterized have both properties and can therefore be regarded as semi-symbolic. A component-based graphical programming system can generate a semi-symbolic robot program.
At this point it should be noted that a “program component”—in particular in the context of the claims and preferably in the context of the description—can be understood as the smallest unit of a symbolic or semi-symbolic robot program that can be configured by the user. The program component represents a predefined action of the robot. Program components can be combined sequentially into complex robot programs. Program components can be parameterized, i.e. they accept a vector of parameters, the values of which can be specified by the robot programmer when the robot program is created. Each program component has exactly one type that determines the action that the program component represents. Examples of program components are “Gripping”, “Point-to-point movement”, “Contact run (relative)”, “Arc run”, “Spiral search (relative)”, “Torque-controlled joining”, “Force-controlled pressing”, “Palletizing”, etc.
A “critical program component” can be understood—in particular within the context of the claims and preferably within the context of the description—as a program component for which optimized parameters are to be determined.
A “component representative” can be understood—in particular within the context of the claims and preferably within the context of the description—as a system model for a program component, which models the behavior of the program component during its execution. For example, in the context of the definition of a system model, a component representative can map a vector of input parameters and the system state present at the time of execution onto the trajectory to be expected during execution, wherein as part of the definition of a system model the system can include the robot arm and, if necessary, the environment of the robot and, if necessary, the objects manipulated during the execution of the program component.
A “system model” can be understood—in particular in the context of the claims and preferably in the context of the description—as a mathematical model which approximates the behavior of a system in a simplified way. For example, a “system model” can be defined as a mathematical function ƒ which outputs the expected trajectory I″ given the input parameters x and the system state p. ƒ therefore implicitly includes the program logic (the translation of x into control commands for the robot by the robot program), the kinematics and dynamics of the robot, and the physical properties of the environment.
In particular, in the context of the claims and preferably in the context of the description, a “trajectory” can be understood as a sequence of vectors sampled with a fixed sampling interval, which can contain information about the state of the robot and optionally also about its environment. Solely as an example as part of an advantageous embodiment, trajectories can contain one or more of the following types of information at each time step:
In an extended—purely exemplary—configuration, it would be conceivable and advantageous to extend trajectories to include one or more of the following types of information:
Advantageously, parameter domains can be defined for the optimizable program parameters of the critical program components, wherein the optimizable program parameters are optimized over the parameter domains. A parameter domain represents a permissible value range for the optimizable program parameter. Advantageously, a permissible value range or parameter domain is provided for each program parameter that can be optimized.
In a further advantageous way, the parameter domains for the optimizable program parameters of the critical program components can be specified and/or are predefinable or adjustable. Thus, a domain can also be preset. This means that the parameter domain for a program parameter could already be specified by the underlying system. Furthermore, it is conceivable that a robot programmer/user selects a parameter domain for the program parameters of the critical program components to be optimized, over which the optimization will be performed. This parameter domain is application-specific and advantageously can be chosen narrowly enough to meet safety requirements on the manufacturing process as well as minimum quality and cycle-time requirements.
With a view to obtaining suitable training data, the optimizable program parameters can be sampled from their respective parameter domains during the exploration phase in order to sample the parameter space. This means that a value for each optimizable program parameter is drawn at random from its parameter domain. It is conceivable that the optimizable program parameters are sampled in an equally distributed manner, i.e. a uniformly distributed sampling is carried out. This provides the advantage that any sampling errors are spread widely across the sampling space. Equally distributed sampling provides sufficient randomness to avoid systematic undersampling, while ensuring uniform coverage of the parameter space. Furthermore, it is conceivable that the optimizable program parameters are sampled adaptively. Thus, an adaptive sampling can be performed that preferentially samples the regions in which more information is needed.
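As a minimal sketch of the equally distributed sampling described above, each optimizable parameter can be drawn independently and uniformly from its domain. The parameter names and bounds below are hypothetical and serve only as an example.

```python
import random


def sample_parameters(domains: dict, rng: random.Random) -> dict:
    """Draw one value per optimizable parameter, uniformly from its domain."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in domains.items()}


# Hypothetical parameter domains: (lower bound, upper bound).
domains = {"velocity": (0.01, 0.25), "contact_force": (2.0, 10.0)}

rng = random.Random(0)  # fixed seed so that the exploration is repeatable
samples = [sample_parameters(domains, rng) for _ in range(1000)]
print(samples[0])
```

An adaptive scheme would replace `rng.uniform` with a sampler that concentrates on regions where the collected data are still sparse; that variant is not shown here.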
Advantageously, the robot program can be stored in a serialized form, preferably in a database, in a format that allows a reconstruction and parameterization of the robot program or its program components. Also advantageously, the format may comprise a sequential execution sequence of the program components, types of program components, IDs of program components, constant program parameters, and/or optimizable program parameters. The format and the stored data can therefore enable a particularly efficient handling and implementation. Further advantages of these features include the possibility to create the overall system models, consisting of sequences of the component representatives for the components contained in the program structure, fully automatically based on the stored program structure, component types and component parameters. Another advantage is the facility to reuse data from earlier explorations (possibly for other robot programs) in parts for training new component representatives (e.g. for execution in modified environments, etc.) for components of the same component type at any later time, as the specified format allows the subsequent readout of component types and parameters.
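One conceivable serialization, shown here only as a sketch, stores the ordered components with their type, ID, constant parameters and optimizable parameters as JSON. The class names, field names and the JSON layout are assumptions made for this example and are not prescribed by the description.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ProgramComponent:
    component_id: str
    component_type: str
    constant_params: dict = field(default_factory=dict)
    optimizable_params: dict = field(default_factory=dict)


@dataclass
class RobotProgram:
    name: str
    components: list  # list order is the sequential execution order


def serialize(program: RobotProgram) -> str:
    # asdict converts nested dataclasses, lists and dicts recursively.
    return json.dumps(asdict(program), sort_keys=True)


def deserialize(text: str) -> RobotProgram:
    raw = json.loads(text)
    components = [ProgramComponent(**c) for c in raw["components"]]
    return RobotProgram(name=raw["name"], components=components)


program = RobotProgram(
    name="pin_insertion",  # hypothetical example program
    components=[
        ProgramComponent("c3", "Linear motion",
                         constant_params={"target": [0.4, 0.0, 0.3]},
                         optimizable_params={"velocity": 0.1}),
        ProgramComponent("c8", "Contact run (relative)",
                         optimizable_params={"velocity": 0.02, "force": 5.0}),
    ],
)
restored = deserialize(serialize(program))
print(restored == program)
```

Because the component type is stored with every entry, data recorded for one program could later be filtered by type when training a representative for the same component type in another program.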
In an advantageous way, for one or for each execution of the robot program carried out in the exploration phase, a resulting sampled trajectory can be stored in such a way that an associated program component and a parameterization of the associated program component can be uniquely assigned to each data point of the trajectory at the time of the respective execution. This enables particularly efficient handling and implementation of the data stored with the trajectory. An advantage of this format is the possibility to use the stored data retrospectively at any later point in time for training new component representatives of the same type, since the sub-trajectories for program components of specific types can be directly assigned and extracted from the overall trajectory.
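The assignment of data points to program components can be sketched as follows, assuming that every recorded sample carries the ID of the component that was active at that time. The `Sample` layout and the IDs are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Sample(NamedTuple):
    t: float            # time stamp in seconds
    component_id: str   # component active when the sample was recorded
    state: tuple        # e.g. TCP position and force; layout is hypothetical


def split_by_component(trajectory: list) -> dict:
    """Group the samples of one program execution by program component."""
    parts = defaultdict(list)
    for sample in trajectory:
        parts[sample.component_id].append(sample)
    return dict(parts)


trajectory = [
    Sample(0.00, "c3", (0.00, 0.0)),
    Sample(0.05, "c3", (0.01, 0.0)),
    Sample(0.10, "c8", (0.02, 1.5)),
    Sample(0.15, "c8", (0.02, 4.8)),
]
sub_trajectories = split_by_component(trajectory)
print({cid: len(part) for cid, part in sub_trajectories.items()})
```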
With regard to the collection of training data in the exploration phase, the robot program can be executed automatically, wherein at least 100 executions, preferably at least 1000 executions, of the robot program are carried out to obtain the training data. The automated execution of the robot program has the advantage that no human resources are tied up during the exploration phase and it enables the time- and resource-efficient collection of real training data. The number of executions of the robot program during the exploration phase advantageously affects the quality of the program parameters optimized in the inference phase, since a higher number of training data samples means a finer sampling of the parameter space and the system behavior, allowing the neural networks underlying the component representatives to learn to approximate the system behavior more robustly and more precisely given different parameters. Since the component representatives form the basis for the system for optimizing the program parameters, with larger amounts of training data comprehensively optimized parameter sets can be expected that come closer to the globally optimal parameterization.
The training data collected in the exploration phase for each execution of the robot program can advantageously comprise a parameterization, in particular constant and/or optimizable program parameters, of the critical program components and a resulting sampled trajectory of the critical program components. This means that the component representatives can be generated during the learning phase. The optimizable program parameters that were randomly sampled, i.e. randomly generated, in the exploration phase can thus be stored as part of the training data and associated with the execution of the robot program. The common storage of program parameters and trajectories simplifies the implementation considerably, since only one database or one storage format needs to be integrated.
Advantageously, the training data collected in the exploration phase for each executed program component can additionally comprise an ID (that is, an identifier or a code) and/or a status code. The ID can be used to associate a parameterization and a trajectory with the program component to which they belong. The status code can be used to store success/failure of the execution and can therefore be an important part of the program component semantics that the component representatives can learn. Thus, for example, the error rate can be used as a target function to be minimized in the optimization. As a result, the range of usable target functions is expanded.
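As a small illustration of how stored status codes could feed an error-rate criterion, the following sketch computes the empirical failure rate of one component from recorded executions. The record layout and the boolean encoding of the status code are assumptions for this example.

```python
from dataclasses import dataclass


@dataclass
class ExecutionRecord:
    component_id: str
    params: dict
    trajectory: list
    status_ok: bool  # assumed encoding: True means the action succeeded


def error_rate(records: list, component_id: str) -> float:
    """Fraction of failed executions of the given program component."""
    relevant = [r for r in records if r.component_id == component_id]
    if not relevant:
        raise ValueError(f"no records for component {component_id!r}")
    return sum(not r.status_ok for r in relevant) / len(relevant)


records = [
    ExecutionRecord("c8", {"velocity": 0.02}, [], True),
    ExecutionRecord("c8", {"velocity": 0.08}, [], False),
    ExecutionRecord("c8", {"velocity": 0.03}, [], True),
    ExecutionRecord("c3", {"velocity": 0.10}, [], True),
]
print(error_rate(records, "c8"))
```

In the method itself the success probability is predicted by the component representative, so that it remains differentiable with respect to the parameters; the empirical rate above only shows what the stored status codes make available.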
With regard to the efficient generation of component representatives, in the learning phase for the critical program components, learnable component representatives can be generated first, wherein the learnable component representatives are trained with the training data of the exploration phase, in order then to represent system models for sub-processes encapsulated in the associated critical program components as trained component representatives. This enables the simple software-based implementation of component representatives as object-oriented classes (for each type of program component there is one (software) class, which includes the implementation of the trajectory generator for this component type, the architecture of the neural network, and the logic necessary for the training). These classes only need to be developed once (e.g. as part of a software product for graphical robot programming) and can then be repeatedly instantiated to specific component representatives, the neural network of which is then trained.
Advantageously, the component representatives can comprise a recurrent neural network. This provides a universal applicability. Since a deep neural network is used as the system model, the described procedure does not make any assumptions about the nature (e.g. parametric distribution, normal distribution, linearity) of the input and output data and can therefore be used in all production domains as well as in principle for all component types. Since no further requirements are placed on the target function except for the ability to be differentiated, any target functions are conceivable. The method can therefore be used in any application domain, such as assembly, surface treatment or handling, and enables the optimization of robot programs with regard to any process indicators or quality criteria.
With regard to an efficient generation of component representatives, an analytical trajectory generator can be placed upstream of the recurrent neural network; the generator is conveniently implemented in a differentiable form. The analytical trajectory generator is designed to generate a prior trajectory. Long, finely sampled trajectories in particular contain a lot of redundant information, and large sequence lengths can significantly complicate the learning problem when neural networks are used for prediction; placing an analytical trajectory generator upstream of the neural network counteracts this. For example, the trajectory generator can consist of a differentiable implementation of an offline robot simulator. Thus, for example, software libraries for motion planning with robots, such as Orocos KDL (https://www.orocos.org/kdl) or MoveIt (https://moveit.ros.org/), can be modified by adding a capability to differentiate the output (the prior trajectory) with respect to the input parameters. Specifically, the algorithms implemented there for motion planning can be converted into differentiable calculation graphs. In an exemplary implementation according to one exemplary embodiment, this conversion can be performed by reimplementing the planning algorithms using the software library PyTorch (https://pytorch.org/), which guarantees the differentiability. The prior trajectory can correspond to a generic execution of the program component without considering the environment, i.e. in an artificial space with zero forces and under idealized robot kinematics and dynamics, starting from a given initial state. This strong prior can be combined with the component parameters to form an augmented input sequence for the neural network. The network can then be trained to predict the residual between the prior and posterior (i.e. actually measured) trajectory, as well as the probability of success of the component execution.
Adding the residual to the prior yields the expected posterior trajectory for this program component and the given component parameters. A simplification of the learning problem in the training of neural networks by the introduction of strong priors is established practice. The use of strong priors can reduce the need for training data by an order of magnitude. This effect is particularly noticeable in long trajectories or with strongly deterministic trajectories. The use of a differentiably implemented analytical generator as a strong prior is therefore particularly advantageous.
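The relation between prior, residual and posterior can be sketched in one dimension as follows. The constant-speed generator and the fixed residual values are hypothetical simplifications; in the described method the residual would be predicted by the trained neural network and the generator would cover full robot kinematics.

```python
def prior_trajectory(start: float, goal: float, velocity: float, dt: float) -> list:
    """Idealized straight-line motion at constant speed, sampled every dt seconds.

    One-dimensional for brevity; no forces and no environment are modeled.
    """
    distance = goal - start
    steps = max(1, round(abs(distance) / (velocity * dt)))
    return [start + distance * k / steps for k in range(steps + 1)]


def posterior_trajectory(prior: list, residual: list) -> list:
    """Expected trajectory: prior plus the predicted residual, per time step."""
    if len(prior) != len(residual):
        raise ValueError("prior and residual must have the same length")
    return [p + r for p, r in zip(prior, residual)]


prior = prior_trajectory(start=0.0, goal=0.1, velocity=0.05, dt=0.5)
# Hypothetical residual, e.g. lag caused by contact with the workpiece.
residual = [0.0, -0.001, -0.002, -0.004, -0.005]
posterior = posterior_trajectory(prior, residual)
print([round(x, 4) for x in posterior])
```

The network only has to learn the small correction in `residual`, while the bulk of the motion is supplied by the analytical generator.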
Advantageously, the target function can be defined in such a way that the target function maps a trajectory to a rational number and that the target function is differentiable with respect to the trajectory. The use of a consistent function signature for the target function allows the simple exchange of target functions without having to adapt the optimization algorithm. The proposed signature is sufficiently simple (target function as an evaluation of a trajectory with a numerical value), to ensure simple implementation, but at the same time allows the implementation of almost any target function. The differentiability of the target function with respect to the trajectory allows the use of gradient-based optimization methods for the parameter inference, which by examining the gradient information, converge in a goal-oriented manner in the direction of at least local optima and therefore converge much faster than non-gradient-based optimizers for many classes of target functions.
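To illustrate the signature described above, the following sketch implements path length as a target function that maps a trajectory to a single number, together with its gradient with respect to the trajectory points. Representing a trajectory as a list of coordinate tuples is an assumption of this example.

```python
import math


def path_length(trajectory: list) -> float:
    """Target function: sum of Euclidean distances between successive points."""
    return sum(math.dist(a, b) for a, b in zip(trajectory, trajectory[1:]))


def path_length_grad(trajectory: list) -> list:
    """Gradient of path_length with respect to every coordinate of every point."""
    grad = [[0.0] * len(p) for p in trajectory]
    for i, (a, b) in enumerate(zip(trajectory, trajectory[1:])):
        d = math.dist(a, b)
        if d == 0.0:
            continue  # coincident points: use the zero subgradient
        for k in range(len(a)):
            g = (b[k] - a[k]) / d
            grad[i + 1][k] += g  # derivative with respect to the end point
            grad[i][k] -= g      # derivative with respect to the start point
    return grad


trajectory = [(0.0, 0.0), (3.0, 4.0), (3.0, 8.0)]
print(path_length(trajectory))
print(path_length_grad(trajectory)[1])
```

Any other function with the same signature, for example cycle time or a force criterion, could be substituted without changing the optimizer that consumes the gradient.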
Advantageously, the target function can comprise a predefined function, a parametric function, and/or a neural network. Three types of target functions are thus possible. These three types can also be advantageously combined with one another in arbitrary ways. Predefined functions can relate to classical process parameters such as cycle time or path length, which output a variable to be minimized. Parametric functions can include predefined functions that have additional, possibly user-definable, parameters. Examples are distance functions to specified values such as contact forces, tightening torques, or Euclidean target poses. Neural networks can also be used as differentiable function approximators for complex target functions.
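The combination of a predefined and a parametric target function can be sketched as a weighted sum. The trajectory layout of (time, force) pairs, the desired force and the weights are hypothetical values chosen for this example.

```python
def cycle_time(trajectory: list) -> float:
    """Predefined target: duration between first and last sample, in seconds."""
    return trajectory[-1][0] - trajectory[0][0]


def force_deviation(desired: float):
    """Parametric target: mean squared distance of the force to a set value."""
    def target(trajectory: list) -> float:
        return sum((f - desired) ** 2 for _, f in trajectory) / len(trajectory)
    return target


def weighted_sum(*terms):
    """Combine several (weight, target function) pairs into one target."""
    def target(trajectory: list) -> float:
        return sum(w * fn(trajectory) for w, fn in terms)
    return target


# Hypothetical trajectory: (time in s, contact force in N) per sample.
trajectory = [(0.0, 4.0), (0.5, 6.0), (1.0, 8.0)]
combined = weighted_sum((1.0, cycle_time), (0.5, force_deviation(6.0)))
print(combined(trajectory))
```

Since every term is a smooth function of the trajectory values, the weighted sum remains differentiable with respect to the trajectory.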
Examples of simple target functions that map process parameters are cycle time, path length, and/or error rate. Other more complex types of target functions may include, for example, compliance with force limits, force minimization with simultaneous cycle time minimization, increase of precision (e.g. in stochastic position deviations of workpieces), minimization of torques, specification of particular force or torque curves, etc.
Advantageously, the target function may include a force measurement-based function. In this case, at least parts of the predicted trajectory are evaluated on the basis of the predicted forces and torques. This is particularly advantageous because the optimization of program parameters with respect to optimality criteria defined over forces is very difficult for human programmers, since the relationships between program parameters and the resulting forces during program execution are difficult for human beings to calculate or understand. Programs for manufacturing processes with critical contact or joining forces are often especially difficult to optimize for human programmers, since the forces applied cannot be systematically calculated by human beings and therefore any set of parameters found by a human programmer by testing different parameterizations will usually be suboptimal. The automatic optimization of program parameters can provide particular added value here.
Advantageously, a critical sub-sequence of the robot program can be selected using the interface for selecting one or more critical program components, wherein the critical sub-sequence comprises multiple critical program components. The component representatives of the multiple critical program components can be combined to form a differentiable overall system model. The overall system model maps the program parameters of the critical sub-sequence onto a combined trajectory, so that for a contiguous sub-sequence of critical program components the optimizable program parameters are optimized with respect to the target function. This enables a holistic parameter optimization. The program parameters of associated program component sequences can be optimized together. This offers added value compared to local optimization at the component level, since interactions with the environment are considered across component boundaries during the optimization. In particular, conflicting parameter configurations between program parts can be automatically balanced against each other. This is the case, for example, when increased speed of a movement reduces the probability of success of a subsequent movement, for example by creating vibrations or bending of pins during contact runs. This holistic approach to optimizing program parameters is of considerable advantage.
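How component representatives might be chained into an overall system model can be sketched as follows: the final state predicted for one component becomes the initial state of the next, and the partial trajectories are concatenated. The two toy representatives and their one-dimensional state are hypothetical stand-ins for trained models.

```python
def linear_motion(params: dict, state: float) -> list:
    """Toy representative: move to params['target'] in four equal steps."""
    target = params["target"]
    return [state + (target - state) * k / 4 for k in range(1, 5)]


def contact_run(params: dict, state: float) -> list:
    """Toy representative: advance by params['depth'] in two equal steps."""
    return [state + params["depth"] * k / 2 for k in range(1, 3)]


def overall_model(representatives: list):
    """Chain representatives into one model of the critical sub-sequence."""
    def model(param_list: list, initial_state: float) -> list:
        state, combined = initial_state, []
        for representative, params in zip(representatives, param_list):
            part = representative(params, state)
            combined.extend(part)
            state = part[-1]  # end state feeds the next component
        return combined
    return model


model = overall_model([linear_motion, contact_run])
combined = model([{"target": 1.0}, {"depth": 0.2}], initial_state=0.0)
print([round(x, 3) for x in combined])
```

Because the start of the second component depends on the parameters of the first, a target function evaluated on the combined trajectory couples the parameters across component boundaries.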
As part of an advantageous embodiment of a method according to the invention, input data, procedure, and output data can be specified as follows:
The result is a robot program with optimal or optimized parameters with respect to a specified target function.
Advantageous embodiments of the invention can provide a method and a corresponding system for the fully automatic inference of optimized program parameters for industrial robots, which allows robot programmers or plant workers during the programming, commissioning and maintenance phases of robot cells to optimize the parameters of complex robot programs in the presence of variable processes and workpieces with regard to cycle time and quality specifications automatically and in a data-driven manner. For this purpose, a method and/or system according to an exemplary embodiment of the invention comprises components or modules or units that are used for automated exploration of the parameter space, modeling, specification of target functions, and the inference of optimal/optimized parameters. A robot program with optimized parameters can be achieved and thus a robot cell with higher throughput, higher manufacturing quality or lower reject levels.
Advantageous embodiments of a method according to the invention or a system according to the invention can have one or more of the following advantages:
There are now various options for designing and further developing the teaching of the present invention in an advantageous manner. For this purpose, reference is made both to the claims subordinate to claim 1 and to the following explanation of preferred exemplary embodiments of the invention on the basis of the drawing. In connection with the explanation of the preferred exemplary embodiments of the invention based on the drawing, generally preferred embodiments and further developments of the teaching are also explained.
In the drawings
From a process point of view, the method according to an embodiment of the invention has different versions or possible applications in the programming, commissioning and maintenance phases of production plants or robot cells.
I. Defining the program structure: The robot programmer creates a robot program from parameterizable program components (motion templates), which map atomic movements of the robot. The robot program consists of a sequence of arbitrary force- or position-controlled program components. The sequence of program components maps the steps necessary to solve the application task.
The execution semantics of a component of the type “Linear motion” 3 or 7 (cf.
The execution semantics of “Contact run (relative)” 8 (cf.
II. Definition of the initial program parameters: A robot programmer can manually define the initial parameters of the program components using common methods (teach-in, CAD to Path, . . . ) to solve the application task approximately (possibly violating the specified cycle times and quality requirements).
III. Fine tuning of the parameters of relevant sub-programs: The robot programmer uses a method according to an exemplary embodiment of the invention for the automatic optimization of program parameters to meet cycle-time specifications and quality requirements.
III.a. Selection of critical sub-programs: The robot programmer selects critical sub-sequences of the program (i.e. critical sub-programs) or individual critical program components, the program parameters of which are to be optimized.
III.b. Selection of the parameters to be optimized: Depending on the environment and application task, certain program parameters of the critical sub-programs or the critical program components must be labeled as constants in order to ensure safety or quality requirements. This concerns, for example, target poses of movements in areas of the robot cell with restricted accessibility or lower or upper force limits of force-controlled movements. The designation of constant parameters is application-specific and requires domain knowledge, but in many cases can be determined already at the cell design stage using the CAD models of the cell, process simulation software, if used, and offline robot simulation software.
Program parameters of a program component can be either input (target poses, target forces, etc.) or intrinsic parameters (velocity, acceleration). Both parameter types can be optimized.
III.c. Definition of the domain for optimizable parameters: For each program parameter of the critical sub-programs or the critical program components that is to be optimized or optimizable (i.e. not constant), the robot programmer can select a permissible value range over which the parameter is to be optimized. This is application-specific and usually sufficiently narrow that all safety requirements on the manufacturing process as well as minimum quality and cycle-time requirements can be satisfied.
III.d. Exploration phase: An automatic stochastic exploration of the parameter space is carried out. The robot program is executed automatically N times (for example, 1000<N<10,000) under realistic conditions, but not yet in the production environment. For each execution, the program parameters to be optimized are sampled from their respective domains, for example by uniformly distributed sampling. During execution, the position and orientation of the tool center point (TCP) of the robot as well as the forces and torques occurring at the TCP are sampled at an arbitrary but fixed sampling interval (8 ms<Δt<100 ms) and stored in a database. In addition to the data of each executed program component, an ID with which the program component can be identified in the robot program as well as a status code are transferred from the robot to the database. The status code identifies whether the executed action was successfully completed, according to the semantics of the program component. Force-controlled runs to contact end successfully, for example, if contact has been established and the contact force is within a set tolerance range. In addition, the randomly generated program parameters are stored in the database and associated with the program execution. The sampling interval Δt is application-specific and can be specified by the programmer. Large sampling intervals reduce the amount of data to be processed and stored and simplify the learning problem, which reduces the number of necessary program executions (N), but they lead to aliasing and undersampling in high-frequency or vibrating processes. The number of program executions N is also application-specific and depends on the complexity and length of the robot movements, the (non-)linearity of the force and torque profiles during interactions of the robot with the environment, and the stochasticity of the process.
If workpiece variances are expected, workpieces of different batches should be used during the exploration phase in order to teach in the workpiece variances.
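The sampling step of the exploration phase can be illustrated by a minimal sketch. The parameter names, the value ranges and the functions `sample_parameters` and `explore` are hypothetical and chosen for illustration only; the execution on the robot and the recording of the trajectory are indicated merely by a comment.

```python
import random

# Hypothetical domains of the optimizable parameters: name -> (lower, upper).
DOMAINS = {
    "approach_velocity_m_s": (0.01, 0.10),
    "contact_force_n": (2.0, 8.0),
}


def sample_parameters(domains, rng):
    """Draw one parameter set, each value uniformly from its permitted range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in domains.items()}


def explore(domains, n_runs, seed=0):
    """Generate n_runs randomly parameterized program executions."""
    rng = random.Random(seed)
    runs = []
    for run_id in range(n_runs):
        params = sample_parameters(domains, rng)
        # On a real cell, the robot program would be executed here and the
        # sampled trajectory and status codes stored together with params.
        runs.append({"run_id": run_id, "params": params})
    return runs


runs = explore(DOMAINS, n_runs=1000)
```

Storing the drawn parameter set together with each run is what later allows the parameters to be associated with the recorded trajectory.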
III.e. Learning phase: The system models are automatically trained. For each program component of the critical sub-programs, on the basis of the previously collected parameter sets and trajectories a system model is learned which maps the component parameters to the expected positions and orientations of the TCP, the expected forces and torques as well as the expected status code. No user interaction is required for the training. The duration of the training depends on the number and complexity of the program components as well as on the number, length, and sampling characteristics of the trajectories in the training data set.
In this context, a “system model” can be defined as a mathematical function ƒ, which outputs the expected trajectory Ŷ given the input parameters x and the system state p. ƒ therefore implicitly includes the program logic (the translation of x into control commands for the robot by the robot program), the kinematics and dynamics of the robot, and the physical properties of the environment.
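In compact notation, the role of the system model in the subsequent optimization may be written as follows. The symbols for the loss function, the permissible domain D and the optimum x* are assumed notation that does not appear in the description; the third expression is the chain rule, which is what makes gradient-based optimization possible when ƒ is differentiable.

```latex
\hat{Y} = f(x, p), \qquad
x^{*} = \arg\min_{x \in D} \mathcal{L}\bigl(f(x, p)\bigr), \qquad
\frac{\partial \mathcal{L}}{\partial x}
  = \frac{\partial \mathcal{L}}{\partial \hat{Y}}\,
    \frac{\partial \hat{Y}}{\partial x}
```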
III.f. Specification of the target function: An arbitrary target function is defined, with respect to which the program parameters are to be optimized. Any target function is valid provided that it maps a trajectory to a real number and is differentiable with respect to the trajectory. Concave target functions simplify the optimization problem because they have only one (global) maximum, so that the result of the optimization is independent of the initial parameterization. For non-concave target functions with local maxima, the optimization is sensitive to the initial parameterization. Arbitrary target functions can be combined by weighted addition, although the addition can create local maxima. By using iterative Monte-Carlo methods, the convergence of the optimization to globally optimal parameter sets, given the correctness of the learned system model, can be asymptotically guaranteed. The specification of the target function is application-specific and may need to be carried out by an expert in the respective production domain. A gradient-based optimization method is used for the optimization, and the target function is expressed as a loss function for the equivalent minimization problem. Examples of simple loss functions are the cycle time, the path length in Cartesian or configuration space, or the error probability. Examples of more complex loss functions are the distance to one or more reference trajectories, for example from human demonstrations, or the deviation from specified contact forces at the end of a trajectory or during the execution of a program component. An initial target function can be automatically generated by inference over a knowledge base from the semantics of the components of the critical program parts and adjusted by the programmer using a graphical user interface.
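The weighted addition of simple loss functions can be illustrated by a minimal sketch. The trajectory format (a list of tuples of time and Cartesian position), the weights and the function names are assumptions for illustration. Plain Python is used here only to show the combination; in an implementation intended for gradient-based optimization, the same terms would be expressed with the differentiable operations of an automatic-differentiation library.

```python
import math


def cycle_time(trajectory):
    """Duration between the first and last sample; samples are (t, x, y, z)."""
    return trajectory[-1][0] - trajectory[0][0]


def path_length(trajectory):
    """Cartesian path length as the sum of distances between neighbors."""
    points = [sample[1:] for sample in trajectory]
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def combine(weighted_terms):
    """Build one loss function as a weighted sum of individual loss terms."""
    def loss(trajectory):
        return sum(weight * term(trajectory) for weight, term in weighted_terms)
    return loss


# Hypothetical weighting: cycle time counts fully, path length counts half.
loss = combine([(1.0, cycle_time), (0.5, path_length)])
trajectory = [(0.0, 0.0, 0.0, 0.0), (1.0, 3.0, 4.0, 0.0), (2.0, 3.0, 4.0, 12.0)]
value = loss(trajectory)  # 1.0 * 2.0 s + 0.5 * 17.0 = 10.5
```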
III.g. Inference phase: The system models are optimized automatically. For each critical sub-program, the learned system models of the associated program components are automatically combined to form an overall model, which maps the parameters of the sub-program to the combined sub-trajectory. A gradient-based optimization algorithm iteratively optimizes the program parameters with respect to the specified target function. The optimized program parameters are automatically transferred to the robot program.
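The combination of component models into an overall model of a sub-program can be illustrated by a minimal sketch. The two one-dimensional toy models and all names are hypothetical stand-ins for learned system models; the sketch assumes that the end state of one component serves as the start state of the next.

```python
def linear_motion(params, start):
    """Toy model: straight 1-D move from start to params['target'], 4 samples."""
    target = params["target"]
    return [start + (target - start) * k / 4 for k in range(1, 5)]


def contact_run(params, start):
    """Toy model: advance by params['depth'] from start, 2 samples."""
    return [start + params["depth"] * k / 2 for k in (1, 2)]


def compose(components):
    """Chain component models; each segment starts where the previous ended."""
    def overall(all_params, start):
        trajectory, state = [], start
        for name, model in components:
            segment = model(all_params[name], state)
            trajectory.extend(segment)
            state = segment[-1]
        return trajectory
    return overall


sub_program = compose([("move", linear_motion), ("probe", contact_run)])
trajectory = sub_program(
    {"move": {"target": 2.0}, "probe": {"depth": 0.5}}, start=0.0
)
```

Because the overall model is a composition of the component models, it remains differentiable with respect to all component parameters whenever each component model is differentiable.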
IV. Manual acceptance by the programmer/user: The robot programmer runs the optimized robot program repeatedly and ensures compliance with all safety, cycle-time and quality requirements. Quantitative, statistical methods may be used for the measurement and process parameters.
I. Adjustment of program parameters during ramp-up: Once the robot cell has been integrated into the rest of the production line, production usually starts with lower quality, reduced quantities, or higher reject rates. This is often due to minimal deviations in the environment, workpieces or structure compared with the programming phase. The usual practice is the manual, iterative adjustment of the program parameters in order to bring the process back within the specified cycle-time and quality limits. Existing tools for automatic process optimization or for tuning controller parameters only partially automate the optimization process and only for certain parameters or movements. Using a simplified version of the procedure described in A.III, the operator can adjust the parameters of the robot program fully automatically to suit the changed conditions. Steps A.III.a to A.III.c can be skipped, because the hyperparameters of the method set there are robust against stochastic changes in the system or environment. The number of training data samples required (cf. A.III.d) is a factor of 10-20 lower than in the programming phase, since the existing system models can be reconditioned to the changed environment using transfer learning methods. Step A.III.f can also be skipped in many cases if the cycle-time and quality specifications have not changed compared to the programming phase. Here, however, it is also possible to adapt the target function to the changed conditions in the plant.
I. Compensation of process and workpiece variances: During production runs, changes in the environment, the production plant or the workpieces may occur. If a manufacturer or batch is changed, components may have different surface or bending properties. In addition, the system behavior can change over the course of the operating time of the plant due to maintenance work on the plant, replacement of motors and sensors, or wear effects. Using a simplified version of the procedure described in A.III, the operator can adjust the parameters of the robot program fully automatically to suit the changed conditions. Steps A.III.a to A.III.c can be skipped, because the hyperparameters of the method set there are robust against stochastic changes in the system or environment. The number of training data samples required (cf. A.III.d) is a factor of 10-20 lower than in the programming phase, since the existing system models can be reconditioned to the changed environment using transfer learning methods. Step A.III.f can also be skipped if the cycle-time and quality specifications remain the same.
II. Adaptation to new target specifications: If, for example due to reconfigurations at other points on the production line, cycle-time specifications or quality requirements change, the operator can adapt the parameters of the robot program to the new specifications by executing steps A.III.f and A.III.g by specifying a corresponding target function. The existing system models remain valid and can be reused without retraining.
a. Robot cell 9 with six-axis industrial manipulator: It is assumed that it is possible to measure forces and torques at the TCP. An external force-torque sensor may be required for this.
b. Component-based graphical programming system 10 for programming and executing robot programs: For the creation of the initial robot program, its parameterization and execution on the robot controller, a software system with a graphical user interface is required which can process semi-symbolic robot programs, compile them into executable robot code and execute them on the robot controller.
c. Database 11 for robot programs and trajectories: In database 11 robot programs are stored in serialized form in a format that allows the reconstruction of the program structure and parameterization (execution sequence, type and unique IDs of the program components, constant and optimizable parameters of the program components). For each execution of the robot program, the database contains a sampled trajectory consisting of the position and orientation of the TCP, forces and torques on the TCP, and the status code of the program component belonging to the data point. The memory format is such that the associated program component and the parameterization of the program component can be uniquely assigned to each data point of a trajectory at the time of execution.
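A storage format with the described assignment property can be illustrated by a minimal sketch. The class and field names, the quaternion representation of the orientation and the example values are assumptions for illustration and do not describe the actual schema of database 11.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TrajectorySample:
    """One data point of a sampled trajectory."""
    t: float                                  # time since program start, s
    position: Tuple[float, float, float]      # TCP position
    orientation: Tuple[float, float, float, float]  # TCP orientation (quaternion, assumed)
    force: Tuple[float, float, float]         # forces at the TCP
    torque: Tuple[float, float, float]        # torques at the TCP
    component_id: str                         # unique ID of the program component
    status_code: int                          # status of that component


@dataclass
class ProgramExecution:
    """One execution of the robot program with its parameterization."""
    execution_id: int
    parameters: Dict[str, Dict[str, float]]   # component ID -> parameter values
    samples: List[TrajectorySample] = field(default_factory=list)

    def parameters_for(self, sample: TrajectorySample) -> Dict[str, float]:
        """Return the parameterization of the component that produced sample."""
        return self.parameters[sample.component_id]


sample = TrajectorySample(
    t=0.008, position=(0.4, 0.0, 0.3), orientation=(1.0, 0.0, 0.0, 0.0),
    force=(0.0, 0.0, -2.5), torque=(0.0, 0.0, 0.0),
    component_id="contact_run_8", status_code=0,
)
execution = ProgramExecution(
    execution_id=1,
    parameters={"contact_run_8": {"max_force_n": 5.0}},
    samples=[sample],
)
```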
d. Learning system 12 for differentiable component representatives: The learning system 12 transforms a serialized representation of the program structure of the critical sub-programs into a set of differentiable (parameter-optimized) motion primitives. Each differentiable motion primitive is a functionally equivalent analog (“representative”, “system model”) to a component instance from the sub-program, which maps the parameters of the component instance onto a trajectory expected during execution.
A component representative is defined as a system model at the component level or a model of the execution of the corresponding program component. A component representative for program component B is therefore a mathematical function ƒB which, given the input parameters xB of the program component and the system state p, outputs the expected trajectory ŶB that will result when the program component is executed on the robot. Component representatives are therefore mathematical models of the execution of program components. These models can be learned on the basis of training data and can be differentiated, i.e. they allow the calculation of the derivative of ŶB with respect to xB. This allows the optimization of xB with gradient-based optimization methods. Since all component representatives are differentiable models of the execution of program components, a program according to
e. Knowledge base or ontology 13 of component-specific sub-targets: In many cases, the target function for the parameter optimization contains sub-targets that result directly from the execution semantics of the component types. For example, a force-controlled contact run has an implicit contact target in a specified force range. These implicit sub-targets are stored in a knowledge base in the form of an ontology. At the time of the specification of the target function, reasoning over the ontology is used to create an initial target function from the given program structure, which maps these implicit sub-targets. This can be adapted by the user and supplemented by additional application-specific sub-targets. The use of ontologies or knowledge bases for automatic bootstrapping of target functions represents a major advantage.
An ontology is a structured representation of information with logical relations (a knowledge database), which makes it possible to draw logical conclusions (reasoning) from the information contained in the ontology using suitable processing algorithms.
Most ontologies follow the OWL standard (https://www.w3.org/OWL/). Examples of ontologies are BFO (https://basic-formal-ontology.org/) or LapOntoSPM (https://pubmed.ncbi.nlm.nih.gov/26062794/). The most common software framework for reasoning is HermiT (http://www.hermit-reasoner.com/). OWL and HermiT can be used in an exemplary implementation according to an exemplary embodiment.
In an exemplary reference implementation according to an exemplary embodiment of the invention, the developed ontology forms a “database for predefined target functions”, on which by reasoning from a given semi-symbolic robot program it is possible to automatically derive target functions which due to the fixed semantics of the program blocks must always be valid, for example, that a “Contact run (relative)” component should produce a contact force along the Z-axis of the tool coordinate system or that in a “linear motion” component the target point should be reached as precisely as possible. This reduces the task of specifying the target function for the user to the aspects of the target function that do not already follow from the semantics of the program components, but, for example, from the application (contact forces, speeds, . . . ) or for business-related reasons (minimization of the cycle time, . . . ).
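The derivation of implicit sub-targets from the program structure can be illustrated by a strongly simplified sketch. A plain dictionary lookup is used here as a stand-in for reasoning over an OWL ontology; the table contents paraphrase the two examples given above, and the component IDs and the component type "Gripper open" are hypothetical.

```python
# Stand-in for the knowledge base: component type -> implicit sub-targets.
IMPLICIT_SUBTARGETS = {
    "Contact run (relative)": [
        "produce a contact force along the Z-axis of the tool coordinate system",
    ],
    "Linear motion": [
        "reach the target point as precisely as possible",
    ],
}


def initial_subtargets(program):
    """Collect implicit sub-targets for a list of (component_id, type) pairs."""
    return [
        (component_id, goal)
        for component_id, component_type in program
        for goal in IMPLICIT_SUBTARGETS.get(component_type, [])
    ]


program = [
    ("c7", "Linear motion"),
    ("c8", "Contact run (relative)"),
    ("c9", "Gripper open"),  # no implicit sub-target stored for this type
]
subtargets = initial_subtargets(program)
```

The resulting list corresponds to the initial target function that the user may then adapt and supplement with application-specific sub-targets.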
f. System 14 for specifying differentiable target functions: Differentiable target functions are initially calculated in software by means of reasoning over the knowledge base of the component-specific sub-targets and can then be edited by the user using an interface if necessary. The resulting internal representation of the combined target function is then translated into a differentiable calculation graph of the loss function for the equivalent minimization problem.
Three types of target functions are possible and can be combined with one another as required:
g. Inference system 15 for optimal robot parameters: The inference system 15 forms an end-to-end optimizable calculation graph for each critical sub-program by considering the specified target function and the trained component representatives. On this graph, the inference algorithm calculates the optimal program parameters for the specified target function. This system is novel in its design and application in industrial robotics.
In the context of an exemplary embodiment of the invention, the following phases—namely exploration phase, learning phase and inference phase—can be executed and implemented, components of this exemplary embodiment being illustrated by
Automatic sampling of the parameter space: The automatic random sampling of parameter configurations (or the optimizable program parameters) from their respective domains was implemented in an exemplary reference implementation using the external programming interface of the ArtiMinds Robot Programming Suite.
Generating a learnable representative for each critical component: The core of a system according to the exemplary embodiment is a representation of program components which allows the gradient-based optimization of the parameters with respect to a target function. The problem of inferring optimal parameters is divided into a learning phase and an inference phase: in the learning phase, a model of the system (robot and environment during the execution of a component) is learned, and in the inference phase, a gradient-based optimization algorithm optimizes the input parameters of the component representative using the learned system model.
Component representatives map the component parameters to an expected trajectory and guarantee the differentiability of the output trajectory with respect to the component parameters. This mapping is realized by means of a recurrent neural network. Since long, finely sampled trajectories in particular contain a lot of redundant information and when using neural networks for prediction large sequence lengths significantly complicate the learning problem, an analytical trajectory generator is placed upstream of the neural network, which generates a prior trajectory (cf.
Adding the residual to the prior yields the expected posterior trajectory, which is output for this program component and the given component parameters. Simplifying the learning problem in the training of neural networks by introducing strong priors is established practice. Algorithmic priors can be defined both by the specific network structure (cf. R. Jonschkowski, D. Rastogi, and O. Brock, “Differentiable Particle Filters: End-to-End Learning with Algorithmic Priors,” arXiv:1805.11122 [cs, stat], May 2018, Accessed: Apr. 3, 2020. [Online]. Available at: http://arxiv.org/abs/1805.11122) and by representing the output values as parameters of predefined parametric probability distributions (cf. the use of Gaussian processes, for example, in M. Y. Seker, M. Imre, J. Piater, and E. Ugur, “Conditional Neural Movement Primitives”, p. 9, or of Gaussian mixtures in A. Graves, “Generating Sequences with Recurrent Neural Networks,” arXiv:1308.0850 [cs], June 2014, Accessed: Nov. 22, 2019. [Online]. Available at: http://arxiv.org/abs/1308.0850). In this case, aspects of the velocity profile, the coarse positioning in the working space in absolute coordinates as well as deterministically pre-planned movements are generated by the generator and no longer need to be learned. In the case of force-controlled spiral search movements, the problem is partially linearized, since the deterministic spiral shape does not have to be learned, but only the deviations of the real trajectory from the planned trajectory. The use of strong priors can reduce the need for training data by an order of magnitude. This effect is particularly noticeable for long trajectories or for strongly deterministic trajectories. When training a component representative for the force-controlled spiral search, the required amount of training data can be reduced by a factor of 20 as part of one exemplary embodiment.
The use of a differentiably implemented analytical generator as a strong prior is a considerable advantage.
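The decomposition into an analytically generated prior and a residual can be illustrated by a minimal one-dimensional sketch. The straight-line generator, the constant-offset residual and all names are hypothetical; in particular, `toy_residual` merely stands in for the trained recurrent neural network.

```python
def prior_linear_motion(start, target, n_samples):
    """Analytical prior: straight-line interpolation from start to target."""
    return [
        start + (target - start) * k / (n_samples - 1)
        for k in range(n_samples)
    ]


def toy_residual(prior, params):
    """Stand-in for the learned network: a constant deviation per sample."""
    return [params["offset"] for _ in prior]


def posterior(prior, residual_model, params):
    """Expected trajectory = prior trajectory + predicted residual."""
    residual = residual_model(prior, params)
    return [p + r for p, r in zip(prior, residual)]


prior = prior_linear_motion(start=0.0, target=1.0, n_samples=5)
expected = posterior(prior, toy_residual, {"offset": 0.5})
```

Only the residual has to be learned from data; the deterministic part of the movement is supplied by the generator.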
Training of the learnable representatives as system models for the sub-process encapsulated in the associated component:
Combination of the learned representatives into complete system models for each contiguous sequence of critical components:
The step size (learning rate, lr or λ) is a globally adjustable hyperparameter of the optimization algorithm, the choice of which depends on the application domain, limitations in the computation time for the optimization, and the desired convergence properties of the optimization method. For large values of λ, Adam converges faster, but with unfavorable combinations of target functions it can oscillate. For small values of λ, Adam converges more slowly, but oscillates much less and terminates closer to the global optimum. Depending on the application, the Adam optimization algorithm can be supplemented by mechanisms such as weight decay or learning rate scheduling in order to dynamically balance convergence and runtime. The autograd package of PyTorch is used to calculate the gradients (backpropagation). Apart from the optimizable input parameters of the components (optimizable_params), all other parameters (constant component parameters, but also the weights of the neural networks within the component representatives) remain constant.
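The influence of the step size can be illustrated by a minimal sketch of the Adam update rule for a single optimizable parameter. The quadratic toy loss with its hand-written gradient is an assumption that stands in for the loss evaluated on the learned system model; in the described embodiment the gradient would be obtained by automatic differentiation instead.

```python
import math


def adam_minimize(grad, x0, lr, steps, beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a function of one parameter with Adam, given its gradient."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g        # first-moment estimate
        v = beta2 * v + (1.0 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1.0 - beta1 ** t)           # bias-corrected moments
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


def toy_grad(x):
    """Gradient of the toy loss (x - 3)^2, whose minimum is at x = 3."""
    return 2.0 * (x - 3.0)


x_opt = adam_minimize(toy_grad, x0=0.0, lr=0.1, steps=500)
# With the same budget of 100 steps, the larger step size gets closer to x = 3.
x_fast = adam_minimize(toy_grad, x0=0.0, lr=0.1, steps=100)
x_slow = adam_minimize(toy_grad, x0=0.0, lr=0.01, steps=100)
```

Only `x` is updated in the loop; everything else is held fixed, which mirrors the statement that all parameters other than the optimizable input parameters remain constant.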
The network maps inputs (left) to outputs (right).
Inputs:
Outputs:
From left to right, the following function is performed:
The training is particularly effective when the network is trained on batches of training data in parallel on a graphics card (GPU). The batch dimension has been omitted in
With regard to further advantageous configurations of the method according to the invention and the system according to the invention, reference is made to the general part of the description and to the attached claims in order to avoid repetition.
Finally, it should be expressly pointed out that the above described exemplary embodiments of the method according to the invention and the system according to the invention serve only to elucidate the claimed teaching, but do not restrict it to the exemplary embodiments.
| Number | Date | Country | Kind |
|---|---|---|---|
| 10 2020 209 511.6 | Jul 2020 | DE | national |

| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/DE2021/200076 | 6/4/2021 | WO | |