The following relates to a method and configuration system for configuring a machine controller.
Complex machines, such as robots, motors, machine tools, turbines, 3D printers, production plants or motor vehicles, normally require a controller having a complex configuration for productive operation in order to optimize the performance of the controlled machine in a targeted manner. The performance to be optimized can involve, for example, output, yield, resource requirement, efficiency, precision, pollutant emission, stability, wear and/or other target parameters of the machine.
Machine controllers are frequently configured by experts during the design or commissioning of the machine to be controlled. However, such a configuration by experts is often relatively time-consuming. Furthermore, in many cases, additional effort is required in order to optimize the configuration.
Data-driven machine learning methods are increasingly used for automatically configuring machine controllers. A machine controller can be trained by learning methods of this type to derive those control actions which specifically effect a desired or otherwise optimum behavior of the machine from current operating signals of the machine. A multiplicity of known machine learning methods, in particular reinforcement learning methods, are available for this purpose.
In many cases, however, a configuration optimized by data-driven learning methods is barely comprehensible to or interpretable by experts in terms of its effect relationships. This frequently hinders validation or further development of configurations of this type.
An aspect relates to a method and a configuration system for configuring a machine controller, by which more readily interpretable configurations can be automatically created.
In order to configure a machine controller by an action execution tree specifying an execution of actions by a machine, predefined action patterns are read in. A multiplicity of action execution trees are further generated for the machine. The action execution trees can be, in particular, behavior trees. For a respectively generated action execution tree, a performance for controlling the machine on the basis of the respective action execution tree is determined. The predefined action patterns are further searched for in the respective action execution tree. An action pattern found in the respective action execution tree is then replaced at least in part by a reference to the predefined action pattern. A tree size of the thus modified action execution tree is further determined. Based on the generated action execution trees, an action execution tree that is optimized with a view to greater performance and smaller tree size is then determined by a numerical optimization method and is output in order to configure the machine controller.
A configuration system, a computer program product (non-transitory computer readable storage medium having instructions, which when executed by a processor, perform actions) and the non-transitory computer-readable, desirably non-volatile, storage medium are also provided in order to carry out the method according to embodiments of the invention.
The method according to embodiments of the invention and the configuration system according to embodiments of the invention can be executed or implemented, for example, by one or more computers, processors, application-specific integrated circuits (ASIC), digital signal processors (DSP) and/or field programmable gate arrays (FPGA).
Due to the at least partial replacement of found action patterns with references, an action execution tree can be made smaller in many cases. The complexity of the action execution tree can normally be reduced in this way. If the tree size of action execution trees is used along with the performance as an optimization criterion, the optimization can be driven in the direction of action execution trees which have predefined action patterns. In this way, an action execution tree or a configuration which normally offers high performance and also has low complexity can be determined. Configurations having low complexity are normally more readily interpretable and can therefore be more easily validated and/or further developed by experts.
According to an embodiment of the invention, the multiplicity of action execution trees can be generated at least partially on the basis of the action patterns. In particular, action execution trees can be composed at least partially from the action patterns. In this way, action execution trees having known and/or readily interpretable action patterns can be generated in a targeted manner.
According to an embodiment of the invention, a predefined initial action execution tree can be read in. The multiplicity of action execution trees can then be generated at least partially on the basis of the initial action execution tree. The initial action execution tree can be, in particular, an action execution tree created, tested and/or validated by an expert. In this way, expert specifications can be taken into account in determining an optimized action execution tree.
In addition, the initial action execution tree can be executed by an interpreter, wherein a reference sequence of machine actions specified by the initial action execution tree is determined. The generated action execution trees can then also be executed in each case by the interpreter, wherein a respective sequence of machine actions specified by the respective generated action execution tree is determined. A generated action execution tree of which the respective sequence does not match the reference sequence can therefore be rejected. The optimization can be restricted in this way to action execution trees which reproduce the reference sequence. If the initial action execution tree is selected in such a way that machine actions thereby specified, or a predefined selection of these machine actions, only comprise permissible or necessary target machine actions, it can be ensured in this way that the optimization remains restricted to action execution trees which meet predefined development objectives.
Predefined selection information can be read in, which selects the machine actions whose respective sequence, in particular the reference sequence, is to be determined. In this way, in particular, the machine actions which are intended for mandatory execution and to which the optimization is to be restricted can be selected from the machine actions specified by the initial action execution tree.
According to an embodiment of the invention, the numerical optimization method can be a genetic optimization method, by which the multiplicity of action execution trees are also generated. The genetic optimization method can be initialized, in particular, by the initial action execution tree. A fitness function which assigns a greater fitness to higher performance and/or to smaller tree size than to lower performance and/or to larger tree size can be provided for the genetic optimization method. In this way, the optimization can be driven in the direction of high-performing and/or less complex configurations. A multiplicity of efficient standard routines are available to carry out genetic optimization methods of this type.
According to an embodiment of the invention, a Pareto front can be determined for the multiplicity of action execution trees, wherein an increase in performance and a reduction in tree size are used as Pareto objective criteria. The optimized action execution tree can then be derived from action execution trees of the Pareto front. In particular, the optimized action execution tree can be selected or interpolated from action execution trees of the Pareto front. Whereas a selection can be carried out particularly simply, a plurality of advantageous characteristics of action execution trees can also be combined in an interpolation. If the Pareto front is determined with a view to performance and tree size, an optimized action execution tree can be determined which not only offers high performance but also has a small tree size and therefore a low complexity. In this context, a Pareto front is understood to mean a set of action execution trees of which the distance to a mathematically exact Pareto optimum falls, for example, below a given threshold value. Through a restriction to action execution trees of the Pareto front, a space of possible action execution trees is normally substantially restricted, wherein, in particular, non-optimal action execution trees are eliminated. The determination of the optimized action execution tree and possibly further optimizations can thus be substantially simplified.
Action execution trees of the Pareto front can be fed into the genetic optimization method. Alternatively or additionally, action execution trees not contained in the Pareto front can be rejected. In this way, the genetic optimization method can be enriched with advantageous action execution trees. Alternatively, the optimization method can be terminated following the determination of a Pareto front.
In order to determine the performance of a respective action execution tree, a simulation model of the machine, a data-driven model of the machine, the machine itself and/or a machine similar to it can further be controlled by the respective action execution tree and a performance of the machine resulting therefrom can be determined. A surrogate model of the machine which requires fewer computing resources than a complete physical simulation can be used as a simulation model.
In addition, a number of edges, a number of nodes and/or a tree depth of the modified action execution tree or a weighted combination of the number of edges, the number of nodes and/or the tree depth can be determined in order to determine the tree size. The above criteria allow a particularly simple evaluation of the tree size or the complexity of a respective action execution tree.
Some of the embodiments will be described in detail, with references to the following Figures, wherein like designations denote like members, wherein:
If the same or corresponding reference signs are used in the figures, these reference signs denote the same or corresponding entities which can, in particular, be implemented or designed as described in connection with the relevant figure.
The machine controller CTL is configurable in a computer-aided manner and can be implemented as part of the machine M or totally or partially outside the machine M. The machine controller CTL is intended to be configured by embodiments of the invention in such a way that the machine M is controlled in an optimum manner. Control is also understood to mean regulation, and also an output and use of data or control signals which are control-related, i.e. which contribute to the targeted influencing of the machine M.
In the present exemplary embodiment, the machine controller CTL is intended to be configured by a machine learning method, in particular a genetic optimization method, in such a way that an operation of the machine M is optimized depending on the captured operating signals BS. The term optimization is generally also understood to mean an approximation to an optimum. The performance of the machine M is intended to be increased.
A control behavior of the machine controller CTL is defined by its configuration. The configuration comprises, in particular, an action execution tree, in the form of a behavior tree, which is known to a person skilled in the conventional art and by which actions and, in particular, sequences of actions to be executed by the machine M are specified. An action execution tree is represented by a directed graph which is linked in the form of a tree and which is implemented as a data structure linked in the form of a tree. A multiplicity of efficient control behaviors are known and available for the interpretation and execution of an action execution tree or the machine actions thereby specified.
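By way of illustration only, an action execution tree in the form of a behavior tree could be held in a tree-linked data structure along the following lines. The node kinds "sequence" and "fallback", the `Node` class, the `tick` function and the action names are assumptions made for this sketch; they follow common behavior tree practice and are not prescribed by the present description.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    kind: str                                   # "sequence", "fallback" or "action"
    name: str = ""                              # action name, used by leaf nodes only
    children: List["Node"] = field(default_factory=list)


def tick(node: Node, succeeds: Dict[str, bool], log: List[str]) -> bool:
    """Execute a node and return True on success.

    `succeeds` maps an action name to its (hypothetical) outcome; actions
    not listed are assumed to succeed. Executed actions are appended to `log`.
    """
    if node.kind == "action":
        log.append(node.name)
        return succeeds.get(node.name, True)
    if node.kind == "sequence":                 # children run in order, stop at first failure
        return all(tick(c, succeeds, log) for c in node.children)
    if node.kind == "fallback":                 # children run in order, stop at first success
        return any(tick(c, succeeds, log) for c in node.children)
    raise ValueError(f"unknown node kind: {node.kind}")


tree = Node("sequence", children=[
    Node("fallback", children=[
        Node("action", "grip_soft"),
        Node("action", "grip_hard"),
    ]),
    Node("action", "lift"),
])

log: List[str] = []
ok = tick(tree, {"grip_soft": False}, log)
print(ok, log)
```

In this sketch the failed soft grip causes the fallback node to try the hard grip before the sequence continues with the lift action.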
As
The continuously captured operating signals BS of the machine M are transmitted by the sensor system S to the machine controller CTL. The operating signals BS can comprise information relating to operating states of the machine M, to positions or movements of components, to switching states, to control states, to control actions, to physical, chemical or electrical measured quantities and/or to other parameters relevant to the operation of the machine M.
Depending on the transmitted operating signals BS, control signals CS are generated by the machine controller CTL and are transmitted from the machine controller CTL to the machine M for the optimized control of the machine M. The control signals CS are generated depending on the optimized action execution tree BTO in such a way that the sequence of machine actions specified by the optimized action execution tree BTO is executed.
The machine controller CTL can form part of the configuration system KS or can be arranged totally or partially outside the configuration system KS. The configuration system KS and/or the machine controller CTL has/have one or more processors to carry out the method according to embodiments of the invention, and also one or more storage devices to store data to be processed. As already mentioned above, the machine controller CTL is intended to be configured by an optimized action execution tree BTO in such a way that the machine M is controlled in an optimized manner. The optimized action execution tree BTO is determined by the configuration system KS by a genetic optimization method.
Action patterns AP predefined by the configuration system KS and an initial action execution tree BTI in the form of a behavior tree are read in as a starting point for determining the optimized action execution tree BTO.
The action patterns AP can, in particular, specify sequences of machine actions and/or conditions for their execution. The action patterns AP are represented by partial graphs or partial trees of action execution trees or behavior trees. Idiomatic, easily interpretable, reliable, validated and/or permissible action patterns which have proven effective in performing partial control tasks can be predefined as action patterns AP. In the present exemplary embodiment, the action patterns AP are read in from a database DB or from an otherwise available action pattern library.
The initial action execution tree BTI is created by an expert USR according to technical specifications, requirements or other auxiliary technical conditions for controlling the machine M, and is fed into the configuration system KS. The initial action execution tree BTI specifies an execution of permissible and/or necessary machine actions, and therefore forms a reference for controlling the machine M or for a target configuration of the machine controller CTL. In this way, the initial action execution tree BTI to some extent encodes expert knowledge relating to the control of the machine M.
Selection information SI is further fed by the expert USR into the configuration system KS.
A genetic optimization method, inter alia, is used in the present exemplary embodiment to determine an optimized action execution tree BTO. Advantageous action execution trees are searched for within a space of possible action execution trees and are further optimized. For this purpose, a generator GEN of the configuration system KS generates, on an at least partially random basis, a multiplicity of action execution trees BT which in each case specify a possible execution of machine actions by the machine M.
In a respective iteration of the genetic optimization method, a fitness of the machine actions thereby specified is evaluated in each case for the generated action execution trees BT. Action execution trees BT having a higher fitness are used with a higher probability for generating further action execution trees BT in a subsequent iteration of the genetic optimization method. Action execution trees BT having a lower fitness are excluded accordingly with a higher probability and/or are replaced by newly generated action execution trees. In this way, action execution trees BT having a higher fitness are increasingly generated by the generator GEN. In the present exemplary embodiment, a performance for controlling the machine M on the basis of the action execution tree BT to be evaluated, a tree size of this action execution tree BT and/or a weighted combination of the performance and the tree size are determined in order to determine a respective fitness. In this way, the generation of the action execution trees is driven in the direction of greater performance and smaller tree size. A multiplicity of standard methods are available for carrying out a genetic optimization of this type. Alternatively or additionally, other machine learning methods can also be used for the optimization.
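The fitness-driven selection described above can be sketched as follows. The weighted fitness formula, the weights `w_perf` and `w_size`, the tournament selection scheme and the example values are assumptions chosen for illustration; the description only requires that higher performance and smaller tree size lead to a higher probability of reuse.

```python
import random
from typing import List, Tuple

Candidate = Tuple[str, float, float]   # (tree identifier, performance, tree size)


def fitness(performance: float, tree_size: float,
            w_perf: float = 1.0, w_size: float = 0.1) -> float:
    """Weighted combination: rewards performance, penalizes tree size."""
    return w_perf * performance - w_size * tree_size


def select_parents(population: List[Candidate], k: int,
                   rng: random.Random, tournament: int = 2) -> List[Candidate]:
    """Tournament selection: draw `tournament` distinct candidates, keep the fittest."""
    parents = []
    for _ in range(k):
        contenders = rng.sample(population, tournament)
        parents.append(max(contenders, key=lambda c: fitness(c[1], c[2])))
    return parents


population = [("bt_a", 10.0, 40.0), ("bt_b", 10.0, 12.0), ("bt_c", 4.0, 8.0)]
parents = select_parents(population, k=20, rng=random.Random(0))
print(sorted({p[0] for p in parents}))
```

With these example values the least fit tree `bt_c` loses every tournament and therefore never becomes a parent, whereas a fitness-proportional scheme would merely make its reuse unlikely.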
In order to generate advantageous and/or permissible action execution trees BT, the initial action execution tree BTI and the action patterns AP are fed into the generator GEN. The action execution trees BT are then generated by the generator GEN on the basis of the initial action execution tree BTI and with at least partial use of the action patterns AP.
The generated action execution trees BT are transmitted from the generator GEN to an interpreter IP of the configuration system KS. The initial action execution tree BTI and the selection information SI are also fed into the interpreter IP.
The interpreter IP serves to execute action execution trees, wherein a sequence of machine actions specified by the action execution tree concerned is in each case determined. The machine actions which are intended for mandatory execution according to the technical specifications are selected by the selection information SI. In this way, a reference sequence RSQ of machine actions which are intended for mandatory execution is determined by the interpreter IP for the initial action execution tree BTI. A sequence SQ of machine actions thereby specified is further determined by the interpreter IP for each generated action execution tree BT. The reference sequence RSQ, the sequences SQ and the associated action execution trees BT are transmitted to a filter F of the configuration system KS.
The filter F serves, in particular, to filter the generated action execution trees BT. The sequence SQ determined for a respective action execution tree BT is compared with the reference sequence RSQ for this purpose. If the sequence SQ does not match the reference sequence RSQ, this sequence SQ and the action execution tree BT concerned are rejected and are not forwarded by the filter F. Conversely, if the sequence SQ matches the reference sequence RSQ, this sequence SQ and the action execution tree BT concerned are forwarded by the filter F. In this way, it can be ensured that the optimization remains restricted to action execution trees BT which meet predefined development objectives.
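A minimal sketch of the interplay of the interpreter IP, the selection information SI and the filter F is given below. For brevity, trees are represented as nested tuples, the interpreter simply lists all actions depth-first (that is, it assumes that every inner node behaves like a sequence and every action succeeds), and the function and action names are hypothetical.

```python
from typing import List, Set, Tuple

Tree = Tuple[str, object]   # ("action", name) or ("sequence", [children])


def interpret(tree: Tree) -> List[str]:
    """Return the machine actions of a tree in depth-first execution order."""
    kind, payload = tree
    if kind == "action":
        return [payload]
    sequence: List[str] = []
    for child in payload:
        sequence.extend(interpret(child))
    return sequence


def selected_sequence(tree: Tree, selection: Set[str]) -> List[str]:
    """Keep only the actions marked as mandatory by the selection information."""
    return [a for a in interpret(tree) if a in selection]


def filter_trees(candidates: List[Tree], initial: Tree, selection: Set[str]) -> List[Tree]:
    """Forward only candidates whose selected sequence equals the reference sequence."""
    reference = selected_sequence(initial, selection)
    return [t for t in candidates if selected_sequence(t, selection) == reference]


initial = ("sequence", [("action", "open_valve"), ("action", "log"), ("action", "start_pump")])
selection = {"open_valve", "start_pump"}
cand_ok = ("sequence", [("sequence", [("action", "open_valve")]), ("action", "start_pump")])
cand_bad = ("sequence", [("action", "start_pump"), ("action", "open_valve")])

kept = filter_trees([cand_ok, cand_bad], initial, selection)
print(len(kept))
```

The candidate `cand_ok` is forwarded although its structure differs from the initial tree, because only the selected actions and their order are compared.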
The action execution trees BT forwarded by the filter F and the action patterns AP are fed into an action pattern search module APS of the configuration system KS. The action pattern search module APS searches for the action patterns AP in each of the action execution trees BT fed in. If no action pattern AP is found in a respective action execution tree BT, this action execution tree is forwarded unchanged by the action pattern search module APS. Conversely, if an action pattern AP is found in an action execution tree BT, this action execution tree BT is modified by the action pattern search module APS.
Such a modification of an action execution tree BT is illustrated by way of example in
For the sake of clarity, modified action execution trees are denoted below with the same reference sign BT as unchanged action execution trees.
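The search for action patterns and their replacement by references can be sketched as follows. The sketch assumes a nested-tuple tree representation, an exact match of whole subtrees, and a hypothetical reference node of the form `("ref", pattern_name)`; a partial replacement, which the description also permits, is not shown.

```python
from typing import Dict, Tuple

Tree = Tuple[str, object]   # ("action", name), ("ref", name) or (kind, [children])


def replace_patterns(tree: Tree, patterns: Dict[str, Tree]) -> Tree:
    """Replace every subtree equal to a predefined pattern by a reference node."""
    for name, pattern in patterns.items():
        if tree == pattern:
            return ("ref", name)
    kind, payload = tree
    if kind in ("action", "ref"):
        return tree
    return (kind, [replace_patterns(child, patterns) for child in payload])


def count_nodes(tree: Tree) -> int:
    """Count nodes; a reference counts as a single node."""
    kind, payload = tree
    if kind in ("action", "ref"):
        return 1
    return 1 + sum(count_nodes(child) for child in payload)


safe_grip = ("fallback", [("action", "grip_soft"), ("action", "grip_hard")])
patterns = {"safe_grip": safe_grip}

tree = ("sequence", [
    ("fallback", [("action", "grip_soft"), ("action", "grip_hard")]),
    ("action", "lift"),
    ("fallback", [("action", "grip_soft"), ("action", "grip_hard")]),
])
modified = replace_patterns(tree, patterns)
print(count_nodes(tree), count_nodes(modified))
```

In this example both occurrences of the pattern collapse into single reference nodes, so the node count drops from 8 to 4.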
As
The complexity evaluator EVC serves the purpose of quantifying a tree size BG for a respective action execution tree BT. For this purpose, nodes and/or edges of the respective action execution tree BT are counted by the complexity evaluator EVC and/or the tree depth of the action execution tree BT is determined and a tree size BG is derived therefrom. A sum, weighted in each case by predefined weighting factors, of the number of nodes, number of edges and/or tree depth can be determined in each case and output as the tree size BG. The determined tree size BG is transmitted from the complexity evaluator EVC to a Pareto optimizer PO of the configuration system KS.
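A tree size BG of this kind could, for instance, be computed as sketched below. The nested-tuple tree representation, the convention that a single node has depth 1, and the default weighting factors are assumptions of the sketch.

```python
from typing import Tuple

Tree = Tuple[str, object]   # leaf: (kind, name); inner node: (kind, [children])


def tree_stats(tree: Tree) -> Tuple[int, int, int]:
    """Return (number of nodes, number of edges, tree depth)."""
    _, payload = tree
    if not isinstance(payload, list):           # leaf node
        return 1, 0, 1
    nodes, edges, depth = 1, 0, 0
    for child in payload:
        n, e, d = tree_stats(child)
        nodes += n
        edges += e + 1                          # child's edges plus the edge to the child
        depth = max(depth, d)
    return nodes, edges, depth + 1


def tree_size(tree: Tree, w_nodes: float = 1.0,
              w_edges: float = 0.0, w_depth: float = 0.5) -> float:
    """Weighted sum of node count, edge count and depth."""
    n, e, d = tree_stats(tree)
    return w_nodes * n + w_edges * e + w_depth * d


tree = ("sequence", [
    ("fallback", [("action", "a"), ("action", "b")]),
    ("action", "c"),
])
print(tree_stats(tree), tree_size(tree))
```

For the example tree the sketch yields 5 nodes, 4 edges and a depth of 3, and hence a tree size of 6.5 with the assumed weights.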
In addition, the action execution trees BT forwarded by the filter F are fed into a performance evaluator EVP of the configuration system KS and into the Pareto optimizer PO. The sequences SQ forwarded by the filter F are further fed into the performance evaluator EVP.
The performance evaluator EVP serves the purpose of quantifying, for a respective action execution tree BT, a performance of the machine M controlled by this action execution tree BT. In this way, a control performance of a machine controller CTL configured by this action execution tree BT is evaluated to a certain degree.
The performance to be determined can relate, in particular, to output, yield, reference value compliance, clock rate, product quality, speed, time requirement, runtime, precision, error rate, resource consumption, effectiveness, efficiency, pollutant emission, stability, wear, service life, physical behavior, mechanical behavior, chemical behavior, electrical behavior, an auxiliary condition to be observed and/or other target parameters which are to be optimized for the machine M which is to be controlled.
For the respective action execution tree BT and, in particular, for the associated sequence SQ of machine actions, the performance evaluator EVP determines a respective performance value PV which quantifies the performance of the machine M executing the sequence SQ. The performance evaluator EVP has a simulation model SIM of the machine M for this purpose. A behavior of the machine M induced by the respective action execution tree BT or by the respective sequence SQ of machine actions is simulated by the simulation model SIM for a multiplicity of time steps. A cumulated reward or accumulated yield of the simulated behavior is measured. A surrogate model, for example a data-driven model of the machine M which normally requires fewer computing resources than a detailed physical simulation, is used as a simulation model. A cumulated reward determined in this way can be output as the resulting performance value PV.
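The determination of a performance value PV as a cumulated reward can be sketched as follows. The one-dimensional surrogate model, the action names, the target value and the rule that the machine holds its state after the sequence ends are purely hypothetical and stand in for a real simulation model SIM.

```python
from typing import List, Tuple

TARGET = 4.0                                    # assumed setpoint of the toy machine
EFFECTS = {"heat": 2.0, "cool": -1.0, "hold": 0.0}


def surrogate_step(state: float, action: str) -> Tuple[float, float]:
    """Toy surrogate model: apply the action, reward closeness to TARGET."""
    next_state = state + EFFECTS[action]
    reward = -abs(next_state - TARGET)
    return next_state, reward


def performance_value(sequence: List[str], initial_state: float = 0.0,
                      steps: int = 6) -> float:
    """Simulate `steps` time steps and return the cumulated reward."""
    state, total = initial_state, 0.0
    for t in range(steps):
        # After the sequence is exhausted, the machine is assumed to hold.
        action = sequence[t] if t < len(sequence) else "hold"
        state, reward = surrogate_step(state, action)
        total += reward
    return total


pv_a = performance_value(["heat", "heat"])
pv_b = performance_value(["heat", "cool", "heat", "heat"])
print(pv_a, pv_b)
```

Here the shorter sequence reaches the target directly and obtains the higher cumulated reward (-2.0 against -9.0).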
The performance value PV determined for the respective action execution tree BT is transmitted from the performance evaluator EVP to the Pareto optimizer PO.
The Pareto optimizer PO serves to perform a Pareto optimization and, in particular, to determine a Pareto front PF for the action execution trees BT evaluated by the complexity evaluator EVC and the performance evaluator EVP.
A Pareto optimization is a multi-criteria optimization in which a plurality of different objective criteria, referred to as Pareto objective criteria, are taken into account independently. A Pareto front PF is determined as a result of the Pareto optimization. A Pareto front of this type is also frequently referred to as a Pareto set. A Pareto front, here PF, comprises those solutions to a multi-criteria optimization problem in which one objective criterion cannot be improved without adversely affecting another objective criterion. A Pareto front therefore forms to some extent a set of optimum compromises. In particular, solutions not contained in the Pareto front can still be improved in terms of at least one objective criterion. Consequently, a multiplicity of solutions that are certainly not optimal are eliminated by a restriction to the Pareto front. Insofar as a Pareto front normally comprises only a very small part of a possible solution space, a subsequent selection effort or further optimization effort is substantially reduced through restriction to a Pareto front.
According to embodiments of the invention, a Pareto front PF is determined by the Pareto optimizer PO for the set of action execution trees BT, wherein an increase in performance and a reduction in tree size are used as Pareto objective criteria. A Pareto front PF should also be understood to mean a set of action execution trees of which the distance to a mathematically exact Pareto optimum, for example, falls below a given threshold value. The optimization is performed in the direction of greater performance and smaller tree size. A multiplicity of standard routines are available for Pareto optimizations of this type.
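A Pareto front over the two objective criteria can be determined, for a small set of candidates, by a direct dominance check as sketched below. The function names and example values are assumptions; the sketch computes the exact non-dominated set and does not implement the threshold-based relaxation mentioned above.

```python
from typing import Dict, Set, Tuple

Score = Tuple[float, float]   # (performance value PV, tree size BG)


def dominates(a: Score, b: Score) -> bool:
    """a dominates b if it is no worse in both criteria and better in at least one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    better = a[0] > b[0] or a[1] < b[1]
    return no_worse and better


def pareto_front(candidates: Dict[str, Score]) -> Set[str]:
    """Return the identifiers of all candidates not dominated by any other."""
    return {
        name for name, score in candidates.items()
        if not any(dominates(other, score) for other in candidates.values())
    }


candidates = {
    "bt_a": (10.0, 40.0),   # same performance as bt_b, but larger
    "bt_b": (10.0, 12.0),
    "bt_c": (4.0, 8.0),     # lower performance, but smallest tree
    "bt_d": (3.0, 9.0),     # worse than bt_c in both criteria
}
front = pareto_front(candidates)
print(sorted(front))
```

In this example `bt_b` and `bt_c` form the front: neither can be improved in one criterion without a loss in the other.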
In the present exemplary embodiment, a Pareto front PF is determined by the Pareto optimizer PO within the generated action execution trees BT depending on the transmitted tree sizes BG and performance values PV of the generated action execution trees BT. An optimized action execution tree BTO is then selected or interpolated from the resulting Pareto front PF. If necessary, predefined selection criteria, in particular one or more further optimization criteria, can also be applied in the selection or interpolation of the optimized action execution tree BTO.
By restricting the selection, interpolation or a subsequent optimization to the Pareto front PF, a space of action execution trees to be processed is normally substantially restricted, wherein, in particular, non-optimal action execution trees are eliminated. The selection or interpolation of the optimized action execution tree BTO or further optimizations are thus substantially simplified.
Action execution trees BT(PF) of the Pareto front PF can optionally be fed into the genetic optimization method in order to thus enrich the genetic optimization method with advantageous action execution trees. A corresponding feedback of action execution trees BT(PF) of the Pareto front PF from the Pareto optimizer PO to the generator GEN is indicated by a dotted arrow in
The optimized action execution tree BTO is normally output in order to configure the machine controller CTL and/or is transmitted directly to the machine controller CTL in order to configure the latter for optimized control of the machine M. The resulting configuration of the machine controller CTL normally results in a control behavior which simultaneously offers both high performance and low complexity. The last-named characteristic results in configurations that can usually be substantially more readily interpreted and therefore more easily validated and/or further developed by experts.
Although the present invention has been disclosed in the form of embodiments and variations thereon, it will be understood that numerous additional modifications and variations could be made thereto without departing from the scope of the invention.
For the sake of clarity, it is to be understood that the use of “a” or “an” throughout this application does not exclude a plurality, and “comprising” does not exclude other steps or elements.
Number | Date | Country | Kind |
---|---|---|---|
21186997.9 | Jul 2021 | EP | regional |
This application claims priority to PCT Application No. PCT/EP2022/068817, having a filing date of Jul. 7, 2022, which claims priority to European Application 21186997.9, having a filing date of Jul. 21, 2021, the entire contents of both of which are hereby incorporated by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2022/068817 | 7/7/2022 | WO |