The disclosure relates to an operating method for an autonomously operatable device. Furthermore, the disclosure relates to a device that is autonomously operatable and is operated in particular according to the operating method.
Devices that are operatable autonomously are used and are to be used increasingly in the future, for example to replace an operator of the device or to be able to at least temporarily relieve them. For example, such a device that is operatable autonomously can be vehicles, for example passenger cars, aircraft, trucks, or also production robots. In order to be able to operate such devices autonomously as flexibly as possible, they are usually controlled by machine learning algorithms. These algorithms are optionally learned (“trained”) during a learning or training phase in order to achieve the desired results in as many different situations as possible, i.e., to make appropriate decisions that the user considers to be correct. In this case, the “learning” of the algorithms is usually completed before the device is actually put into operation. Optionally, algorithms can also be used that (continue to) learn during real operation.
A disadvantage of these machine learning algorithms, however, is that the decisions made are of a probabilistic nature, i.e., based on probabilities. Since it is therefore difficult to estimate in which situations such machine learning algorithms make wrong decisions, their use for safety-critical applications, for example autonomous driving of vehicles in road traffic, is problematic.
DE 10 2016 009 655 A1 discloses, for example, to use two machine learning algorithms, in particular two neural networks, in a vehicle in order to generate two separate decisions relating to the operation of the vehicle. These are then compared with one another and, in the event of unequal decisions, one of the two decisions is checked with regard to compliance with ethical and/or security criteria. If this decision fulfills these criteria, the procedure continues according to this decision; otherwise the other decision is chosen.
The disclosure is based on the object of allowing the most secure possible autonomous operation of a device.
This object is achieved according to the disclosure by an operating method for an autonomously operatable device with the features in accordance with claim 1. Furthermore, this object is achieved according to the disclosure by an autonomously operatable device having the features in accordance with claim 9. Further advantageous and in part inventive embodiments and developments of the disclosure are set out in the subclaims and the following description.
The operating method according to the disclosure is used to operate an autonomously operatable device. According to the method, in an autonomous operating mode of the device, sensor data relating to a current surroundings condition of the device are detected using at least one sensor device assigned in particular to the device. These sensor data are supplied to a control algorithm that is implemented as a machine learning algorithm and which learns in a self-contained manner. The control algorithm estimates the current surroundings condition of the device on the basis of the sensor data. The control algorithm preferably analyzes and classifies the surroundings condition. The control algorithm then makes a control decision as a result inferred from the assessment (i.e., in particular the analysis and classification) (which is directed in particular to the further operation of the device). Using a monitoring algorithm that is independent of the control algorithm, a quality (hereinafter also referred to as “result quality”) relating to the control decision is ascertained. Depending on the ascertained result quality, the device is then operated according to the control decision (in particular is continued to be operated) or the control decision is rejected and the device is set to a secure operating state.
Thus, in the intended autonomous operating mode, two independent algorithms are used, the algorithm used to monitor the other algorithm being designed in such a way that it does not output a result that is parallel to the other algorithm (in particular that is directed towards the same target), but said algorithm preferably only ascertains how high the probability is that the result ascertained by the other algorithm is “reliable” or “correct.”
“Learns in a self-contained manner” is understood in this case and in the following in particular to mean that the machine learning control algorithm no longer learns in the intended, i.e., real or actual, autonomous operation. Its “decision parameters” are therefore unchangeable or fixed in actual operation; i.e., after the training or learning phase.
Because the control algorithm and the monitoring algorithm are designed independently of one another and are aimed at different goals (namely the finding of a control decision and the assessment of the result quality), a comparatively secure decision can advantageously be made relating to the further operation of the device. This is because the monitoring algorithm is preferably not designed to check the measure contained in the control decision, but rather to output, independently of the measure contained, in particular a probability as to whether the result containing the control decision can be fundamentally correct. Because the control algorithm learns in a self-contained manner, the operating method can also be carried out in a comparatively resource-conserving manner, in particular with regard to storage and computing capacity, since resources for an ongoing learning process can be saved. In addition, an algorithm which learns in a self-contained manner can be monitored comparatively easily, since its behavior cannot change in an unforeseen way due to a continued learning process.
In the scope of the operating method, the decision regarding the implementation of the control decision or the transfer of the device to the safe operating state is preferably made by means of a particularly deterministic “decision algorithm.” This is preferably implemented independently of the control and monitoring algorithms. In a simple variant, this decision algorithm carries out, in particular, a threshold value comparison of the ascertained result quality and, if the value falls below the threshold value, initiates the secure operating state.
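The deterministic decision step described above can be illustrated as a plain threshold comparison. The following is a minimal sketch under assumed names and an assumed threshold value; neither the identifiers nor the numeric threshold are taken from the disclosure.

```python
# Illustrative sketch of the deterministic decision algorithm: a simple
# threshold comparison on the ascertained result quality. The threshold
# value and all names are assumptions for illustration only.

QUALITY_THRESHOLD = 0.8  # assumed minimum acceptable result quality

def decide(control_decision: str, result_quality: float) -> str:
    """Return the control decision if the quality suffices; otherwise
    initiate the secure operating state."""
    if result_quality < QUALITY_THRESHOLD:
        return "SECURE_OPERATING_STATE"
    return control_decision
```

Because this step contains no learned parameters, its behavior is fully predictable, which matches the "particularly deterministic" character the text requires.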
In a preferred variant of the method, a machine learning algorithm, but preferably one which learns in a self-contained manner, is used as the monitoring algorithm. This in turn contributes to the conservation of resources described above. Furthermore, the already learned behavior cannot be changed by a continued learning process, which could lead to unforeseen results.
In an optional variant of the method, for example, at least one camera—in particular pointing in the direction of travel—is used as the sensor device for optical detection of the sensor data. Additionally or alternatively, radar sensors and/or other proximity sensors, optionally also pressure sensors, are used for detecting the sensor data.
In a further preferred variant of the method, the monitoring algorithm is learned to recognize whether the current surroundings condition of the device is contained in learning data (also: “training data”) of the control algorithm and to infer the result quality therefrom (i.e., in particular a probable flaw of the control decision). In the intended, autonomous operating mode of the device, the monitoring algorithm thus ascertains whether the current surroundings condition of the device is included in the training data, and infers the result quality therefrom. The monitoring algorithm preferably reduces the result quality (in particular its value) if the surroundings condition characterized by the sensor data (in particular the “scenario” resulting therefrom) is unknown; i.e., not contained in the learning data of the control algorithm. One assessment criterion for the result quality is therefore the familiarity with the surroundings condition. If the current surroundings condition is not adequately represented by the learning data and thus cannot be sufficiently derived from the learning data (or: “training scenarios”) by the control algorithm (or if the surroundings condition is not sufficiently comparable with said scenarios), the monitoring algorithm thus infers that the control algorithm may in specific circumstances come to a wrong decision (and thus not to a correct decision with sufficient security). In this case, the result quality is reduced.
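One simple way to realize the familiarity criterion described above is to measure how far the current surroundings condition lies from the nearest training scenario. The sketch below assumes surroundings conditions can be represented as small feature vectors and uses a Euclidean distance bound; the vectors, the metric, the bound, and the quality values are all illustrative assumptions.

```python
import math

# Illustrative sketch: assess whether the current surroundings condition
# is "known", i.e., adequately represented in the learning data of the
# control algorithm, via distance to the nearest training scenario.
# Feature vectors, the distance bound, and quality values are assumptions.

TRAINING_SCENARIOS = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.5)]  # assumed learned scenarios
FAMILIARITY_BOUND = 0.5  # assumed maximum distance still counted as "known"

def result_quality(current: tuple) -> float:
    """High result quality if the current scenario is close to a known
    training scenario, reduced quality otherwise."""
    nearest = min(math.dist(current, s) for s in TRAINING_SCENARIOS)
    return 1.0 if nearest <= FAMILIARITY_BOUND else 0.2
```

In practice the monitoring algorithm would itself be learned rather than use a fixed distance table, but the sketch captures the criterion: unknown scenario, reduced result quality.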
In an expedient variant of the method, additional or alternative to the variant described above, the monitoring algorithm is learned to ascertain a measure for an occupancy of a system resource of a first controller on which the control algorithm is processed and to infer the result quality from this measure. In the intended, autonomous operating mode of the device, the monitoring algorithm thus ascertains this measure and infers the result quality therefrom. As a measure for the system resource of the controller, for example, a computing time is ascertained for which the control algorithm occupies the controller, in particular a microprocessor of the controller, i.e., how long the control algorithm needs to calculate (i.e., to make) the control decision. If the computing time exceeds a period of time of 100 milliseconds that is usual for making the control decision, the monitoring algorithm infers that an unusual situation (also referred to as an "exceptional situation") is present and accordingly reduces the result quality. The occupancy of a main memory (in particular a portion of it) of the first controller is ascertained as an additional or alternative measure. If the control algorithm occupies a comparatively large portion of the available main memory (in particular in comparison to usual calculation processes), this also indicates an exceptional situation. Optionally, the monitoring algorithm also uses the sensor data supplied to the control algorithm, in particular to assess whether the sensor data have changed so slightly compared to a previous situation (i.e., in particular compared to the previous control decision) that increased resource expenditure is not to be expected. The monitoring algorithm is thus optionally learned to estimate the system resources that are likely to be requested or occupied by the control algorithm and, on this basis, to assess the actual degree of occupancy.
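The resource-occupancy criterion can be sketched as follows. The 100 millisecond computing-time budget follows the text; the memory fraction limit and the quality values are assumptions for illustration.

```python
# Illustrative sketch: infer the result quality from the occupancy of
# system resources of the controller processing the control algorithm.
# The 100 ms budget is taken from the text; the memory limit and the
# quality values are assumptions.

TIME_BUDGET_MS = 100.0       # usual period for making the control decision
MEMORY_FRACTION_LIMIT = 0.8  # assumed usual share of occupied main memory

def quality_from_resources(computing_time_ms: float, memory_fraction: float) -> float:
    """Reduce the result quality when the control algorithm exceeds its
    usual computing time or occupies an unusually large memory share,
    both of which indicate an exceptional situation."""
    if computing_time_ms > TIME_BUDGET_MS or memory_fraction > MEMORY_FRACTION_LIMIT:
        return 0.2  # exceptional situation: reduced result quality
    return 1.0
```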
In an expedient method variant of the operating method, which also represents an independent disclosure, the sensor data relating to the current surroundings condition are first detected using the at least one sensor device and supplied to the control algorithm. The control algorithm then uses the sensor data to assess the current surroundings condition and makes the control decision. A decision methodology of the control algorithm is selected in this case in such a way that a course of individual decisions leading to the control decision is disclosed; i.e., in particular, can be comprehended. During the learning phase of the control algorithm, the respective output control decision is checked for correctness (in particular for conformity with the surroundings situation selected and supplied for learning). For example, it is checked whether an obstacle, in particular, is correctly recognized and whether a corresponding decision is made to prevent a collision (namely the control decision); for example, slowing down the current movement, an evasive maneuver or the like. In the event of an error—i.e., if no corresponding (expected) decision is made—the course of the individual decisions is examined for input from parts of the sensor data (for example, individual data points) forming the basis of the wrong decision of the control algorithm. Such error-related parts of the sensor data are then filtered in the (in particular real) autonomous operating mode, i.e., preferably not supplied to the control algorithm or not taken into consideration by it. This variant of the method is basically independent of the monitoring algorithm described above.
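The filtering of error-related parts of the sensor data can be sketched as a lookup against a set recorded during the learning phase. The representation of sensor data as keyed samples and the key names are assumptions for illustration.

```python
# Illustrative sketch: during the learning phase, parts of the sensor
# data identified as the basis of wrong decisions are recorded; in the
# autonomous operating mode they are filtered out before the sensor
# data reach the control algorithm. The keyed-sample representation and
# the key names are assumptions.

ERROR_RELATED_KEYS = {"reflection_artifact", "saturated_pixel"}  # assumed, from learning phase

def filter_sensor_data(sensor_data: dict) -> dict:
    """Drop parts of the sensor data known to lead to wrong decisions."""
    return {k: v for k, v in sensor_data.items() if k not in ERROR_RELATED_KEYS}
```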
In a further expedient variant of the method, the monitoring algorithm is learned to recognize in particular whether the sensor data supplied to the control algorithm form at least partially (for example in the form of individual data points) a basis for a wrong decision by the control algorithm. In particular, this variant of the method is at least partially combined with the above method variant in that, as described above, the error-related parts of the sensor data are identified during the learning phase of the control algorithm and supplied to the monitoring algorithm for training with regard to the estimation of the result quality. In other words, the monitoring algorithm is preferably (optionally additionally) trained to recognize these error-related parts of the sensor data, and is thus designed to use the detection of such error-related parts of the sensor data to infer the result quality, in particular to reduce the result quality if such parts of the sensor data which are known to be leading to errors can be identified.
In a preferred variant of the method, the monitoring algorithm is implemented by a model that is different from the control algorithm. This means that the monitoring algorithm is based on a different machine learning algorithm, in particular a different “learning method.” For example, a decision tree model (for example “boosted decision tree,” “decision forests,” or “random forests”) is used for the control algorithm and a neural network or the like is used for the monitoring algorithm. This further promotes the independence of the two algorithms from one another. The development of the control algorithm by the decision tree model is also advantageous with regard to the disclosure of the course of the decisions described above. In this case, the course of the individual decisions can be comprehended particularly easily using the respective intermediate steps.
In an expedient variant of the method, the device has the above-mentioned first controller and a second controller which is independent of this and which are preferably designed independently of one another in terms of hardware. The control algorithm is processed by the first controller and the monitoring algorithm is processed by the second controller. This allows the two algorithms to be processed separately in a particularly simple manner and advantageously prevents mutual interference. In addition, a simultaneous influencing of both algorithms by a hardware error in a shared controller is avoided. Optionally, the device also has a third (independent) controller on which the decision algorithm described above is implemented and processed.
In an optional variant of the method, a further machine learning algorithm ("additional algorithm"), preferably one which learns in a self-contained manner, is used, in particular for the case that the result quality is assessed by the monitoring algorithm as bad, i.e., particularly low. For example, this additional algorithm is used to find an alternative control decision based on the sensor data. In this case, a further monitoring algorithm (in particular analogous to the "first" monitoring algorithm described above) is preferably processed in order to ascertain the result quality of the additional algorithm or of a combination of the control algorithm and the additional algorithm.
At least one further (in particular independently implemented) monitoring algorithm is optionally used to ascertain the result quality. Optionally, the respective ascertained result qualities are averaged. In this way, the accuracy of the ascertained result quality can advantageously be increased.
In a preferred variant of the method, the device is a motor vehicle, in particular a passenger motor vehicle. The motor vehicle is set to the secure operating state in that a reduction in the driving speed is preferably initiated. Optionally, the driving speed is “only” reduced and, if necessary, the control is transferred to a driver of the vehicle; preferably in response to a corresponding warning. Alternatively (or, if necessary, if the driver does not take over control), the driving speed is reduced until the motor vehicle stops and the motor vehicle is parked if necessary (on a motorway, for example on a hard shoulder).
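The staged transfer to the secure operating state for a motor vehicle, as described above, can be sketched as follows: hand over to the driver if control is taken, otherwise reduce the driving speed step by step until the vehicle stops and is parked. The state names and the deceleration step size are assumptions.

```python
# Illustrative sketch of the staged fallback for a motor vehicle:
# warn/hand over to the driver if possible, otherwise reduce the driving
# speed until standstill and park. State names and the speed reduction
# step are assumptions, not values from the disclosure.

def secure_state_step(speed_kmh: float, driver_took_over: bool) -> tuple:
    """One step of the transfer to the secure operating state."""
    if driver_took_over:
        return speed_kmh, "MANUAL_CONTROL"
    reduced = max(0.0, speed_kmh - 10.0)  # assumed deceleration step per cycle
    return reduced, ("PARKED" if reduced == 0.0 else "SLOWING")
```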
The autonomously operatable device according to the disclosure, which is formed in particular by the motor vehicle (alternatively, for example, by another vehicle or by an industrial robot), comprises the at least one sensor device for detecting the sensor data relating to the current surroundings condition of the device. In addition, the device comprises an operating controller which is designed to carry out the operating method described above, in particular automatically. In other words, the operating controller is designed to feed the sensor data to the control algorithm, to assess the current surroundings condition of the device using the control algorithm on the basis of the sensor data and to make the control decision. Furthermore, the operating controller is designed to ascertain the result quality of the control decision using the monitoring algorithm which is independent of the control algorithm and, depending on the ascertained result quality, to operate the device according to the control decision (in particular, to continue to operate) or to reject the control decision and to set the device to a secure operating state.
In a preferred embodiment, the operating controller has the first controller and the second controller that is independent of it and that are each designed separately from one another in terms of hardware. The control algorithm is implemented on the first controller and the monitoring algorithm is implemented on the second controller.
In a preferred embodiment, the operating controller is at least substantially formed by a microcontroller with a processor and a data storage, in which the functionality for performing the operating method according to the disclosure is implemented in the form of operating software (firmware). In this case, the operating method is carried out automatically when the operating software is executed in the microcontroller. Alternatively, the operating controller is formed by a non-programmable electronic component, for example an ASIC. In this case, the functionality for performing the operating method according to the disclosure is implemented in terms of circuitry. The first and second controllers and, if applicable, the third controller, are each designed as hardware-independent (sub)controllers of the operating controller and preferably analogous to the operating controller, in particular by a microcontroller, each with a processor and a data storage, on which the respective algorithms are implemented as mutually independent software components.
The conjunction “and/or” is to be understood in this case and in the following in particular in such a way that the features linked by means of this conjunction can be designed both together and as alternatives to one another.
Embodiments of the disclosure are explained in more detail below with reference to a drawing.
Corresponding parts (and variables) are always provided with the same reference signs in all drawings.
The operating controller 2 comprises a first controller 6 on which a control algorithm 8 is implemented so that it can run. Furthermore, the operating controller 2 comprises a second controller 10, on which a monitoring algorithm 12 is implemented so that it can run. In addition, the operating controller 2 comprises a third controller 14 on which a decision algorithm 16 is implemented so that it can run. The operating controller 2 further comprises an actuator controller 18 which is designed to control actuators of the motor vehicle 1, specifically a traction motor, brakes, and a steering of the motor vehicle 1.
In an autonomous (driving) operating mode, the operating controller 2 carries out the operating method shown in
The control algorithm 8 is learned (or: trained) to assess the surroundings situation, i.e., the surroundings condition of the motor vehicle 1, in a method step 30 on the basis of the sensor data D. In other words, the control algorithm 8 derives an "image" (or: "scenario") of the surroundings situation from the sensor data D, ascertains whether there are any obstacles located in the future movement path (indicated by a wall 32), and makes a corresponding control decision E on this basis.
In a method step 40, the monitoring algorithm 12 ascertains a result quality G of the control decision E. The result quality G reflects a probability of whether the control decision E is correct or incorrect. For this purpose, the operating controller 2 feeds the sensor data D to the second controller 10. In an optional variant of this embodiment, the control decision E is also supplied to the second controller 10. The monitoring algorithm 12 is designed according to a method that deviates from the control algorithm 8, in this case, specifically, as a neural network. The monitoring algorithm 12 has been completely trained to recognize whether the specific surroundings situation (the current scenario) that can be derived from the sensor data D can be mapped by the training data (training scenarios) by means of which the control algorithm 8 was trained. If this is the case, the monitoring algorithm 12 sets the result quality G to a high value, which indicates that there is a high probability that the control decision E is correct. However, if the training data does not map the surroundings situation characterized by the sensor data D, the monitoring algorithm 12 assumes that the control algorithm 8 cannot achieve a correct result with high reliability based on the sensor data D and accordingly sets the result quality G to a low value.
In an optional embodiment, the monitoring algorithm 12 ascertains the result quality G on the basis of occupied system resources, specifically the computing time and/or the occupied portion of the main memory of the first controller 6. If the computing time and/or the occupied portion of the main memory exceeds a specific value, the monitoring algorithm 12 sets the result quality G to a low value.
The result quality G and the control decision E are supplied to the third controller 14 and thus to the decision algorithm 16 in a method step 50. The decision algorithm 16 is designed to be deterministic and uses the result quality G to decide whether the control decision E should be carried out or whether the risk that the control decision E will lead to a critical situation is too high. In the latter case, the decision algorithm 16 decides to convert the motor vehicle 1 to a secure operating state, in that the driving speed is reduced and the motor vehicle 1 is parked. In both cases, the decision algorithm 16 outputs a control command B to the actuator controller 18. If the result quality G is high, this control command B contains the control decision E; otherwise, it contains corresponding instructions to establish the secure operating mode. In order to carry out the control command B, the actuator controller 18 translates the control command B into corresponding control signals S directed to the respective actuator.
The subject matter of the disclosure is not limited to the embodiment described above. Rather, further embodiments of the disclosure can be derived from the above description by a person skilled in the art.
Number | Date | Country | Kind |
---|---|---|---|
10 2018 206 712.0 | May 2018 | DE | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2019/061068 | 4/30/2019 | WO |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2019/211282 | 11/7/2019 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
9274525 | Ferguson et al. | Mar 2016 | B1 |
9569404 | Nakamura | Feb 2017 | B2 |
9606538 | Kopetz | Mar 2017 | B2 |
10055652 | Myers et al. | Aug 2018 | B2 |
10782700 | Kopetz | Sep 2020 | B2 |
10919524 | Poledna | Feb 2021 | B2 |
11340892 | Poledna | May 2022 | B2 |
11663370 | Heersink | May 2023 | B2 |
20130173767 | Nakamura | Jul 2013 | A1 |
20170369074 | Mathes et al. | Dec 2017 | A1 |
20180089563 | Redding et al. | Mar 2018 | A1 |
20190100195 | Rothhamel | Apr 2019 | A1 |
20210039669 | Watson | Feb 2021 | A1 |
20210237763 | Berger | Aug 2021 | A1 |
Number | Date | Country |
---|---|---|
102014018913 | Jun 2016 | DE |
102015007242 | Dec 2016 | DE |
102016009655 | Apr 2017 | DE |
102017105903 | Sep 2017 | DE |
102016205780 | Oct 2017 | DE |
102016207276 | Nov 2017 | DE |
Entry |
---|
International Preliminary Report on Patentability directed to related International Patent Application No. PCT/EP2019/061068, dated Nov. 3, 2020, with attached English-language translation; 20 pages. |
International Search Report and Written Opinion of the International Searching Authority directed to related International Patent Application No. PCT/EP2019/061068, dated Jul. 25, 2019, with attached English-language translation; 28 pages. |
Number | Date | Country | |
---|---|---|---|
20210237763 A1 | Aug 2021 | US |