The invention relates to a vehicle data system and a method for determining transmission-suitable vehicle data from an environment-sensing sensor and can be used, for example, in the context of enhancing systems for assisted or automated driving.
The prior art includes the training of networks for detection functions and for vehicle functions based on training data which are collected, for example, by test vehicles during development. This restricts the data to scenarios which have occurred during development. It is advantageous to collect data during the real operation of vehicles in order to ensure comprehensive coverage of road traffic scenarios. This makes it possible to select training data from a wide-ranging diversity of data.
Data acquisition with test vehicles covers only the scenarios encountered during development. However, further scenarios occur during the operation of production vehicles which are scarcely or not at all covered by the development scenarios. These are in particular edge cases, i.e., borderline cases, exceptional cases, individual cases or special cases which occur so infrequently that they are typically not acquired by test vehicles, or only occasionally. In order to develop artificial intelligence (AI) which also handles these edge cases, data acquisition during real operation is helpful. To reduce the amount of data to be transmitted for subsequent development, the data must be selected in the vehicles. Here it is advantageous to keep the computing effort for the data selection as low as possible, since the computing budget on the embedded systems in the vehicles is limited.
Consequently, a method is advantageous which requires little computing time in order to assess the relevance of road traffic scenarios to the development of AI algorithms for ADAS and AD in production vehicles.
A general method for obtaining training data from production vehicles is described in WO 2020/056331 A1 for individual sensors.
Located in the vehicle is an artificial neural network which evaluates sensor data. A trigger classifier is applied to an intermediate result of the neural network in order to ascertain a classifier score for the sensor data. Based at least in part on the classifier score, a decision is made whether to transmit at least a portion of the sensor data via a computer network. In the event of a positive decision, the sensor data are transmitted and used to generate training data.
It is an aspect of the present disclosure to provide an optimized possibility for efficiently collecting relevant data from vehicles.
One aspect relates to the identification of relevant samples and edge cases from vehicle fleets for data-driven algorithm development or the data-driven optimization of machine learning systems and methods.
A vehicle data system according to the present disclosure includes an environment detection system and a monitor system in the vehicle.
The environment detection system is configured to receive and evaluate input data X from an environment-sensing sensor. The input data X are evaluated by means of a first trained artificial neural network K, and environment detection data Y′ are output as the result of the evaluation.
The monitor system is configured to evaluate the same input data X from the environment-sensing sensor by means of a second trained artificial neural network KR, and reconstruction data X′ are output as the result of the evaluation.
If the monitor system establishes a deviation exceeding a threshold value between the reconstruction data X′ and the input data X (a potential edge case), the monitor system prompts a transmission of the input data X to a separate data unit. The separate data unit can, for example, be an external server, cloud storage or a backbone of a V2X system. V2X (vehicle-to-X) denotes a communication or telematics system in which the vehicle communicates with other participants.
In one exemplary embodiment, the second artificial neural network KR is an autoencoder.
The described “non-safety-critical” (monitoring) system is realized by an autoencoder. This is developed based on the same input data as the detection algorithm or function under consideration. Due to its functional principle, the autoencoder offers a significant advantage over other possible approaches.
This is illustrated using the example of image data as the sensor data:
If an image is given to the autoencoder, it tries to reconstruct the image in its output. Consequently, the autoencoder can be trained exclusively with the input signal, without using additional labels. On the one hand, this offers the advantage that there is no additional annotation outlay and, on the other hand, the decisive advantage that an error can be determined in a quantified manner at any time (even on unknown data) in relation to the input data. This allows suitable thresholding. The autoencoder can be applied to the entire image or to partial sections.
Consequently, the autoencoder can measure its own error with every possible input signal. In general, higher error values are to be expected with machine learning methods in input data which do not sufficiently match the training data. The fact that the autoencoder and the detection function are based on the same data results in the following advantage: unknown or uncertain scenarios identified by the autoencoder indicate that these scenarios are insufficiently contained in the training data and, consequently, are relevant to a wide-reaching coverage of traffic scenarios in function development.
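A minimal sketch of this thresholding principle, assuming PyTorch as the framework and a mean-squared-error metric (the disclosure fixes neither a concrete metric nor a threshold; `kr` stands for the trained autoencoder KR):

```python
import torch

def monitor_step(kr: torch.nn.Module, x: torch.Tensor, threshold: float):
    """Reconstruct the input data X with the autoencoder and decide whether
    the sample is a potential edge case that should be transmitted."""
    with torch.no_grad():
        x_prime = kr(x)                              # reconstruction data X'
    # Quantified error in relation to the input data, computable at any
    # time and without labels (here: mean squared error).
    score = torch.mean((x - x_prime) ** 2).item()
    return score, score > threshold                  # True -> prompt transmission
```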
According to one embodiment, the monitor system calculates a score which estimates a relevance of the input data X based on the deviation.
In one exemplary embodiment, the first artificial neural network K has been trained on the basis of predefined training data, wherein nominal output data Y_1, Y_2, . . . , Y_n have been used in each case for input data X_1, X_2, . . . , X_n. A first error function, which indicates the deviations between the outputs of the first neural network K for the input data X_1, X_2, . . . , X_n and the corresponding nominal output data Y_1, Y_2, . . . , Y_n, has been minimized by adjusting the weights of the first neural network K.
According to one embodiment, the second artificial neural network KR has been trained by adjusting weights of the second neural network KR, wherein a second error function has been minimized, which indicates the deviation between reconstruction data X′ and input data X from the environment-sensing sensor.
In one exemplary embodiment, in addition to the input data X from the environment-sensing sensor, meta information is transmitted. Meta information corresponds to one or more items of information from the following group: current software version, calculated scores of the monitor system, GPS data, date, time, vehicle identification number (VIN) and cloud data, which make it possible to reproduce the scene and/or the vehicle situation.
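By way of illustration, such a transmission payload could be assembled as follows (a hypothetical structure; the field names are examples, since the disclosure only lists the kinds of meta information to be included):

```python
from datetime import datetime, timezone

def build_payload(x, score, sw_version, gps, vin):
    """Bundle the input data X with meta information for transmission."""
    return {
        "input_data": x,                    # input data X (e.g., an image)
        "software_version": sw_version,     # current software version
        "monitor_score": score,             # calculated score of the monitor system
        "gps": gps,                         # GPS data, e.g., (lat, lon)
        "timestamp": datetime.now(timezone.utc).isoformat(),  # date and time
        "vin": vin,                         # vehicle identification number
    }
```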
Based on this, the scene can be precisely reconstructed during development, for example by simulation with the same software version. Thanks to this information gain, unknown scenarios and edge cases can be selected and can flow directly into the development process of the vehicle or environment detection function. As a result, a continual quality assurance process can be established: with each development step, more and more relevant data flow into the system.
Additionally, the degree of maturity of the software can be derived from the number of incoming data transmissions: the fewer data transmissions triggered by insufficiently accurate predictions, the higher the degree of maturity of the software.
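One conceivable way to operationalize this is to track the transmission rate per software version; the concrete formula below is a hypothetical illustration and is not prescribed by the disclosure:

```python
def maturity_indicator(num_transmissions: int, num_frames: int) -> float:
    """Heuristic maturity indicator: the fewer transmissions triggered per
    processed frame, the more mature the software version. Returns a value
    in [0, 1]; the formula is illustrative only."""
    if num_frames == 0:
        return 0.0
    return 1.0 - num_transmissions / num_frames
```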
According to one embodiment, the environment detection system is configured to receive input data X from a plurality of environment-sensing sensors and to evaluate these jointly.
This corresponds to a multi-sensor setup. The advantage of multi-sensor systems is that they increase the certainty of detection algorithms for road traffic by verifying the detections of multiple environment-sensing sensors. For example, multi-sensor systems can be any combination of:
In this case, a multi-sensor system includes at least two sensors. The data acquisition of one of these sensors s at time t can be designated with D_(s,t). The data D can be images and/or audio recordings, as well as measurements of the angle, distance, speed and reflections of objects in the environment.
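The notation D_(s,t) could be mirrored in code, for example, as a mapping keyed by sensor s and time t (an illustrative structure only; the sensor names and payloads are placeholders):

```python
# Hypothetical container for multi-sensor acquisitions D_(s,t):
# maps (sensor s, time t) to the acquired data D.
d = {}
d[("front_camera", 0.00)] = "image frame"              # placeholder payloads
d[("radar", 0.00)] = "angle/distance/speed targets"
d[("front_camera", 0.04)] = "image frame"

# Access the acquisition of sensor s at time t:
sample = d[("radar", 0.00)]
```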
In one exemplary embodiment, the monitor system is configured to process the input data X from the environment-sensing sensor in parallel with the environment detection system.
According to one embodiment, the monitor system is integrated in the environment detection system as an additional detector head. The monitor system and the environment detection system utilize a common encoder.
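A sketch of this shared-encoder variant, assuming PyTorch (the layer sizes, convolutional layout and detection head are placeholders, since the disclosure does not fix a concrete architecture):

```python
import torch
import torch.nn as nn

class SharedEncoderNet(nn.Module):
    """Common encoder with a detection head (environment detection system)
    and a decoder head acting as the monitor system's autoencoder."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.detector_head = nn.Sequential(          # outputs Y'
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes),
        )
        self.decoder_head = nn.Sequential(           # outputs X'
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 2, stride=2),
        )

    def forward(self, x):
        z = self.encoder(x)                          # shared features
        return self.detector_head(z), self.decoder_head(z)
```

This arrangement saves computing effort because the encoder features are computed once and reused by both heads.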
In one exemplary embodiment, the input data from the environment-sensing sensor are image data. The monitor system is configured to reconstruct the entire image or a partial section of the image. Additionally, it can estimate and output a value for the error of the reconstruction.
According to one embodiment, the monitor system is configured to ascertain and output an uncertainty measure. The uncertainty measure indicates how certain the monitor system is during the output of its reconstruction data X′.
In one exemplary embodiment, the monitor system is configured to take account of a temporal consistency of the reconstruction data X′.
According to one embodiment, a distinction is made as to whether a deviation between the reconstruction data X′ from the monitor system and the input data X occurs continually or only for limited periods of time.
In one exemplary embodiment, the environment detection system and the monitor system are configured in such a way that both are capable of being updated over the air.
A further subject-matter of the present disclosure relates to a method for determining transmission-suitable vehicle data from an environment-sensing sensor. The method includes the following steps:
Furthermore, the assessment of the relevance of traffic scenarios can be improved by taking account of multi-sensor setups, the temporal progression, and the estimation of the certainty with which a network predicts an output. In addition, a low computing-time outlay for embedded systems is advantageous.
These points are addressed, e.g., by choosing a monitor system which requires little computing effort, such as an autoencoder approach, and by extending the latter for multi-sensor systems to include confidence estimates and temporal verification.
Exemplary embodiments and figures are explained in greater detail below, wherein:
The data system 10 is electrically connected to at least one environment-sensing sensor 1, e.g., an image acquisition device, in a vehicle 2. The image acquisition device can be a front camera of a vehicle. The front camera serves as a sensor for acquiring the environment which lies in front of the vehicle. The environment of the vehicle 2 can be detected based on the signals or image data from the front camera. Based on the environment detection, ADAS or AD functions can be provided by an ADAS/AD control unit, e.g., lane recognition, lane departure warning system, traffic sign recognition, speed limit assistance, road user recognition, collision warning, emergency braking assistance, adaptive cruise control, construction site assistance, a highway pilot, a Cruising Chauffeur function and/or an autopilot.
The image acquisition device typically includes an optical system or a lens and an image acquisition sensor, e.g., a CMOS sensor.
The data or signals sensed by the environment-sensing sensor 1 are transmitted to an input interface 12 of the data system 10. The data are processed in the data system 10 by a data processor 14. The data processor 14 includes an environment detection system 16 and a monitor system 15. The environment detection system 16 can include a first artificial neural network, for example a CNN. In addition to pure detections, the environment detection system 16 can also create a more comprehensive understanding of the environment and situation, for example a prediction of trajectories of the ego vehicle 2 but also of other objects or road users in the environment of the vehicle 2. The detections of the environment detection system 16 can be relevant to safety because actions or warnings of an ADAS or AD system of the vehicle are dependent on them. On the other hand, the monitor system 15 is not safety-critical, since its main task is to monitor the environment detection system 16 and to decide whether data should be transmitted to a separate data unit 20 or not. The monitor system 15 can include a second artificial neural network, for example an autoencoder.
In order for the artificial neural networks to process the data in the vehicle in real time, the data system 10 or the data processor 14 can include one or more hardware accelerators for artificial neural networks.
If there is a deviation exceeding a threshold value, for example, between the detection of the monitor system 15 and the detection of the environment detection system 16, then the data system 10 carries out a wireless transmission of the data via an output interface 18 to a separate data unit 20 (cloud, backbone, infrastructure, etc.).
The classification system K classifies, e.g., objects on the basis of sensor data X from the environment-sensing sensor 1. In addition to the classification system K, a second independent and additional monitor system KR is introduced.
In order to train the classification system K which is, for example, a decision-tree learning system, a support vector machine, a learning system based on regression analysis, a Bayesian network, a neural network or a convolutional neural network, training input data X and training target values Y are provided. Output data Y′ are generated from the training input data X by means of the classification system K. Reconstruction data X′ which are similar to the training input data X are generated from the training input data X by means of the monitor system KR (autoencoder). The objective of the training of the classification system K is that the output data Y′ are as similar as possible to the training target values Y without carrying out overfitting.
To this end, the deviations still present between the output data Y′ and the training target values Y are determined by means of a first error function loss. These deviations are used, for example, to adjust the parameters of the classification system K via backpropagation. This is repeated until a predefined match has been attained or until signs of overfitting occur.
The monitor system KR is based on an autoencoder and therefore does not need any further annotations apart from the actual sensor data X. The objective of the training of the monitor system KR is that the reconstruction data X′ are as similar as possible to the training input data X. To this end, the deviations still present between the reconstruction data X′ and the training input data X are determined by means of a second error function loss2. These deviations are used, for example, to adjust the parameters of the monitor system KR via backpropagation. This is repeated until a predefined match has been attained or until signs of overfitting occur.
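A condensed sketch of the two training objectives, assuming PyTorch, a classification task for K, mean squared error for loss2 and pre-built optimizers (all of these are assumptions; the disclosure leaves the concrete error functions and optimization procedure open):

```python
import torch
import torch.nn.functional as F

def train_step(k, kr, opt_k, opt_kr, x, y):
    """One training step for the classification system K (supervised, loss)
    and the monitor system KR (self-supervised, loss2) on the same input X."""
    # K: minimize deviation between output data Y' and training target values Y
    y_prime = k(x)
    loss = F.cross_entropy(y_prime, y)      # first error function "loss"
    opt_k.zero_grad(); loss.backward(); opt_k.step()

    # KR: minimize deviation between reconstruction data X' and input data X
    x_prime = kr(x)
    loss2 = F.mse_loss(x_prime, x)          # second error function "loss2"
    opt_kr.zero_grad(); loss2.backward(); opt_kr.step()
    return loss.item(), loss2.item()
```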
The monitor system KR can also be trained, for example, following the training of the classification system K, wherein it must be noted that the same training input data X must be used.
The autoencoder or monitor system KR can compare its output with the original sensor data at any time and, by means of a metric, calculate a quantified error or a quantified uncertainty U. The deviation between the reconstruction data X′ and the input data X ascertained in this way quantifies the uncertainty U of the output data Y′.
The advantage of this principle is that the monitor system KR can itself measure the error in its reconstruction data X′ in relation to the input signal or the input data X in the application (inference) by means of the metric, and can do so at any time.
Since both the classification system K and the monitor system KR have been developed on the same data, both systems have comparable deficits. While the deficits in the output data Y′ of the classification system K can no longer be measured directly in the application, they can be inferred from the measurable deficits of the monitor system KR.
By way of example, let the input data X be image data of an image and the output data Y′ be classification data which correspond to objects depicted in the image. If it is now possible to generate a reconstructed image X′ by means of the monitor system KR, which is similar to the original image X, then this indicates that similar input data X were already present during the training of the monitor system KR and the classification system K so that the uncertainty U of the output data Y′ is small. If, however, the reconstructed image X′ is very different from the original image X, then this indicates that no similar input data X have been used to train the monitor system KR and the classification system K and, accordingly, the uncertainty U of the output data Y′ is large.
Consequently, a large uncertainty U is an indication that the incoming or input data could be an edge case.
A multiplicity of data which is not adequately represented in the training data can be identified with this system. Data which are not adequately represented can be caused by limitations of the environment-sensing sensor in particular driving situations, but also simply by unusual environment scenarios.
An example of such an unusual environment scenario is a modern work of art consisting of multiple traffic lights. These do not serve to regulate traffic. A camera-based traffic light recognition would be overwhelmed by this work of art and might wrongly assign traffic-regulating information to it.
Further aspects and exemplary embodiments are described below:
The transition from assisted to autonomous driving constitutes a technical hurdle. An autonomous system also has to master complex scenarios which have possibly not been covered by the simulation or test runs. In this case, the objective is that the surroundings sensing works reliably for vehicle functions at all times and in as many scenarios as possible, ideally in all scenarios.
In order to solve this problem, we propose an additional monitor system, that is to say a non-safety-critical system which automatically obtains these complex and relevant data for development from real road traffic. Since the “relevant” data can shift continually with each software version, it is moreover advantageous that the system is capable of being updated and additionally has versioning regarding the raw data. In detail, the appearance of the overall system is as follows:
In this case, the proposed monitor system considers a sensor setup for vehicles in the context of assisted and autonomous driving. This can optionally be extended to a multi-sensor setup. The advantage of multi-sensor systems is that they increase the certainty of detection algorithms for road traffic by verifying the detections of multiple sensors. In this case, multi-sensor systems can be any combination of:
In this case, a multi-sensor system consists of at least two sensors. The data acquisition of one of these sensors s at time t is designated below with D_(s,t). The data D can be images and/or audio recordings, as well as measurements of the angle, distance, speed and reflections of objects in the environment.
The described monitor system (or non-safety-critical system) is realized by an autoencoder. This is developed based on the same data as the detection algorithm or function under consideration. The autoencoder can be realized in different ways. One possibility is the parallel use of an autoencoder in addition to the detection algorithm; another possibility is the realization as an additional detector head, i.e., the use of a common encoder for detection and autoencoder. Use as a downstream system would also be conceivable, that is to say a monitor system which starts after the environment has been detected.
Thanks to its functional principle, the autoencoder offers a significant advantage over other approaches. This is illustrated using the example of image data:
If an image is given to the autoencoder, it tries to reproduce the image in its output. Consequently, the autoencoder can be trained exclusively with the input signal, without using additional labels. On the one hand, this offers the advantage that there is no additional annotation outlay and, on the other hand, the decisive advantage that an error can be determined in a quantified manner at any time (even on unknown data) in relation to the input data. This allows suitable thresholding. The autoencoder can be applied to the entire image or to partial sections.
Consequently, the autoencoder can measure its own error with every possible input signal. In general, higher error values are to be expected with machine learning methods in input data which do not sufficiently match the training data. The fact that the autoencoder and the detection function are based on the same data results in the following advantage: unknown or uncertain scenarios identified by the autoencoder indicate that these scenarios are not adequately contained in the training data and, consequently, are relevant to a wide-reaching coverage of traffic scenarios in function development.
In addition to the autoencoder's output, which estimates the reconstruction certainty of the network, a measure of the certainty with which the autoencoder makes its decision is also relevant. This so-called certainty measure supplements the autoencoder output and is particularly relevant if the fusion of the autoencoder outputs across different sensors and/or the temporal fusion of the autoencoder outputs is considered.
Such a certainty measure can be calculated via statistical calibration or uncertainty estimation. The following are suitable for this:
c) Furthermore, a measurement uncertainty can be estimated, e.g., by adding a regularization to the error function (loss; loss2) which measures the measurement uncertainty at runtime.
The extension to ascertain unknown or uncertain scenes by an uncertainty estimate makes it possible to assess scenes in which the classifier does indeed make a correct decision, but in which this decision is associated with a large degree of uncertainty. By adding the uncertainty estimate, the search for unknown scenes is made more robust through the additional search for uncertain scenes. Consequently, both unknown scenes and scenes uncertain for the network can be selected with this architecture.
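One known realization of such a regularization term (named here as an assumption; the disclosure does not specify a method) is a learned aleatoric-uncertainty formulation in which the network predicts a log-variance alongside the reconstruction, so that a measurement uncertainty is available at runtime:

```python
import torch

def loss2_with_uncertainty(x_prime: torch.Tensor,
                           x: torch.Tensor,
                           log_var: torch.Tensor) -> torch.Tensor:
    """Reconstruction loss with an uncertainty regularization: the network
    predicts log_var alongside X'; a large predicted variance down-weights
    the squared error but is penalized by the regularization term."""
    sq_err = (x - x_prime) ** 2
    return torch.mean(0.5 * torch.exp(-log_var) * sq_err + 0.5 * log_var)
```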
As described above, the autoencoder offers the possibility of identifying data which are insufficiently represented in the training data. Combining this principle with the idea of temporal consistency results in further added value. Thus, if the identified data are filtered in terms of time and a distinction is made between continually occurring sample data and merely isolated outliers, valuable additional information is obtained.
Fleeting outliers could, e.g., indicate sensory causes. For example, when entering a tunnel, the white balance of the camera could fluctuate greatly at individual points in time t. These data would be of relevance to the respective sensor development.
If the identified data occur continually, there is a high probability that it is an unknown scenario which is of relevance to the algorithm development.
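A sliding-window filter over the per-frame monitor scores is one simple way to make this distinction (the window length and thresholds are illustrative assumptions):

```python
from collections import deque

class TemporalFilter:
    """Distinguishes continually occurring deviations (likely an unknown
    scenario) from fleeting outliers (possibly a sensory cause)."""
    def __init__(self, window: int = 30, score_threshold: float = 0.1,
                 persistence: float = 0.8):
        self.scores = deque(maxlen=window)
        self.score_threshold = score_threshold
        self.persistence = persistence      # fraction of frames over threshold

    def update(self, score: float) -> str:
        self.scores.append(score)
        over = [s > self.score_threshold for s in self.scores]
        if not any(over):
            return "normal"
        ratio = sum(over) / len(over)
        return "continual" if ratio >= self.persistence else "fleeting_outlier"
```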
The following advantages result from the indicated embodiments and aspects:
The present application is a National Stage Application under 35 U.S.C. § 371 of International Patent Application No. PCT/DE2022/200295 filed on Dec. 12, 2022, and claims priority from German Patent Application No. 10 2021 214 334.2 filed on Dec. 14, 2021, in the German Patent and Trademark Office, the disclosures of which are herein incorporated by reference in their entireties.