The disclosure relates to a sensing apparatus. More specifically, the disclosure relates to a sensing apparatus and a method for collecting and auto-labelling measurement data in a traffic scenario involving one or more vehicles.
Autonomous self-driving is being deployed by several car manufacturers. A self-driving vehicle comprises sensors such as cameras, radio detection and ranging (radar) sensors, light detection and ranging (lidar) sensors, Global Positioning System (GPS) sensors and the like. These sensors create large amounts of data.
Lidar and radar sensors usually generate un-labelled raw point cloud data that needs to be processed by various algorithms for, among other purposes, object detection and recognition. Developing and evaluating the performance of such algorithms may involve the use of ground truth information for each point cloud. A labelled point cloud may be used to determine whether a given point of the point cloud is associated with, for instance, a car, bus, pedestrian, motorcycle or another type of object. Simulated environments based on mathematical models do not fully reflect the real reflectivity properties of surfaces when a radar or lidar based algorithm is evaluated.
Therefore, radar or lidar based algorithms are assessed with a labelled point cloud in order to ensure an objective performance evaluation, without having to rely only on human perception for evaluation and comparison. Thus, in a traffic scenario it is a challenge to collect and generate a labelled point cloud dataset captured through radar or lidar sensors in an automated manner, and to generate the ground truth information necessary for objectively evaluating the performance of a radar or lidar based algorithm.
In conventional approaches to point cloud processing, the performance evaluation relies on the human eye, comparing detected objects to a camera feed.
Stephan Richter et al., “Playing for Data: Ground Truth from Computer Games”, TU Darmstadt and Intel Labs, 2016, (link: http://download.visinf.tu-darmstadt.de/data/from_games/) discloses using a labelled point cloud dataset, where the data are synthesized from a computer game and where the ground truth and identity of each object are generated from the simulator. Then, based on mathematical models, a radar or lidar point cloud is generated from the identified objects in order to develop an appropriate algorithm for each sensor type (see Xiangyu Yue et al., “A LiDAR Point Cloud Generator: from a Virtual World to Autonomous Driving”, June 2018; https://par.nsf.gov/servlets/purl/10109208). Furthermore, algorithms for self-driving cars are also tested using the simulated environment provided by a computer game (see Mark Martinez, “Beyond Grand Theft Auto V for Training, Testing and Enhancing Deep Learning in Self Driving Cars”, Master's thesis, Princeton University, June 2018). However, simulated radar and lidar data are based on mathematical models that try to mimic electromagnetic wave propagation in a real-life traffic scenario. These models rely on numerous assumptions and simplifications that render synthetic data different from real-life measurements, especially in complex environments, e.g. environments with multiple propagation paths and reflective structures.
The generation of reflected signals in a multipath propagation environment is mainly based on ray tracing techniques, where space is discretized in multiple paths selected based on the primary detected objects. This discretization provides a limited view of what is really reflected, because small objects (of interest for radar systems) have a non-negligible impact (e.g. in discretized ray tracing techniques road borders are neglected, while buildings are not). In addition, when these reconstruction techniques are used, many assumptions about the type of materials are made and the closest permittivity and permeability are selected among a pool of available values. All these approximations add an extra layer of uncertainty and error to the simulated reflected signals/data, which renders the obtained results very far from reality.
In Yan Wang et al. “Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving”, (Conference on Computer Vision and Pattern Recognition (CVPR) 2019 in Long Beach, Calif., Jun. 16-20 2019) the lidar signal/data is mimicked from image input in order to apply a lidar based algorithm for object detection and identification.
A stereoscopic camera was used in Yan Wang et al. “Anytime Stereo Image Depth Estimation on Mobile Devices”, May 2019 (https://ieeexplore.ieee.org/abstract/document/8794003/) in order to test the depth estimation and compare it to lidar measurements. The point cloud was used here to determine a distance ground truth.
In Yan Wang et al. “PointSeg: Real-Time Semantic Segmentation Based on 3D LiDAR Point Cloud”, September 2018 (https://arxiv.org/abs/1807.06288) a convolutional neural network is applied to a spherical image generated from a dense 3D lidar point cloud. The machine learning algorithm was trained with spherical images and labelled based on a mask dataset generated for images.
KR1020010003423 discloses an apparatus and method for generating object label images in a video sequence not making use of radar or lidar data.
CN108921925A discloses object identification by applying data fusion between camera and lidar data. The lidar data is labelled after processing, i.e. a high-level labelling is performed.
It is an object of the disclosure to provide a sensing apparatus and method that allow the un-labelled point cloud data provided by radar and/or lidar sensors in a traffic scenario involving one or more vehicles to be labelled accurately.
The foregoing and other objects are achieved by the subject matter of the independent claims. Further implementation forms are apparent from the dependent claims, the description and the figures.
Generally, the disclosure provides a sensing apparatus and method for an automatic labelling of collected low-level, e.g. raw point cloud data generated by radar or lidar sensors in a traffic scenario involving one or more vehicles. The sensing apparatus may be implemented as a component of one of the vehicles involved in the traffic scenario or as a stand-alone unit. The sensing apparatus and method take advantage of external resources of information/data that may be collected by means of other sensors available on the vehicle, such as, but not limited to, image capturing sensors, such as single/multiple, simple/stereoscopic cameras, internal sensors such as, but not limited to, accelerometers, magnetometers, gyroscope sensors, odometers, GPS sensors, or sensors for assessing the wireless communication infrastructure in the environment of the traffic scenario.
In an example, according to a first aspect the disclosure relates to a sensing apparatus, comprising one or more radar and/or lidar sensors configured to collect a plurality of position, e.g. distance and/or direction measurement values for a plurality of objects associated with a traffic scenario in the vicinity of the apparatus; and a processing circuitry configured to obtain auxiliary data associated with one or more of the plurality of objects in the vicinity of the apparatus and to assign, e.g. map a respective position measurement value of the plurality of position measurement values to a respective object of the plurality of objects in the vicinity of the apparatus on the basis of the auxiliary data. The sensing apparatus may be implemented as a component of a vehicle, e.g. a car. Advantageously, the sensing apparatus allows taking advantage of additional resources of information for labelling the raw data, e.g. the plurality of measurement values for a plurality of objects associated with the traffic scenario in the vicinity of the apparatus.
In a further possible implementation form of the first aspect, the auxiliary data comprises one or more images of the one or more of the plurality of objects in the vicinity of the apparatus. Advantageously, this allows the sensing apparatus to implement efficient image processing techniques for identifying the objects in the vicinity of the apparatus in the one or more images and mapping the plurality of position measurement values to the identified objects.
In a further possible implementation form of the first aspect, the sensing apparatus further comprises one or more cameras configured to capture the one or more images of the one or more of the plurality of objects in the vicinity of the apparatus. Advantageously, this allows the sensing apparatus to be easily integrated in an already existing hardware structure of a vehicle including one or more cameras, such as a dashboard camera of the vehicle.
In a further possible implementation form of the first aspect, the one or more cameras comprise a stereoscopic camera configured to capture the one or more images as one or more stereoscopic images of the one or more of the plurality of objects in the vicinity of the apparatus and/or an omnidirectional camera configured to capture the one or more images as one or more omnidirectional images of the one or more of the plurality of objects in the vicinity of the apparatus. In case of a stereoscopic camera, this allows the sensing apparatus to determine a distance of the identified object as well and, therefore, to provide a more accurate mapping of the plurality of position measurement values to the identified objects. In case of an omnidirectional camera, the sensing apparatus may identify all or nearly all objects in the vicinity of the sensing apparatus and, thereby, provide a more complete mapping of the plurality of position measurement values to the identified objects.
In a further possible implementation form of the first aspect, the processing circuitry is configured to determine on the basis of the one or more images a respective auxiliary position, e.g. distance and/or direction value for a respective object of the one or more of the plurality of objects in the vicinity of the apparatus and to assign a respective position measurement value of the plurality of position measurement values to a respective object of the plurality of objects in the vicinity of the apparatus on the basis of the respective auxiliary position value of the respective object of the one or more of the plurality of objects in the vicinity of the apparatus. Advantageously, this allows the sensing apparatus to provide a more accurate mapping of the plurality of position measurement values to the identified objects in the vicinity of the apparatus.
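By way of illustration only (not part of the claimed subject matter), the assignment of position measurement values on the basis of auxiliary position values may be sketched as a nearest-neighbour matching; the function names, the two-dimensional data layout and the Euclidean metric below are assumptions made for the sketch:

```python
import math

def assign_measurements(measurements, aux_positions):
    """Map each radar/lidar position measurement (x, y) to the label of the
    identified object whose camera-derived auxiliary position is closest.

    measurements  : list of (x, y) tuples from the radar/lidar point cloud
    aux_positions : dict mapping object label -> (x, y) auxiliary position
    Returns a list of (measurement, label) pairs.
    """
    labelled = []
    for m in measurements:
        # Pick the object with minimal Euclidean distance to the measurement.
        label = min(aux_positions,
                    key=lambda k: math.dist(m, aux_positions[k]))
        labelled.append((m, label))
    return labelled
```

Any other distance metric or gating rule could be substituted for the Euclidean minimum shown here.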
In a further possible implementation form of the first aspect, the processing circuitry is further configured to identify on the basis of the one or more images a respective object of the one or more of the plurality of objects in the vicinity of the apparatus. Advantageously, this allows the sensing apparatus to implement efficient image processing techniques for identifying the objects in the vicinity of the apparatus in the one or more images and mapping the plurality of position measurement values to the identified objects.
In a further possible implementation form of the first aspect, the processing circuitry is further configured to implement a neural network for identifying on the basis of the one or more images a respective object of the one or more of the plurality of objects in the vicinity of the apparatus. Advantageously, this allows the neural network implemented by the sensing apparatus to be trained in advance on the basis of training data and/or in use on the basis of real data and, thereby, provide a more accurate object identification.
In a further possible implementation form of the first aspect, the processing circuitry is further configured to determine on the basis of the one or more images a respective angular extension value of a respective object of the one or more of the plurality of objects in the vicinity of the apparatus and to assign a respective position measurement value of the plurality of position measurement values to a respective object of the plurality of objects in the vicinity of the apparatus on the basis of the respective angular extension value of the respective object of the one or more of the plurality of objects in the vicinity of the apparatus. Advantageously, this allows the sensing apparatus to provide a more accurate mapping of the plurality of position measurement values to the identified objects in the vicinity of the apparatus.
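As an illustrative sketch of the angular extension criterion described above (the sensor position at the origin and the degree-based parameters are assumptions of the sketch), a position measurement may be gated against an object's angular extent as follows:

```python
import math

def within_angular_extent(point, obj_direction_deg, obj_extent_deg):
    """Check whether a measured point (x, y) falls inside an object's
    angular extension as seen from the sensor at the origin.

    obj_direction_deg : centre bearing of the object (degrees)
    obj_extent_deg    : total angular width of the object (degrees)
    """
    bearing = math.degrees(math.atan2(point[1], point[0]))
    # Wrap the difference into [-180, 180) before comparing to the half-width.
    diff = (bearing - obj_direction_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= obj_extent_deg / 2.0
```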
In a further possible implementation form of the first aspect, the one or more images comprise a temporal sequence of images of the one or more of the plurality of objects in the vicinity of the apparatus, wherein the one or more radar and/or lidar sensors are further configured to collect based on the Doppler effect a plurality of velocity measurement values for the plurality of objects in the vicinity of the apparatus, wherein the processing circuitry is further configured to determine on the basis of the temporal sequence of images a respective auxiliary velocity value of a respective object of the one or more of the plurality of objects in the vicinity of the apparatus and to assign a respective position measurement value of the plurality of position measurement values to a respective object of the plurality of objects in the vicinity of the apparatus on the basis of the plurality of velocity measurement values for the plurality of objects in the vicinity of the apparatus and the respective auxiliary velocity value of the respective object of the one or more of the plurality of objects in the vicinity of the apparatus. Advantageously, this allows the sensing apparatus to provide a more accurate mapping of the plurality of position measurement values to the identified objects in the vicinity of the apparatus.
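The joint use of position measurements and Doppler velocity measurements described above may be illustrated as a combined assignment cost; the weighting parameters and data layout are assumptions of this sketch, not features of the disclosure:

```python
import math

def joint_cost(meas_pos, meas_vel, aux_pos, aux_vel, w_pos=1.0, w_vel=1.0):
    """Combined assignment cost mixing the position residual with the
    residual between the Doppler velocity measurement and the
    image-derived auxiliary velocity; w_pos and w_vel are free weights."""
    return w_pos * math.dist(meas_pos, aux_pos) + w_vel * abs(meas_vel - aux_vel)

def assign_with_velocity(measurements, objects):
    """measurements : list of ((x, y), radial_velocity) tuples
    objects        : dict mapping label -> ((x, y), auxiliary_velocity)
    Returns one label per measurement, minimizing the joint cost."""
    return [min(objects, key=lambda k: joint_cost(p, v, *objects[k]))
            for p, v in measurements]
```

A measurement that lies between two objects in position can thus still be assigned correctly when its measured velocity matches only one of them.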
In a further possible implementation form of the first aspect, the auxiliary data comprises data provided by an accelerometer sensor, a magnetometer sensor, a gyroscope sensor, an odometer sensor, a GPS sensor, an ultrasonic sensor, and/or a microphone sensor, map data of the vicinity of the apparatus, and/or network coverage data in the vicinity of the apparatus. These sensors may be implemented as a component of the sensing apparatus or as a component of the vehicle the sensing apparatus is implemented in. Advantageously, this allows the sensing apparatus to be easily integrated in an already existing hardware structure of a vehicle including one or more of these sensors.
According to a second aspect the disclosure relates to a sensing method, comprising the steps of collecting by one or more radar and/or lidar sensors of an apparatus a plurality of position, e.g. distance and/or direction measurement values for a plurality of objects of a traffic scenario in the vicinity of the apparatus; obtaining auxiliary data associated with one or more of the plurality of objects in the vicinity of the apparatus; and assigning, e.g. mapping a respective position measurement value of the plurality of position measurement values to a respective object of the plurality of objects in the vicinity of the apparatus on the basis of the auxiliary data.
The sensing method according to the second aspect of the disclosure can be performed by the sensing apparatus according to the first aspect of the disclosure. Further features of the sensing method according to the second aspect of the disclosure result directly from the functionality of the sensing apparatus according to the first aspect of the disclosure and its different implementation forms described above and below.
According to a third aspect the disclosure relates to a computer program comprising program code which causes a computer or a processor to perform the method according to the second aspect when the program code is executed by the computer or the processor. The computer program may be stored on a non-transitory computer-readable storage medium of a computer program product. The different aspects of the disclosure can be implemented in software and/or hardware.
Details of one or more embodiments are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description, drawings, and claims.
In the following, embodiments of the disclosure are described in more detail with reference to the attached figures and drawings.
In the following, identical reference signs refer to identical or at least functionally equivalent features.
In the following description, reference is made to the accompanying figures, which form part of the disclosure, and which show, by way of illustration, aspects of embodiments of the disclosure or aspects in which embodiments of the present disclosure may be used. It is understood that embodiments of the disclosure may be used in other aspects and comprise structural or logical changes not depicted in the figures. The following detailed description, therefore, is not to be taken in a limiting sense, and the scope of the present disclosure is defined by the appended claims.
For instance, it is to be understood that a disclosure in connection with a described method may also hold true for a corresponding device or system configured to perform the method and vice versa. For example, if one or a plurality of method steps are described, a corresponding device may include one or a plurality of units, e.g. functional units, to perform the described one or plurality of method steps (e.g. one unit performing the one or plurality of steps, or a plurality of units each performing one or more of the plurality of steps), even if such one or more units are not explicitly described or illustrated in the figures. On the other hand, for example, if an apparatus is described based on one or a plurality of units, e.g. functional units, a corresponding method may include one step to perform the functionality of the one or plurality of units (e.g. one step performing the functionality of the one or plurality of units, or a plurality of steps each performing the functionality of one or more of the plurality of units), even if such one or plurality of steps are not explicitly described or illustrated in the figures. Further, it is understood that the features of the various exemplary embodiments and/or aspects described herein may be combined with each other, unless noted otherwise.
As illustrated in
For collecting data about the respective positions of the plurality of objects 107 involved in the traffic scenario 100 the sensing apparatus 101 comprises one or more radar and/or lidar sensors 103. In the embodiment shown in
Moreover, the sensing apparatus 101 comprises a processing circuitry 102 configured to perform, conduct or initiate various operations of the sensing apparatus 101 described in the following. The processing circuitry may comprise hardware and software. The hardware may comprise analog circuitry or digital circuitry, or both analog and digital circuitry. The digital circuitry may comprise components such as application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), or multi-purpose processors. In one embodiment, the processing circuitry comprises one or more processors and a non-transitory memory connected to the one or more processors. The non-transitory memory may carry executable program code which, when executed by the one or more processors, causes the apparatus 101 to perform, conduct or initiate the operations or methods described below.
In particular, the processing circuitry 102 is configured to obtain auxiliary data associated with the plurality of objects 107 in the vicinity of the car 106 and to assign, e.g. map a respective position measurement value of the plurality of position measurement values provided by the radar and/or lidar sensors 103 to a respective object of the plurality of objects 107, as will be described in more detail further below.
In the embodiment shown in
In the embodiment shown in
In a further exemplary embodiment shown in
The radar and/or lidar measurements and the auxiliary data, for instance, image data constitute two synchronized sets of data, namely a first set consisting of a random set of sparse data acquisitions/measurements provided by the radar and/or lidar sensors 103 and a second set consisting of the auxiliary data, e.g. a sequence of images provided by the cameras 105 and containing information about the plurality of objects 107 involved in the traffic scenario 100 in the vicinity of the car 106. According to an exemplary embodiment, the processing circuitry 102 of the sensing apparatus 101 may be configured to identify and label the sparse point cloud data by implementing the following processing stages.
1. Processing the image feeds constituting the auxiliary data in order to identify the position and type of each object 107 in the vicinity of the car 106;
2. Superposing the map of identified objects through the processing of the camera feed with the synchronized acquired point cloud data provided by the radar and/or lidar sensors 103;
3. Identifying a mapping between the point cloud elements and the objects 107 identified through the image processing in stage 1; and
4. Labelling the point cloud accordingly.
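Stages 2 to 4 above may be sketched, purely for illustration, as a superposition of the camera-derived object detections on the synchronized point cloud; the `match` predicate stands in for whatever gating rule (distance, angle, velocity) an embodiment uses, and all names here are hypothetical:

```python
def label_point_cloud(points, detections, match):
    """Superpose detections on the synchronized point cloud (stage 2),
    identify a mapping between point cloud elements and detected objects
    (stage 3), and label each point accordingly (stage 4).

    points     : iterable of point cloud elements
    detections : list of (label, gate_params) produced by stage 1
    match      : predicate match(point, gate_params) -> bool
    Points matching no detection are labelled 'unknown'.
    """
    labelled = {}
    for p in points:
        labelled[p] = next(
            (label for label, gate in detections if match(p, gate)),
            "unknown",
        )
    return labelled
```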
Although in the example described above the auxiliary data comprises image data of the objects 107 in the vicinity of the car 106, it will be appreciated that other types of data providing information about the objects 107 in the vicinity of the car 106 may be used as auxiliary data in addition to or instead of the image data. For instance, the auxiliary data may be obtained by the sensing apparatus 101 at the level of the car 106, such as odometry data, positioning data provided by external sources such as maps, and/or wireless network information, such as wireless network heatmaps providing information about wireless network coverage. According to an embodiment, any data may be used as auxiliary data for labelling the data points provided by the radar and/or lidar sensors 103, provided the data has the following properties:
1. The data is or can be synchronized with the point cloud data acquired by the radar and/or lidar sensors 103.
2. The data can be efficiently processed by the processing circuitry 102 of the sensing apparatus 101 using suitable processing techniques that provide a reliable recognition of the objects 107 in the vicinity of the car 106.
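For instance, the synchronization required by property 1 could be sketched as a nearest-timestamp pairing; the tolerance parameter and the sorted-list data layout are assumptions made only for this illustration:

```python
import bisect

def synchronize(cloud_stamps, aux_stamps, tolerance):
    """Pair each point cloud timestamp with the nearest auxiliary-data
    timestamp, discarding pairs farther apart than `tolerance`.
    Both input lists are assumed sorted ascending.
    Returns index pairs (i_cloud, i_aux)."""
    pairs = []
    for i, t in enumerate(cloud_stamps):
        j = bisect.bisect_left(aux_stamps, t)
        # The nearest neighbour is either aux_stamps[j-1] or aux_stamps[j].
        candidates = [k for k in (j - 1, j) if 0 <= k < len(aux_stamps)]
        best = min(candidates, key=lambda k: abs(aux_stamps[k] - t))
        if abs(aux_stamps[best] - t) <= tolerance:
            pairs.append((i, best))
    return pairs
```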
As will be appreciated, the above exemplary embodiment may be extended to multiple sources and/or types of auxiliary data, irrespective of whether they are of the same type or heterogeneous in nature. The various sources of auxiliary data may be treated as complementary in order to enhance the coverage, the granularity and/or the quality of the detection through data fusion techniques.
Using one or more of the techniques described above, the sensing apparatus 101 allows generating a database associated with a real-world traffic scenario 100 with real-world data containing point cloud information that is labelled based on reliable identification techniques. The generated database may be used, for instance, for point cloud algorithm design with an embedded reliable baseline that provides objective performance evaluation. It should be noted that the sensing apparatus 101 provides for an automated point cloud labelling at low level, e.g. labelling raw data, using the auxiliary data. The sensing apparatus 101 does not process the point cloud data provided by the radar and/or lidar sensors 103 for object identification; rather, only the auxiliary data, e.g. information from sources other than the radar and/or lidar sensors 103, are taken into account for object identification and labelling of the point cloud on the basis thereof.
These images are fed to a machine learning algorithm for object detection and classification as implemented by processing block 303 of
According to a further embodiment, the processing circuitry 102 of the apparatus 101 may be further configured to determine the relative speed and the radial speed of an identified object 107 relative to the car 106 and, consequently, the apparatus 101, by measuring the change of the distance of an identified object 107 from the car 106 in consecutive image frames.
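The speed determination from consecutive image frames described above amounts to a finite-difference estimate; the following sketch assumes per-frame distances of a tracked object and a fixed frame interval, which are illustrative simplifications:

```python
def radial_speed(distances, frame_dt):
    """Finite-difference estimate of radial speed from the per-frame
    distances (metres) of a tracked object relative to the car;
    frame_dt is the time between consecutive frames in seconds.
    Positive values mean the object is moving away."""
    return [(d1 - d0) / frame_dt for d0, d1 in zip(distances, distances[1:])]
```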
Once the distance and the speed have been determined for each of the detected objects 107, the processing circuitry 102 of the apparatus 101 is configured to map the point cloud of data obtained from the radar and/or lidar sensors 103 in processing block 302 of
As will be appreciated and as already mentioned above, in the exemplary embodiment shown in
In the following, two exemplary embodiments will be described in the context of
In the first exemplary embodiment, which will be described in more detail in the context of
Using the techniques described above, in particular in the context of
According to an embodiment, the processing circuitry 102 is further configured to determine on the basis of these objects 107 and their position in the image, as illustrated in
As will be appreciated, the processing techniques employed in the first exemplary embodiment may be enhanced by more advanced processing techniques, such as by using multiple images of the traffic scenario 100 in the vicinity of the car 106 from more than one camera 105 and/or by using cross image object tracking for consistency and ease of detection. This may also be helpful for handling hidden objects to the camera(s) 105, but visible to the radar sensors 103.
The second exemplary embodiment, which will be described in more detail in the context of
The person skilled in the art will understand that the “blocks” (“units”) of the various figures (method and apparatus) represent or describe functionalities of embodiments of the disclosure (rather than necessarily individual “units” in hardware or software) and thus describe equally functions or features of apparatus embodiments as well as method embodiments (unit = step).
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the described apparatus embodiment is merely exemplary. For example, the unit division is merely logical function division and may be other division in actual implementation. For example, a plurality of units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented by using some interfaces. The indirect couplings or communication connections between the apparatuses or units may be implemented in electronic, mechanical, or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, functional units in the embodiments of the disclosure may be integrated into one processing unit, or each of the units may exist alone physically, or two or more units are integrated into one unit.
This application is a continuation application of International Patent Application No. PCT/CN2019/123052, filed on Dec. 4, 2019, the disclosure of which is hereby incorporated by reference in its entirety.
 | Number | Date | Country
---|---|---|---
Parent | PCT/CN2019/123052 | Dec 2019 | US
Child | 17830987 | | US