This application is the national phase entry of International Application No. PCT/CN2021/119506, filed on Sep. 22, 2021, which is based upon and claims priority to Chinese Patent Application No. 202110718013.2, filed on Jun. 28, 2021, the entire contents of which are incorporated herein by reference.
The present disclosure relates to a three-dimensional (3D) point cloud localization technology, and in particular to a normal distributions transform (NDT) method for LiDAR point cloud localization in unmanned driving.
The normal distributions transform (NDT) algorithm is widely used to register two-dimensional (2D) or three-dimensional (3D) point clouds in smart vehicles, 3D reconstruction, motion estimation, object detection and pose estimation, simultaneous localization and mapping (SLAM), etc. In the field of smart vehicles, localization is the most basic and important task, and the NDT-based localization system is widely used in autonomous driving systems using a lidar as the main sensor.
The NDT algorithm segments a target point cloud into a series of grids and uses a normal distribution to represent the distribution of points in each grid cell. It converts point-cloud-to-point-cloud registration into point-cloud-to-normal-distribution registration, thereby improving the speed and robustness of the algorithm.
The current autonomous driving system usually uses a lidar of 64, 128 or more channels as the main sensor of the localization algorithm. The sampling frequency of the lidar is gradually increased, and will reach 30 Hz in the future. In other words, the number of points input by the lidar per second will be close to one million, which poses a great challenge to the real-time performance of the localization system. A test shows that the traditional NDT algorithm can only reach the input frequency of 2 Hz on an embedded advanced reduced instruction-set computer (RISC) machine (ARM) platform, which is far from meeting the real-time requirement.
To this end, researchers are pursuing two primary goals. The first goal is to reduce the number of search iterations. Reference [1] proposed a multi-layered NDT algorithm to represent point clouds in order to reduce the number of iterations and measure longer distances. However, updating the NDT of all layers requires too much memory and unacceptable time. Reference [2] made improvements on Reference [1] and proposed a key-layered NDT algorithm, which only needs to search the key layer to satisfy the termination conditions of higher layers. Unfortunately, this method cannot meet the real-time requirement. The second goal is to reduce the running time per iteration. References [3] and [4] extended the point-to-distribution (P2D)-NDT to distribution-to-distribution (D2D)-NDT, thereby transforming a set of points into a distribution to reduce the running time per iteration. However, the D2D-NDT method suffers from poor accuracy and slow speed when dealing with massive and non-uniform lidar point clouds in smart vehicles. Reference [5] proposed semantic-assisted (SE)-NDT to classify point clouds and remove dynamic objects by using segmentation methods [6], [7] and [8]. However, the method proposed by Reference [5] can only remove a limited number of points and requires additional time for point cloud segmentation. Overall, these methods lack real-time performance when dealing with large point clouds.
With the increase in the number of point clouds generated by the new generation of lidar and the increase in the running speed of autonomous vehicles, it is increasingly difficult for the NDT algorithm running on the vehicle-mounted industrial computer to meet the real-time requirement. In addition, the strict constraint on the use of batteries by smart vehicles also puts forward an energy consumption requirement for the NDT algorithm. In this context, it is necessary to design an improved NDT algorithm, based on the distribution of laser point clouds in an autonomous driving scenario, to accelerate the localization process through an energy-efficient heterogeneous computing device, such as a field-programmable gate array (FPGA) or a graphics processing unit (GPU).
In order to meet the requirements of unmanned driving for real-time and efficient localization data, the present disclosure proposes a normal distributions transform (NDT) method for LiDAR point cloud localization in unmanned driving. The present disclosure establishes a non-recursive, memory-efficient data structure occupation-aware-voxel-structure (OAVS), which meets the real-time and high-precision requirements of smart vehicles for three-dimensional (3D) lidar localization.
The present disclosure adopts the following technical solution: an NDT method for LiDAR point cloud localization in unmanned driving. The method specifically includes the following steps:
where $(\hat{p}_{Mi}-\mu_{oj})^{T}\Sigma_{oj}^{-1}(\hat{p}_{Mi}-\mu_{oj})$ represents the transpose of $(\hat{p}_{Mi}-\mu_{oj})$ multiplied by the inverse of $\Sigma_{oj}$ and then by $(\hat{p}_{Mi}-\mu_{oj})$; $\mu_{oj}$ represents the mean of the class j in the sub-voxel o; and $\Sigma_{oj}$ represents the variance of the class j in the sub-voxel o;
according to the probability calculated in step 2.3.2), where the pose transformation matrix T is solved by a Newton-Gauss iteration method to obtain an optimal pose transformation matrix T and an optimal matching score;
Further, in step 2.2), while the NDT information is being acquired, the segmented fixed point cloud may be streamed into an FPGA accelerator to establish the data structure OAVS, that is, to perform step 1), and to calculate the NDT information of each sub-voxel.
Further, in step 2.3), in the registration stage, the segmented moving point cloud may be streamed into the FPGA accelerator and processed efficiently as a pipelined data stream, so as to obtain the gradient matrix and the Hessian matrix.
The present disclosure has the following beneficial effects. The present disclosure proposes a non-recursive, memory-efficient data structure OAVS that speeds up each search operation. Compared with a tree-based structure, the proposed data structure OAVS is easy to parallelize and consumes only about 1/10 of the memory. Based on the data structure OAVS, the present disclosure proposes a semantic-assisted OAVS-based (SEO)-NDT algorithm, which significantly reduces the number of search operations, redefines a parameter affecting the number of search operations, and removes a redundant search operation. In addition, the present disclosure proposes a streaming FPGA accelerator architecture, which further improves the real-time and energy-saving performance of the SEO-NDT algorithm.
The FIGURE is a diagram of a collaboration framework of a normal distributions transform (NDT) method for LiDAR point cloud localization in unmanned driving according to the present disclosure.
The present disclosure will be described in detail below with reference to the drawings and specific embodiments. The embodiments are implemented on the premise of the technical solutions of the present disclosure. The following presents detailed implementations and specific operation processes. The protection scope of the present disclosure, however, is not limited to the following embodiments.
The present disclosure provides a normal distributions transform (NDT) method for LiDAR point cloud localization in unmanned driving, which is implemented on a field-programmable gate array (FPGA) through a high-level synthesis tool. The present disclosure establishes a non-recursive, memory-efficient data structure occupation-aware-voxel-structure (OAVS), a real-time semantic-assisted OAVS-based (SEO)-NDT algorithm based on OAVS, and a streaming FPGA accelerator architecture. The method specifically includes the following steps:
and the variance is
the subscript o represents an o-th sub-voxel; the subscript j represents a j-th class; the subscript k represents a k-th point in the sub-voxel; $p_{koj}$ represents the coordinates (x, y, z) of the k-th point in the point set of the j-th class in the o-th sub-voxel; m represents the number of points in the point set of the j-th class in the o-th sub-voxel; and $(p_{koj}-\mu_{oj})^{T}$ represents the transpose of $(p_{koj}-\mu_{oj})$. If a sub-voxel contains point clouds of multiple classes, the NDT information is calculated for the point set of every class in that sub-voxel. That is, the number of means and variances obtained equals the number of point cloud classes in the sub-voxel.
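By way of a non-limiting illustration, the per-class NDT information of each sub-voxel may be sketched in software as follows. The sketch is not the disclosed OAVS or FPGA implementation. It assumes a uniform grid of edge length `voxel_size`, integer class labels supplied by a prior segmentation step, and the unbiased 1/(m−1) normalization for the variance; the function name `ndt_info` and the choice to skip groups with fewer than two points are likewise hypothetical.

```python
from collections import defaultdict

import numpy as np


def ndt_info(points, labels, voxel_size):
    """Return {(voxel_index, class_id): (mean, variance)} for a labelled cloud.

    voxel_size, the uniform grid, and the 1/(m-1) normalization are
    assumptions made for this sketch only.
    """
    points = np.asarray(points, dtype=float)
    # Integer grid index of the sub-voxel that contains each point.
    voxel_idx = np.floor(points / voxel_size).astype(int).tolist()

    # Group points by (sub-voxel o, class j) with a flat, non-recursive lookup.
    groups = defaultdict(list)
    for idx, cls, p in zip(voxel_idx, labels, points):
        groups[(tuple(idx), cls)].append(p)

    info = {}
    for key, pts in groups.items():
        pts = np.array(pts)
        m = len(pts)
        if m < 2:
            # A variance cannot be estimated from a single point (assumption: skip).
            continue
        mu = pts.mean(axis=0)            # mean of class j in sub-voxel o
        diff = pts - mu
        sigma = diff.T @ diff / (m - 1)  # 3x3 variance of class j in sub-voxel o
        info[key] = (mu, sigma)
    return info
```

In this sketch a sub-voxel that contains two classes yields two dictionary entries, one mean and one variance per class, which mirrors the statement that the number of means and variances equals the number of classes in the sub-voxel.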
where $(\hat{p}_{Mi}-\mu_{oj})^{T}\Sigma_{oj}^{-1}(\hat{p}_{Mi}-\mu_{oj})$ represents the transpose of $(\hat{p}_{Mi}-\mu_{oj})$ multiplied by the inverse of $\Sigma_{oj}$ and then by $(\hat{p}_{Mi}-\mu_{oj})$.
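As a hedged illustration of the quadratic form described above, the following sketch evaluates it for one transformed point against the mean and variance of one class in one sub-voxel. The function names are hypothetical, and the score `exp(-q/2)` is the usual unnormalized Gaussian form, assumed here because the probability expression itself is not reproduced in this text.

```python
import numpy as np


def mahalanobis_sq(p_hat, mu, sigma):
    """Quadratic form (p_hat - mu)^T Sigma^{-1} (p_hat - mu)."""
    d = np.asarray(p_hat, dtype=float) - np.asarray(mu, dtype=float)
    # Solve Sigma x = d rather than forming the explicit inverse.
    return float(d @ np.linalg.solve(np.asarray(sigma, dtype=float), d))


def gaussian_score(p_hat, mu, sigma):
    """Unnormalized Gaussian score exp(-q/2); the exact scaling is an assumption."""
    return float(np.exp(-0.5 * mahalanobis_sq(p_hat, mu, sigma)))
```

Solving the linear system instead of inverting the matrix is a design choice of the sketch; a sub-voxel whose variance is singular (for example, perfectly planar points) would need regularization before this call.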
according to the probability, where the pose transformation matrix T is solved by a Newton-Gauss iteration method to obtain an optimal pose transformation matrix T and an optimal matching score.
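The iterative solve may be illustrated, without limitation, by a generic Newton-type update that uses a gradient and a Hessian. The sketch assumes the pose is expressed as a parameter vector `x` and that the objective is written as a function to be minimized (for example, a negated matching score); `grad_fn`, `hess_fn`, the small `damping` term and the stopping tolerance are hypothetical and are not taken from the disclosure.

```python
import numpy as np


def newton_step(gradient, hessian, damping=1e-6):
    """Return delta solving (H + damping * I) delta = -g."""
    g = np.asarray(gradient, dtype=float)
    H = np.asarray(hessian, dtype=float)
    # The damping term keeps the system solvable when H is near-singular.
    return np.linalg.solve(H + damping * np.eye(len(g)), -g)


def minimise(grad_fn, hess_fn, x0, tol=1e-8, max_iter=50):
    """Iterate Newton-type updates until the step is smaller than tol."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        delta = newton_step(grad_fn(x), hess_fn(x))
        x = x + delta
        if np.linalg.norm(delta) < tol:
            break
    return x
```

In an NDT setting, `grad_fn` and `hess_fn` would return the gradient and Hessian accumulated over all points of the moving point cloud at the current pose estimate.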
The pseudocode of the algorithm is as follows:
The streaming FPGA accelerator architecture is shown in the FIGURE. While the NDT information is being acquired, the segmented fixed point cloud is streamed into an FPGA accelerator to establish the data structure OAVS and calculate the NDT information of each sub-voxel. In the registration stage, the segmented moving point cloud is streamed into the FPGA accelerator and processed efficiently as a pipelined data stream, so as to obtain the gradient matrix and the Hessian matrix.
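As a software analogy only, and not a description of the FPGA design, the streaming calculation of the NDT information can be pictured as a one-pass accumulator that consumes points one at a time and never stores the point set. The class name `StreamingGaussian`, the running-sum formulation and the 1/(m−1) normalization are assumptions of this sketch.

```python
import numpy as np


class StreamingGaussian:
    """One-pass accumulator for the mean and variance of streamed 3D points."""

    def __init__(self):
        self.m = 0                  # number of points seen so far
        self.s = np.zeros(3)        # running sum of p
        self.ss = np.zeros((3, 3))  # running sum of p p^T

    def push(self, p):
        """Consume one point; constant work and constant memory per point."""
        p = np.asarray(p, dtype=float)
        self.m += 1
        self.s += p
        self.ss += np.outer(p, p)

    def result(self):
        """Return (mean, variance) using the 1/(m-1) normalization."""
        if self.m < 2:
            raise ValueError("at least two points are required")
        mu = self.s / self.m
        sigma = (self.ss - self.m * np.outer(mu, mu)) / (self.m - 1)
        return mu, sigma
```

Each `push` touches only fixed-size state, which is the property that makes this kind of update easy to pipeline. The running-sum form can lose precision through cancellation when coordinates are large relative to their spread, so a practical design might subtract a per-voxel origin first.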
The above described are merely several embodiments of the present disclosure. Although these embodiments are described specifically and in detail, they should not be construed as a limitation to the patent scope of the present disclosure. It should be noted that those of ordinary skill in the art may further make several variations and improvements without departing from the idea of the present disclosure, but such variations and improvements should all fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure should be subject to the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
202110718013.2 | Jun 2021 | CN | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/CN2021/119506 | 9/22/2021 | WO |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2023/272964 | 1/5/2023 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
11010928 | Mammou | May 2021 | B2 |
20210103780 | Mammou | Apr 2021 | A1 |
20210264640 | Mammou | Aug 2021 | A1 |
Number | Date | Country |
---|---|---|
111860340 | Oct 2020 | CN |
111949943 | Nov 2020 | CN |
112946681 | Jun 2021 | CN |
Entry |
---|
Srinara et al. “Performance Analysis of 3D NDT Scan Matching for Autonomous Vehicles Using INS/GNSS/3D LiDAR-SLAM Integration Scheme” IEEE (Year: 2021). |
Zhou et al. “NDT-Transformer: Large-Scale 3D Point Cloud Localisation using the Normal Distribution Transform Representation”, IEEE (Year: 2021). |
Javanmardi et al. “Autonomous vehicle self localization based on probabilistic planar surface map and multichannel LiDAR in urban area” IEEE (Year: 2017). |
Qi Deng, et al., An Optimized FPGA-Based Real-Time NDT for 3D-LiDAR Localization in Smart Vehicles, IEEE Transactions on Circuits and Systems—II: Express Briefs, 2021, pp. 3167-3171, vol. 68, No. 9. |
Cihan Ulas, et al., A Fast and Robust Feature-Based Scan-Matching Method in 3D SLAM and the Effect of Sampling Strategies, International Journal of Advanced Robotic Systems, 2013, pp. 1-16, vol. 10, 396. |
Hyunki Hong, et al., Key-layered normal distributions transform for point cloud registration, Electronics Letters, 2015, pp. 1986-1988, vol. 51, No. 24. |
Todor Stoyanov, et al., Fast and accurate scan registration through minimization of the distance between compact 3D NDT representations, The International Journal of Robotics Research, 2012, pp. 1-17. |
Siew-Kei Lam, et al., Area-Time Efficient Streaming Architecture for FAST and BRIEF Detector, IEEE Transactions on Circuits and Systems—II: Express Briefs, 2018. |
Anestis Zaganidis, et al., Semantically Assisted Loop Closure in SLAM Using NDT Histograms, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2019, pp. 4562-4568. |
Lin Bai, et al., RoadNet-RT: High Throughput CNN Architecture and SoC Design for Real-Time Road Segmentation, IEEE Transactions on Circuits and Systems—I: Regular Papers, 2021, pp. 704-714, vol. 68, No. 2. |
Xuepeng Chang, et al., A Mixed-Pruning Based Framework for Embedded Convolutional Neural Network Acceleration, IEEE Transactions on Circuits and Systems—I: Regular Papers, 2021, pp. 1706-1715, vol. 68, No. 4. |
Hongtu Zhang, et al., A 55nm, 0.4V 5526-TOPS/W Compute-in-Memory Binarized CNN Accelerator for AIoT Applications, IEEE Transactions on Circuits and Systems—II: Express Briefs, 2021, pp. 1695-1699, vol. 68, No. 5. |
Number | Date | Country | |
---|---|---|---|
20230192123 A1 | Jun 2023 | US |