The present application generally relates to autonomous vehicles and, more particularly, to object detection and tracking with a deep neural network (DNN) fused with depth clustering in light detection and ranging (LIDAR) point clouds.
Some vehicles are equipped with an advanced driver assistance system (ADAS) or autonomous driving system that is configured to perform one or more assistance or autonomous driving features (adaptive cruise control, lane centering, collision avoidance, etc.). Many of these features utilize deep neural networks (DNNs) and various input data (light detection and ranging (LIDAR) point cloud data, camera images, etc.) to generate determinative outputs, such as object detection/classification. DNNs work well for object detection/classification, but they are computationally intensive and thus may require substantial hardware or processing resources. In addition, DNNs are not ideally suited for object tracking because they are relatively slow, which could decrease performance or increase costs due to the implementation of additional hardware resources (e.g., more expensive processing units). Accordingly, while these conventional systems do work well for their intended purpose, there exists an opportunity for improvement in the relevant art.
According to one example aspect of the invention, an object detection and tracking system for an autonomous driving feature of a vehicle is presented. In one exemplary implementation, the system comprises: a light detection and ranging (LIDAR) system configured to capture LIDAR point cloud data external to the vehicle, and a controller configured to: access a deep neural network (DNN) trained for object detection, run the DNN on the LIDAR point cloud data at a first rate to detect a first set of objects and a region of interest (ROI) comprising the first set of objects, and depth cluster the LIDAR point cloud data for the detected ROI at a second rate to detect and track a second set of objects comprising the first set of objects and any new objects that subsequently appear in a field of view of the LIDAR system, wherein the second rate is greater than the first rate, wherein the depth clustering continues until a subsequent second iteration of the DNN is run to thereby accurately detect and track the second set of objects with robustness to noise while also reducing hardware requirements corresponding to the DNN.
In some implementations, the depth clustering to detect and track the second set of objects comprises performing a procedure comprising: generating first and second lines from the LIDAR system to first and second points in the LIDAR point cloud data for the detected ROI, generating a third line connecting the first and second points, determining an angle between the first and third lines, and determining that the first and second points belong to a same object when the angle exceeds a calibrated threshold. In some implementations, the depth clustering to detect and track the second set of objects comprises performing the procedure for a plurality of pairs of points in the LIDAR point cloud data for the detected ROI.
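For illustration only, the angle test described above can be sketched as follows, assuming the angular separation between the two LIDAR beams is known; the function name and parameters are hypothetical, and the angle is computed at the farther point between its beam (the first line) and the segment connecting the two points (the third line):

```python
import math

# Hypothetical sketch of the pairwise angle test. d1 and d2 are the measured
# depths (lengths of the first and second lines from the sensor to the two
# points), and alpha is the assumed angular separation between the two beams.
def points_on_same_object(d1: float, d2: float, alpha: float,
                          angle_threshold: float) -> bool:
    # Measure the angle at the endpoint of the longer beam, between that beam
    # (the "first line") and the segment connecting the two points (the
    # "third line").
    d_far, d_near = max(d1, d2), min(d1, d2)
    beta = math.atan2(d_near * math.sin(alpha),
                      d_far - d_near * math.cos(alpha))
    # A large angle indicates a smooth depth transition (same surface), while
    # a small angle indicates a depth discontinuity (an object boundary).
    return beta > angle_threshold
```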
In some implementations, the controller is further configured to run the DNN again at the first rate to detect a third set of objects and a new or updated ROI comprising the third set of objects. In some implementations, the controller is further configured to associate the second and third sets of objects to synchronize the DNN and depth clustering procedures and obtain a fourth set of objects. In some implementations, the controller is further configured to restart the depth clustering for the new or updated ROI at the second rate to detect and track a fifth set of objects comprising the fourth set of objects and any new objects that subsequently appear in the field of view of the LIDAR system.
In some implementations, the first and second rates are calibrated based on a set of vehicle parameters that affect how aggressively object detection and tracking should be performed. In some implementations, the set of vehicle parameters comprises vehicle speed. In some implementations, the set of vehicle parameters comprises the field of view of the LIDAR system.
According to another example aspect of the invention, an object detection and tracking method for a vehicle is presented. In one exemplary implementation, the method comprises: accessing, by a controller of the vehicle, a DNN trained for object detection, receiving, by the controller and from a LIDAR system of the vehicle, LIDAR point cloud data external to the vehicle, running, by the controller, the DNN on the LIDAR point cloud data at a first rate to detect a first set of objects and a region of interest (ROI) comprising the first set of objects, and depth clustering, by the controller, the LIDAR point cloud data for the detected ROI at a second rate to detect and track a second set of objects comprising the first set of objects and any objects that subsequently appear in a field of view of the LIDAR system, wherein the second rate is greater than the first rate, wherein the depth clustering continues until a subsequent second iteration of the DNN is run to thereby accurately detect and track the second set of objects with robustness to noise while also reducing hardware requirements corresponding to the DNN.
In some implementations, the depth clustering to detect and track the second set of objects comprises performing, by the controller, a procedure comprising: generating first and second lines from the LIDAR system to first and second points in the LIDAR point cloud data for the detected ROI, generating a third line connecting the first and second points, determining an angle between the first and third lines, and determining that the first and second points belong to a same object when the angle exceeds a calibrated threshold. In some implementations, the depth clustering to detect and track the second set of objects comprises performing, by the controller, the procedure for a plurality of pairs of points in the LIDAR point cloud data for the detected ROI.
In some implementations, the method further comprises running, by the controller, the DNN again at the first rate to detect a third set of objects and a new or updated ROI comprising the third set of objects. In some implementations, the method further comprises associating, by the controller, the second and third sets of objects to synchronize the DNN and depth clustering procedures and obtain a fourth set of objects. In some implementations, the method further comprises restarting, by the controller, the depth clustering for the new or updated ROI at the second rate to detect and track a fifth set of objects comprising the fourth set of objects and any new objects that subsequently appear in the field of view of the LIDAR system.
In some implementations, the first and second rates are calibrated based on a set of vehicle parameters that affect how aggressively object detection and tracking should be performed. In some implementations, the set of vehicle parameters comprises vehicle speed. In some implementations, the set of vehicle parameters comprises the field of view of the LIDAR system.
Further areas of applicability of the teachings of the present disclosure will become apparent from the detailed description, claims and the drawings provided hereinafter, wherein like reference numerals refer to like features throughout the several views of the drawings. It should be understood that the detailed description, including disclosed embodiments and drawings referenced therein, is merely exemplary in nature, intended for purposes of illustration only, and is not intended to limit the scope of the present disclosure, its application or uses. Thus, variations that do not depart from the gist of the present disclosure are intended to be within the scope of the present disclosure.
As discussed above, there exists an opportunity for improvement in the art of vehicle object detection and tracking. Accordingly, ADAS and autonomous driving systems and methods having improved object detection and tracking performance are presented. For simplicity, the term “autonomous” will hereinafter be used, but it will be appreciated that this encompasses both fully-autonomous (L3, L4, etc.) and semi-autonomous (e.g., ADAS) features (adaptive cruise control, lane centering, collision avoidance, etc.). The techniques of the present disclosure utilize a DNN trained for object detection to detect a region of interest (ROI) in a LIDAR point cloud, where the detected ROI comprises one or more detected objects. The DNN could be run at a first rate because it is computationally intensive. The output of the DNN (the detected ROI and the one or more detected objects) is then fused with depth clustering, which is run on the output of the DNN at a second (e.g., faster) rate to track the object(s) and to also detect and track any new objects that subsequently appear in the field of view.
Once the DNN is run again, an association is performed between the objects detected by the DNN on its subsequent run and the objects that were being detected and tracked during the depth clustering. This effectively synchronizes the DNN and the depth clustering procedures. After this, the depth clustering restarts with all of the associated or verified detected objects. The first and second rates could also be calibrated based on vehicle operating parameters (vehicle speed, LIDAR field of view, etc.). The potential benefits include accurate, noise-robust object detection and tracking via the depth clustering, along with accurate object detection at reduced hardware or processing requirements for the DNN.
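Purely as a non-limiting sketch of this fused pipeline, the two rates could be coordinated as shown below; all function names here are hypothetical placeholders for the DNN inference, depth clustering, and association stages described above, and the rates are expressed as a frame ratio:

```python
from typing import Callable, Iterable

# Hypothetical sketch of the fused detection/tracking loop. The DNN, depth
# clustering, and association stages are passed in as callables, since their
# internals are outside the scope of this sketch.
def detect_and_track(frames: Iterable[object],
                     run_dnn: Callable,        # slow, accurate detector
                     depth_cluster: Callable,  # fast per-frame tracker
                     associate: Callable,      # duplicate resolution
                     dnn_period: int = 10):
    tracked, roi = [], None
    for idx, cloud in enumerate(frames):
        if idx % dnn_period == 0:
            # First (slower) rate: the DNN detects objects and the ROI.
            detected, roi = run_dnn(cloud)
            # Synchronize the DNN detections with the objects the depth
            # clustering has been tracking since the last DNN run.
            tracked = associate(tracked, detected)
        # Second (faster) rate: depth clustering runs every frame on the ROI,
        # tracking known objects and picking up new ones in the field of view.
        tracked = depth_cluster(cloud, roi, tracked)
    return tracked
```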
Referring now to FIG. 1, the autonomous driving system of the present disclosure generally comprises the controller 112, a LIDAR system 120, and one or more other sensor systems 124 (vehicle speed sensor, RADAR, camera system, etc.). The LIDAR system 120 is configured to emit light pulses and capture reflected light pulses that collectively form a LIDAR point cloud, with each point in the LIDAR point cloud having corresponding depth information (e.g., based on a time of flight of the reflected light pulse). An external or separate calibration system 114 could also be implemented to train the DNN and upload it to the controller 112. The controller 112 is also configured to perform at least a portion of the object detection and tracking techniques of the present disclosure, which will now be discussed in greater detail.
Referring now to FIG. 2, the depth clustering 224 runs at a second rate that is, in most cases, faster than the first rate at which the DNN 208 runs. This is because the depth clustering 224 is very fast and does not require processing or hardware resources as substantial as those of the DNN 208. Once the DNN 208 runs again and association/synchronization occurs, the depth clustering 224 will then restart using the updated outputs (ROI and object(s)). The first and second rates are calibratable and could vary based on various vehicle operating parameters indicative of how aggressively object detection and tracking should be performed. Non-limiting examples of these parameter(s) include vehicle speed and a field of view of the LIDAR system 120. For example, at higher vehicle speeds, the first and/or second rates may be higher.
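As one hypothetical calibration, assuming baseline rates expressed in Hz and a simple linear scaling with vehicle speed (the specific numbers are illustrative only and are not taken from the present disclosure):

```python
# Hypothetical calibration sketch: scale the DNN and depth clustering rates
# with vehicle speed, so detection and tracking are more aggressive when the
# vehicle is moving faster.
def calibrated_rates(speed_mps: float) -> tuple[float, float]:
    base_dnn_hz, base_cluster_hz = 1.0, 10.0  # assumed baseline rates
    # Example scaling: up to 2x more aggressive by highway speeds (~30 m/s).
    scale = 1.0 + min(speed_mps / 30.0, 1.0)
    return base_dnn_hz * scale, base_cluster_hz * scale
```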
Referring now to FIG. 3, the depth clustering procedure is illustrated for example pairs of points in the LIDAR point cloud data for the detected ROI. For each pair of points, first and second lines are generated from the LIDAR system to the two points, a third line is generated connecting the two points, and the angle between the first and third lines is determined. Pairs of points for which this angle exceeds the calibratable threshold are determined to belong to the same object. A fourth pair of points 320, however, does not meet this criterion (i.e., angle β is not greater than the calibratable threshold), and thus this pair of points is determined to not be of the same object, which can also be visually seen by the different circular regions 304, 308. By performing this depth clustering across all or a majority of the ROI of the LIDAR point cloud data, and by knowing the object(s) detected by the DNN in the first place, the object(s) can be quickly identified and accurately tracked over time, while also providing a high level of robustness to noise.
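For illustration only, applying the pairwise test across neighboring points of a single scan line within the ROI could be sketched as follows, reusing the hypothetical points_on_same_object test from the earlier sketch; the clustering of a full ROI would repeat this over all scan lines:

```python
# Hypothetical sketch: group neighboring returns of one scan line into
# clusters, assuming "depths" holds the ordered per-beam ranges and "alpha"
# is the beam-to-beam angular step.
def cluster_scan_line(depths: list[float], alpha: float,
                      angle_threshold: float) -> list[list[int]]:
    if not depths:
        return []
    clusters, current = [], [0]
    for i in range(1, len(depths)):
        if points_on_same_object(depths[i - 1], depths[i], alpha,
                                 angle_threshold):
            current.append(i)          # same object: extend the cluster
        else:
            clusters.append(current)   # depth discontinuity: start a new one
            current = [i]
    clusters.append(current)
    return clusters
```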
Referring now to FIG. 4, a flow diagram of an example object detection and tracking method 400 is illustrated. After the DNN initially runs at 412 to detect the object(s) and the ROI and depth clustering begins at 416, the controller 112 determines whether one or more exit conditions are present. When true, the method 400 ends or returns and restarts. Otherwise, the controller 112 determines, at 424, whether it is time to run the DNN again according to the first rate. When false, the method 400 returns to 416 and depth clustering based object detection and tracking continues at the second rate. Otherwise, the method 400 proceeds to 428, where the DNN runs again to detect a new or updated ROI and object(s) therein. At 432, an association between the objects detected by the DNN during this subsequent run and the objects that were previously being detected and tracked by depth clustering is performed. This effectively synchronizes the DNN and depth clustering procedures. The method 400 then returns to 416, where depth clustering restarts using the updated detected objects and ROI at the second rate, and the process continues until the exit condition(s) are present.
For example only, assume that the DNN initially detects 5 objects at 412. Depth clustering is then performed at 416 to begin detecting and tracking these 5 objects across subsequent LIDAR point cloud frames. Additionally, the depth clustering at 416 detects and tracks objects outside of or other than the 5 detected objects. These could be, for example, new object(s) that subsequently show up in the field of view. For example only, there could be 2 new objects detected, and thus the depth clustering at 416 could be detecting and tracking 7 total objects. Once it is time for the DNN to run again at 424-428, the DNN runs independently and now detects 9 objects in a new or updated ROI. At this point, object association needs to be performed because there are duplicates between the 7 objects previously being detected and tracked by the depth clustering at 416 and the 9 objects independently detected by the DNN. After the association, for example, 7 of the 9 DNN objects could be determined to be the same 7 objects that the depth clustering was detecting and tracking at 416. This association could be as simple as, for example, using a distance between two objects (i.e., if they are close enough, or the distance between them is below a threshold, then they can be treated as the same object). The process then continues with the depth clustering detecting and tracking the 9 objects at 416.
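A minimal sketch of such a distance-based association, assuming objects are represented by 2D center coordinates and using a calibrated distance threshold (all names here are hypothetical):

```python
import math

# Hypothetical sketch of the simple distance-based association described
# above: a tracked object and a DNN detection are treated as the same object
# when their centers are within a calibrated distance threshold.
def associate(tracked: list[tuple[float, float]],
              detected: list[tuple[float, float]],
              max_dist: float) -> list[tuple[float, float]]:
    merged = list(detected)
    for tx, ty in tracked:
        # Keep a previously tracked object only if no DNN detection is close
        # enough to be a duplicate of it.
        if all(math.hypot(tx - dx, ty - dy) > max_dist
               for dx, dy in detected):
            merged.append((tx, ty))
    return merged
```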
It will be appreciated that the term “controller” as used herein refers to any suitable control device or set of multiple control devices that is/are configured to perform at least a portion of the techniques of the present disclosure. Non-limiting examples include an application-specific integrated circuit (ASIC), one or more processors and a non-transitory memory having instructions stored thereon that, when executed by the one or more processors, cause the controller to perform a set of operations corresponding to at least a portion of the techniques of the present disclosure. The one or more processors could be either a single processor or two or more processors operating in a parallel or distributed architecture.
It should also be understood that the mixing and matching of features, elements, methodologies and/or functions between various examples may be expressly contemplated herein so that one skilled in the art would appreciate from the present teachings that features, elements and/or functions of one example may be incorporated into another example as appropriate, unless described otherwise above.