END-TO-END PROCESSING IN AUTOMATED DRIVING SYSTEMS

Information

  • Patent Application
  • Publication Number
    20230294687
  • Date Filed
    February 14, 2023
  • Date Published
    September 21, 2023
Abstract
The described aspects and implementations enable efficient object detection and tracking. In one implementation, disclosed are a method and a system to perform the method, the system including a sensing system configured to obtain sensing data characterizing an environment of a vehicle. The system further includes a data processing system operatively coupled to the sensing system and configured to process the sensing data using a first (second) set of neural network (NN) layers to obtain a first (second) set of features for a first (second) region of the environment, the first (second) set of features being associated with a first (second) spatial resolution. The data processing system is further to process the two sets of features using a second set of NN layers to detect a location of object(s) in the environment of the vehicle and a state of motion of the object(s).
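The two-stage processing summarized above can be illustrated with a toy sketch. Everything here is hypothetical: the grid sizes, the feature dimension, and the plain NumPy pooling/projections standing in for the actual neural network layers, none of which are specified by the application.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions; the application does not fix any of these.
NEAR_RES = (64, 64)   # fine grid for the first (e.g., nearby) region
FAR_RES = (16, 16)    # coarse grid for the second (e.g., distant) region
FEATURE_DIM = 8

def first_stage(sensing_grid, out_res, weights):
    """Pool a sensing grid down to out_res and project each cell to
    FEATURE_DIM channels -- a stand-in for the first set of NN layers."""
    h, w = sensing_grid.shape
    fh, fw = h // out_res[0], w // out_res[1]
    pooled = sensing_grid.reshape(out_res[0], fh, out_res[1], fw).mean(axis=(1, 3))
    return pooled[..., None] * weights  # shape (H, W, FEATURE_DIM)

def second_stage(near_feats, far_feats):
    """Fuse both feature sets and score each cell for object presence --
    a stand-in for the second set of NN layers."""
    far_up = far_feats.repeat(4, axis=0).repeat(4, axis=1)  # to near grid
    fused = near_feats + far_up
    return fused.mean(axis=-1)  # per-cell objectness score

sensing = rng.random((256, 256))          # toy sensing grid (e.g., lidar BEV)
w = rng.random(FEATURE_DIM)
near = first_stage(sensing, NEAR_RES, w)  # first spatial resolution
far = first_stage(sensing, FAR_RES, w)    # second (coarser) resolution
scores = second_stage(near, far)
print(scores.shape)  # (64, 64)
```

The point of the sketch is the data flow, not the math: one set of layers emits features at two spatial resolutions for two regions, and a second set of layers consumes both to localize objects.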
Claims
  • 1. A method comprising: obtaining, by a processing device, input data characterizing an environment of a vehicle, wherein the input data comprises at least one of lidar sensing data, radar sensing data, or camera sensing data; processing, by the processing device, the input data using a first set of neural network layers to obtain: a first set of features for a first region of the environment, wherein the first set of features is associated with a first spatial resolution, and a second set of features for at least a second region of the environment, wherein the second set of features is associated with a second spatial resolution; and processing, by the processing device, the first set of features and the second set of features using a second set of neural network layers to identify one or more objects in the environment of the vehicle.
  • 2. The method of claim 1, wherein the input data comprises a three-dimensional (3D) set of voxels, wherein each voxel of at least a subset of the 3D set of voxels comprises a distance to a portion of the environment represented by a respective voxel.
  • 3. The method of claim 2, wherein obtaining the input data comprises: preprocessing the camera sensing data using a lifting transform, wherein the lifting transform converts a two-dimensional (2D) set of pixels into the 3D set of voxels.
  • 4. The method of claim 2, wherein the 3D set of voxels comprises: a first portion of voxels having the first spatial resolution and depicting the first region of the environment, a second portion of voxels having the second spatial resolution and depicting the second region of the environment, and a third portion of voxels associated with a boundary between the first region and the second region and comprising voxels interpolated between voxels of the first portion and voxels of the second portion.
  • 5. The method of claim 1, wherein the input data further comprises roadgraph data that maps a drivable portion of the environment of the vehicle.
  • 6. The method of claim 5, wherein processing the first set of features and the second set of features is further to update the roadgraph data with a current state of the drivable portion of the environment of the vehicle.
  • 7. The method of claim 6, wherein the current state of the drivable portion of the environment of the vehicle comprises a status of one or more traffic lights in the environment of the vehicle.
  • 8. The method of claim 1, wherein the second set of neural network layers comprises a common backbone and a plurality of classifier heads receiving inputs generated by the common backbone, wherein the plurality of classifier heads comprises one or more of: a segmentation head, an occupancy head, a traffic flow head, an object occlusion head, or a roadgraph head.
  • 9. The method of claim 1, wherein processing the first set of features and the second set of features using the second set of neural network layers is further to identify a state of motion of a first object of the one or more objects.
  • 10. The method of claim 9, wherein the state of motion of the first object is identified for a plurality of times, the method further comprising: predicting, using at least the state of motion of the first object for the plurality of times, a trajectory of the first object.
  • 11. The method of claim 1, further comprising: causing a driving path of the vehicle to be modified in view of the identified one or more objects.
  • 12. A system comprising: a sensing system of a vehicle, the sensing system configured to: obtain input data characterizing an environment of the vehicle, wherein the input data comprises at least one of lidar sensing data, radar sensing data, or camera sensing data; and a perception system of the vehicle, the perception system configured to: process the input data using a first set of neural network layers to obtain: a first set of features for a first region of the environment, wherein the first set of features is associated with a first spatial resolution, and a second set of features for a second region of the environment, wherein the second set of features is associated with a second spatial resolution; and process the first set of features and the second set of features using a second set of neural network layers to identify one or more objects in the environment of the vehicle.
  • 13. The system of claim 12, wherein the input data comprises a three-dimensional (3D) set of voxels, wherein each voxel of at least a subset of the 3D set of voxels comprises a distance to a portion of the environment represented by a respective voxel.
  • 14. The system of claim 13, wherein to obtain the input data, the sensing system is to preprocess the camera sensing data using a lifting transform, wherein the lifting transform converts a two-dimensional (2D) set of pixels into the 3D set of voxels.
  • 15. The system of claim 12, wherein the input data further comprises roadgraph data that maps a drivable portion of the environment of the vehicle, and wherein the second set of neural network layers is further to update the roadgraph data with a current state of the drivable portion of the environment of the vehicle.
  • 16. The system of claim 15, wherein the current state of the drivable portion of the environment of the vehicle comprises a status of one or more traffic lights in the environment of the vehicle.
  • 17. The system of claim 12, wherein the second set of neural network layers comprises a common backbone and a plurality of classifier heads receiving inputs generated by the common backbone, wherein the plurality of classifier heads comprises one or more of: a segmentation head, an occupancy head, a traffic flow head, an object occlusion head, or a roadgraph head.
  • 18. The system of claim 12, wherein the second set of neural network layers is further to identify a state of motion of a first object of the one or more objects for a plurality of times, and wherein the perception system is further to: predict, using at least the state of motion of the first object for the plurality of times, a trajectory of the first object.
  • 19. The system of claim 12, wherein the perception system is further to: cause a driving path of the vehicle to be modified in view of the identified one or more objects.
  • 20. A non-transitory computer-readable storage medium storing instructions that, when executed by a processing device, cause the processing device to: obtain input data characterizing an environment of a vehicle, wherein the input data comprises at least one of lidar sensing data, radar sensing data, or camera sensing data; process the input data using a first set of neural network layers to obtain: a first set of features for a first region of the environment, wherein the first set of features is associated with a first spatial resolution, and a second set of features for at least a second region of the environment, wherein the second set of features is associated with a second spatial resolution; and process the first set of features and the second set of features using a second set of neural network layers to identify one or more objects in the environment of the vehicle.
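Claims 3 and 14 mention a lifting transform that converts a 2D set of pixels into a 3D set of voxels. A minimal back-projection sketch, assuming per-pixel depth estimates and known camera intrinsics; the claims specify neither the depth source nor any grid dimensions, so all sizes below are placeholders.

```python
import numpy as np

# Hypothetical voxel grid; the claims do not fix its dimensions.
VOXEL_GRID = (8, 8, 8)   # (x, y, z) voxel counts
CELL = 1.0               # voxel edge length, in arbitrary units

def lift(pixels_uv, depths, intrinsics_inv):
    """Back-project pixels (u, v) with depth estimates into 3D points,
    then bin the points into a voxel grid, counting hits per voxel."""
    uv1 = np.concatenate([pixels_uv, np.ones((len(pixels_uv), 1))], axis=1)
    points = (intrinsics_inv @ uv1.T).T * depths[:, None]  # camera-frame XYZ
    idx = np.clip((points / CELL).astype(int), 0, np.array(VOXEL_GRID) - 1)
    voxels = np.zeros(VOXEL_GRID)
    np.add.at(voxels, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)  # unbuffered add
    return voxels

# Toy example: identity intrinsics and three pixels (two share a voxel).
K_inv = np.eye(3)
uv = np.array([[1.0, 2.0], [1.0, 2.0], [3.0, 0.5]])
d = np.array([2.0, 2.0, 4.0])
vox = lift(uv, d, K_inv)
print(vox.sum())  # 3.0 -- every pixel landed in some voxel
```

Clipping out-of-range points to the grid edge is only a convenience for the sketch; a real pipeline would more likely discard them.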
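Claims 8 and 17 describe a common backbone feeding a plurality of classifier heads. The sketch below uses random projections as stand-ins for trained layers; the head names follow the claims, while every size is a hypothetical placeholder.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sizes; the claims do not specify any architecture details.
GRID = (32, 32)
BACKBONE_DIM = 16

def backbone(features):
    """Common backbone: one shared projection over the fused features."""
    w = rng.random((features.shape[-1], BACKBONE_DIM))
    return np.tanh(features @ w)

def make_head(out_dim):
    """Each classifier head is its own projection on the shared output."""
    w = rng.random((BACKBONE_DIM, out_dim))
    return lambda shared: shared @ w

fused_features = rng.random(GRID + (8,))   # output of the earlier layers
shared = backbone(fused_features)          # computed once, reused by all heads

heads = {
    "segmentation": make_head(5),   # e.g., 5 semantic classes
    "occupancy": make_head(1),
    "traffic_flow": make_head(2),   # e.g., a 2D flow vector per cell
    "occlusion": make_head(1),
    "roadgraph": make_head(1),
}
outputs = {name: head(shared) for name, head in heads.items()}
print({name: out.shape for name, out in outputs.items()})
```

The design point the claims capture is that the expensive shared computation runs once, and each per-task head is a comparatively cheap branch on top of it.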
Provisional Applications (1)
  • Number
    63310457
  • Date
    Feb 2022
  • Country
    US