Vehicles are becoming more intelligent as the industry moves towards deploying increasingly sophisticated self-driving technologies capable of operating a vehicle with little or no human input, making the vehicle semi-autonomous or autonomous. Autonomous and semi-autonomous vehicles may be able to detect information about their location and surroundings (e.g., using ultrasound, radar, lidar, an SPS (Satellite Positioning System), an odometer, and/or one or more sensors such as accelerometers, cameras, etc.). Autonomous and semi-autonomous vehicles typically include a control system that interprets information regarding the environment in which the vehicle is disposed to identify hazards and determine a navigation path to follow. The designs of autonomous vehicles may utilize industry standards for guidance on the verification and validation measures required to achieve the Safety Of The Intended Functionality (SOTIF). SOTIF is generally defined as the absence of unreasonable risk due to hazards resulting from functional insufficiencies of the intended functionality, or from reasonably foreseeable misuse by persons. Industry standards, such as the International Organization for Standardization (ISO) 21448 standard, may provide additional requirements for implementing SOTIF in autonomous and semi-autonomous vehicles.
An example method for generating object representations with multiple signal paths according to the disclosure includes obtaining image information from at least one camera module disposed on a vehicle, obtaining target information from at least one radar module disposed on the vehicle, generating a first detection representation with a first signal path based on the image information and the target information, generating a second detection representation with a second signal path based on the image information and the target information, wherein the second signal path is different than the first signal path, and outputting the first detection representation and the second detection representation.
An example apparatus according to the disclosure includes at least one memory, at least one camera module, at least one radar module, at least one processor communicatively coupled to the at least one memory, the at least one camera module, and the at least one radar module, and configured to: obtain image information from the at least one camera module disposed on a vehicle, obtain target information from the at least one radar module disposed on the vehicle, generate a first detection representation with a first signal path based on the image information and the target information, generate a second detection representation with a second signal path based on the image information and the target information, wherein the second signal path is different than the first signal path, and output the first detection representation and the second detection representation.
Items and/or techniques described herein may provide one or more of the following capabilities, as well as other capabilities not mentioned. Multiple sensors, such as cameras, radar and lidar, may obtain target information for objects proximate to an autonomous or semi-autonomous vehicle. The sensor inputs may be evaluated via different signal paths. Machine learning models may be implemented along the different signal paths. Parametric and non-parametric representations of object data may be generated. The fusion of signals from different sensors may improve the sensitivity of object detection. The multiple signal paths may improve the robustness of object detection and corresponding environment models. SOTIF standards may be realized. Other capabilities may be provided and not every implementation according to the disclosure must provide any, let alone all, of the capabilities discussed.
Techniques are discussed herein for detecting objects proximate to a vehicle with multiple signal paths. Constructing robust environmental models is an important aspect for automated driving systems. Industry standards may require some level of redundancy to achieve SOTIF requirements. In an example, sensor-based redundancies may be implemented to reduce the impact of sensor failures. Redundancy may also be realized via the implementation of different signal paths. The signals received from various sensors, such as cameras, radar modules and lidar modules, may be processed jointly and fused in separate signal paths. In an example, a first signal path may be configured to generate parametric representations of objects based on a fusion of camera and radar inputs, and a second signal path may be configured to generate non-parametric representations of objects based on the camera and radar inputs. Other sensor inputs may also be used to generate the parametric and non-parametric representations of the objects. For example, various combinations of image, radar, and lidar signals may be fused to generate the representations. Machine learning models may be implemented to generate the representations of the detected objects. Different signal paths may be configured to use different backbones in the machine learning models. In an example, a common backbone may be utilized and separate training for each head may be enforced for handling redundancy. Other techniques, however, may be used.
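For illustration only, the following minimal sketch shows one way the parametric and non-parametric representations described above might be modeled in software. The class name, field names, and grid dimensions are assumptions made for the example and are not taken from the disclosure.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class ParametricDetection:
    """A parametric object representation: explicit coordinates and dimensions.

    Hypothetical fields chosen to mirror the coordinate and dimension
    information described for the first signal path.
    """
    x_m: float            # longitudinal position relative to the ego vehicle (meters)
    y_m: float            # lateral position (meters)
    length_m: float       # estimated object length (meters)
    width_m: float        # estimated object width (meters)
    velocity_mps: float   # estimated speed (meters per second)
    class_label: str      # e.g., "vehicle", "pedestrian"


def make_empty_occupancy_map(rows: int = 200, cols: int = 200) -> np.ndarray:
    """A non-parametric representation: a grid of per-cell occupancy probabilities.

    Each cell holds a floating-point probability of being occupied; 0.5 marks
    an unobserved cell (maximum uncertainty).
    """
    return np.full((rows, cols), 0.5, dtype=np.float32)


if __name__ == "__main__":
    detection = ParametricDetection(x_m=12.4, y_m=-1.8, length_m=4.5, width_m=1.9,
                                    velocity_mps=8.3, class_label="vehicle")
    occupancy = make_empty_occupancy_map()
    print(detection)
    print(occupancy.shape, float(occupancy[0, 0]))
```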
Particular aspects of the subject matter described in this disclosure may be implemented to realize one or more of the following potential advantages. Redundancy requirements for SOTIF standards may be realized. Object detection performance based on the fusion of multiple sensors may be maintained, and the effectiveness of perception modules on a vehicle may be increased as compared to single sensor object detection techniques. The fusion of object detection results from different types of sensors may enable the detection of smaller objects, or other objects outside of the training of image processing models. Robust environment models may be generated based on the improved object detection and the redundant signal paths. Other advantages may also be realized.
Referring to
Collectively, and under the control of the ECU 140, the various sensors 121-124 may be used to provide a variety of different types of driver assistance functionalities. For example, the sensors 121-124 and the ECU 140 may provide blind spot monitoring, adaptive cruise control, collision prevention assistance, lane departure protection, and/or rear collision mitigation.
The CAN bus 150 may be treated by the ECU 140 as a sensor that provides ego vehicle parameters to the ECU 140. Similarly, a GPS module may be connected to the ECU 140 as a sensor, providing geolocation parameters to the ECU 140.
Referring also to
The configuration of the device 200 shown in
The device 200 may comprise the modem processor 232 that may be capable of performing baseband processing of signals received and down-converted by the transceiver 215 and/or the SPS receiver 217. The modem processor 232 may perform baseband processing of signals to be upconverted for transmission by the transceiver 215. Also or alternatively, baseband processing may be performed by the general-purpose/application processor 230 and/or the DSP 231. Other configurations, however, may be used to perform baseband processing.
The device 200 may include the sensor(s) 213 that may include, for example, one or more of various types of sensors such as one or more inertial sensors, one or more magnetometers, one or more environment sensors, one or more optical sensors, one or more weight sensors, and/or one or more radio frequency (RF) sensors, etc. An inertial measurement unit (IMU) may comprise, for example, one or more accelerometers (e.g., collectively responding to acceleration of the device 200 in three dimensions) and/or one or more gyroscopes (e.g., three-dimensional gyroscope(s)). The sensor(s) 213 may include one or more magnetometers (e.g., three-dimensional magnetometer(s)) to determine orientation (e.g., relative to magnetic north and/or true north) that may be used for any of a variety of purposes, e.g., to support one or more compass applications. The environment sensor(s) may comprise, for example, one or more temperature sensors, one or more barometric pressure sensors, one or more ambient light sensors, one or more camera imagers, and/or one or more microphones, etc. The sensor(s) 213 may generate analog and/or digital signals indications of which may be stored in the memory 211 and processed by the DSP 231 and/or the general-purpose/application processor 230 in support of one or more applications such as, for example, applications directed to positioning and/or navigation operations.
The sensor(s) 213 may be used in relative location measurements, relative location determination, motion determination, etc. Information detected by the sensor(s) 213 may be used for motion detection, relative displacement, dead reckoning, sensor-based location determination, and/or sensor-assisted location determination. The sensor(s) 213 may be useful to determine whether the device 200 is fixed (stationary) or mobile and/or whether to report certain useful information, e.g., to an LMF (Location Management Function) regarding the mobility of the device 200. For example, based on the information obtained/measured by the sensor(s) 213, the device 200 may notify/report to the LMF that the device 200 has detected movements or that the device 200 has moved, and may report the relative displacement/distance (e.g., via dead reckoning, or sensor-based location determination, or sensor-assisted location determination enabled by the sensor(s) 213). In another example, for relative positioning information, the sensors/IMU may be used to determine the angle and/or orientation of another object (e.g., another device) with respect to the device 200, etc.
The IMU may be configured to provide measurements about a direction of motion and/or a speed of motion of the device 200, which may be used in relative location determination. For example, one or more accelerometers and/or one or more gyroscopes of the IMU may detect, respectively, a linear acceleration and a speed of rotation of the device 200. The linear acceleration and speed of rotation measurements of the device 200 may be integrated over time to determine an instantaneous direction of motion as well as a displacement of the device 200. The instantaneous direction of motion and the displacement may be integrated to track a location of the device 200. For example, a reference location of the device 200 may be determined, e.g., using the SPS receiver 217 (and/or by some other means) for a moment in time and measurements from the accelerometer(s) and gyroscope(s) taken after this moment in time may be used in dead reckoning to determine present location of the device 200 based on movement (direction and distance) of the device 200 relative to the reference location.
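As a rough illustration of the dead-reckoning computation described above, the sketch below integrates hypothetical accelerometer and gyroscope samples to propagate a two-dimensional position from a known reference fix. The planar simplification, sample values, and time step are assumptions for the example only.

```python
import math


def dead_reckon_2d(reference_xy, heading_rad, samples, dt=0.01):
    """Propagate a 2D position from a reference fix using IMU samples.

    samples: iterable of (forward_accel_mps2, yaw_rate_radps) pairs.
    Integrates yaw rate to heading and acceleration to speed, then accumulates
    displacement along the current heading to track the present location.
    """
    x, y = reference_xy
    speed = 0.0
    for accel, yaw_rate in samples:
        heading_rad += yaw_rate * dt              # rotation rate -> heading
        speed += accel * dt                       # acceleration -> speed
        x += speed * math.cos(heading_rad) * dt   # displacement along heading
        y += speed * math.sin(heading_rad) * dt
    return (x, y), heading_rad


# Example: accelerate gently while turning slightly left for one second.
samples = [(0.5, 0.02)] * 100
position, heading = dead_reckon_2d((0.0, 0.0), 0.0, samples)
print(position, heading)
```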
The magnetometer(s) may determine magnetic field strengths in different directions which may be used to determine orientation of the device 200. For example, the orientation may be used to provide a digital compass for the device 200. The magnetometer(s) may include a two-dimensional magnetometer configured to detect and provide indications of magnetic field strength in two orthogonal dimensions. The magnetometer(s) may include a three-dimensional magnetometer configured to detect and provide indications of magnetic field strength in three orthogonal dimensions. The magnetometer(s) may provide means for sensing a magnetic field and providing indications of the magnetic field, e.g., to the processor 210.
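For a two-dimensional magnetometer, the orientation determination mentioned above can be illustrated with a simple heading computation. The axis convention and the declination correction below are assumptions made for the sketch, not requirements of the disclosure.

```python
import math


def compass_heading_deg(mag_x, mag_y, declination_deg=0.0):
    """Heading in degrees, clockwise from magnetic north, from a 2D magnetometer.

    Assumes mag_x is the field component along the device's forward axis and
    mag_y the component along its right axis; declination_deg optionally
    corrects the result toward true north.
    """
    heading = math.degrees(math.atan2(mag_y, mag_x))
    return (heading + declination_deg) % 360.0


print(compass_heading_deg(0.0, 30.0))   # field to the right of forward -> ~90 degrees
print(compass_heading_deg(30.0, 0.0))   # field along the forward axis -> ~0 degrees
```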
The transceiver 215 may include a wireless transceiver 240 and a wired transceiver 250 configured to communicate with other devices through wireless connections and wired connections, respectively. For example, the wireless transceiver 240 may include a wireless transmitter 242 and a wireless receiver 244 coupled to an antenna 246 for transmitting (e.g., on one or more uplink channels and/or one or more sidelink channels) and/or receiving (e.g., on one or more downlink channels and/or one or more sidelink channels) wireless signals 248 and transducing signals from the wireless signals 248 to guided (e.g., wired electrical and/or optical) signals and from guided (e.g., wired electrical and/or optical) signals to the wireless signals 248. The wireless transmitter 242 includes appropriate components (e.g., a power amplifier and a digital-to-analog converter). The wireless receiver 244 includes appropriate components (e.g., one or more amplifiers, one or more frequency filters, and an analog-to-digital converter). The wireless transmitter 242 may include multiple transmitters that may be discrete components or combined/integrated components, and/or the wireless receiver 244 may include multiple receivers that may be discrete components or combined/integrated components. The wireless transceiver 240 may be configured to communicate signals (e.g., with TRPs and/or one or more other devices) according to a variety of radio access technologies (RATs) such as 5G New Radio (NR), GSM (Global System for Mobiles), UMTS (Universal Mobile Telecommunications System), AMPS (Advanced Mobile Phone System), CDMA (Code Division Multiple Access), WCDMA (Wideband CDMA), LTE (Long Term Evolution), LTE Direct (LTE-D), 3GPP LTE-V2X (PC5), IEEE 802.11 (including IEEE 802.11p), WiFi® short-range wireless communication technology, WiFi® Direct (WiFi-D), Bluetooth® short-range wireless communication technology, Zigbee® short-range wireless communication technology, etc. New Radio may use mm-wave frequencies and/or sub-6 GHZ frequencies. The wired transceiver 250 may include a wired transmitter 252 and a wired receiver 254 configured for wired communication, e.g., a network interface that may be utilized to communicate with an NG-RAN (Next Generation-Radio Access Network) to send communications to, and receive communications from, the NG-RAN. The wired transmitter 252 may include multiple transmitters that may be discrete components or combined/integrated components, and/or the wired receiver 254 may include multiple receivers that may be discrete components or combined/integrated components. The wired transceiver 250 may be configured, e.g., for optical communication and/or electrical communication. The transceiver 215 may be communicatively coupled to the transceiver interface 214, e.g., by optical and/or electrical connection. The transceiver interface 214 may be at least partially integrated with the transceiver 215. The wireless transmitter 242, the wireless receiver 244, and/or the antenna 246 may include multiple transmitters, multiple receivers, and/or multiple antennas, respectively, for sending and/or receiving, respectively, appropriate signals.
The user interface 216 may comprise one or more of several devices such as, for example, a speaker, microphone, display device, vibration device, keyboard, touch screen, etc. The user interface 216 may include more than one of any of these devices. The user interface 216 may be configured to enable a user to interact with one or more applications hosted by the device 200. For example, the user interface 216 may store indications of analog and/or digital signals in the memory 211 to be processed by DSP 231 and/or the general-purpose/application processor 230 in response to action from a user. Similarly, applications hosted on the device 200 may store indications of analog and/or digital signals in the memory 211 to present an output signal to a user. The user interface 216 may include an audio input/output (I/O) device comprising, for example, a speaker, a microphone, digital-to-analog circuitry, analog-to-digital circuitry, an amplifier and/or gain control circuitry (including more than one of any of these devices). Other configurations of an audio I/O device may be used. Also or alternatively, the user interface 216 may comprise one or more touch sensors responsive to touching and/or pressure, e.g., on a keyboard and/or touch screen of the user interface 216.
The SPS receiver 217 (e.g., a Global Positioning System (GPS) receiver) may be capable of receiving and acquiring SPS signals 260 via an SPS antenna 262. The SPS antenna 262 is configured to transduce the SPS signals 260 from wireless signals to guided signals, e.g., wired electrical or optical signals, and may be integrated with the antenna 246. The SPS receiver 217 may be configured to process, in whole or in part, the acquired SPS signals 260 for estimating a location of the device 200. For example, the SPS receiver 217 may be configured to determine location of the device 200 by trilateration using the SPS signals 260. The general-purpose/application processor 230, the memory 211, the DSP 231 and/or one or more specialized processors (not shown) may be utilized to process acquired SPS signals, in whole or in part, and/or to calculate an estimated location of the device 200, in conjunction with the SPS receiver 217. The memory 211 may store indications (e.g., measurements) of the SPS signals 260 and/or other signals (e.g., signals acquired from the wireless transceiver 240) for use in performing positioning operations. The general-purpose/application processor 230, the DSP 231, and/or one or more specialized processors, and/or the memory 211 may provide or support a location engine for use in processing measurements to estimate a location of the device 200.
The device 200 may include the camera 218 for capturing still or moving imagery. The camera 218 may comprise, for example, an imaging sensor (e.g., a charge coupled device or a CMOS (Complementary Metal-Oxide Semiconductor) imager), a lens, analog-to-digital circuitry, frame buffers, etc. Additional processing, conditioning, encoding, and/or compression of signals representing captured images may be performed by the general-purpose/application processor 230 and/or the DSP 231. Also or alternatively, the video processor 233 may perform conditioning, encoding, compression, and/or manipulation of signals representing captured images. The video processor 233 may decode/decompress stored image data for presentation on a display device (not shown), e.g., of the user interface 216.
The position device (PD) 219 may be configured to determine a position of the device 200, motion of the device 200, and/or relative position of the device 200, and/or time. For example, the PD 219 may communicate with, and/or include some or all of, the SPS receiver 217. The PD 219 may work in conjunction with the processor 210 and the memory 211 as appropriate to perform at least a portion of one or more positioning methods, although the description herein may refer to the PD 219 being configured to perform, or performing, in accordance with the positioning method(s). The PD 219 may also or alternatively be configured to determine location of the device 200 using terrestrial-based signals (e.g., at least some of the wireless signals 248) for trilateration, for assistance with obtaining and using the SPS signals 260, or both. The PD 219 may be configured to determine location of the device 200 based on a coverage area of a serving base station and/or another technique such as E-CID. The PD 219 may be configured to use one or more images from the camera 218 and image recognition combined with known locations of landmarks (e.g., natural landmarks such as mountains and/or artificial landmarks such as buildings, bridges, streets, etc.) to determine location of the device 200. The PD 219 may be configured to use one or more other techniques (e.g., relying on the UE's self-reported location (e.g., part of the UE's position beacon)) for determining the location of the device 200, and may use a combination of techniques (e.g., SPS and terrestrial positioning signals) to determine the location of the device 200. The PD 219 may include one or more of the sensors 213 (e.g., gyroscope(s), accelerometer(s), magnetometer(s), etc.) that may sense orientation and/or motion of the device 200 and provide indications thereof that the processor 210 (e.g., the general-purpose/application processor 230 and/or the DSP 231) may be configured to use to determine motion (e.g., a velocity vector and/or an acceleration vector) of the device 200. The PD 219 may be configured to provide indications of uncertainty and/or error in the determined position and/or motion. Functionality of the PD 219 may be provided in a variety of manners and/or configurations, e.g., by the general-purpose/application processor 230, the transceiver 215, the SPS receiver 217, and/or another component of the device 200, and may be provided by hardware, software, firmware, or various combinations thereof.
Referring also to
The description herein may refer to the processor 310 performing a function, but this includes other implementations such as where the processor 310 executes software and/or firmware. The description herein may refer to the processor 310 performing a function as shorthand for one or more of the processors contained in the processor 310 performing the function. The description herein may refer to the TRP 300 performing a function as shorthand for one or more appropriate components (e.g., the processor 310 and the memory 311) of the TRP 300 performing the function. The processor 310 may include a memory with stored instructions in addition to and/or instead of the memory 311. Functionality of the processor 310 is discussed more fully below.
The transceiver 315 may include a wireless transceiver 340 and/or a wired transceiver 350 configured to communicate with other devices through wireless connections and wired connections, respectively. For example, the wireless transceiver 340 may include a wireless transmitter 342 and a wireless receiver 344 coupled to one or more antennas 346 for transmitting (e.g., on one or more uplink channels and/or one or more downlink channels) and/or receiving (e.g., on one or more downlink channels and/or one or more uplink channels) wireless signals 348 and transducing signals from the wireless signals 348 to guided (e.g., wired electrical and/or optical) signals and from guided (e.g., wired electrical and/or optical) signals to the wireless signals 348. Thus, the wireless transmitter 342 may include multiple transmitters that may be discrete components or combined/integrated components, and/or the wireless receiver 344 may include multiple receivers that may be discrete components or combined/integrated components. The wireless transceiver 340 may be configured to communicate signals (e.g., with the device 200, one or more other UEs, and/or one or more other devices) according to a variety of radio access technologies (RATs) such as 5G New Radio (NR), GSM (Global System for Mobiles), UMTS (Universal Mobile Telecommunications System), AMPS (Advanced Mobile Phone System), CDMA (Code Division Multiple Access), WCDMA (Wideband CDMA), LTE (Long Term Evolution), LTE Direct (LTE-D), 3GPP LTE-V2X (PC5), IEEE 802.11 (including IEEE 802.11p), WiFi® short-range wireless communication technology, WiFi® Direct (WiFi®-D), Bluetooth® short-range wireless communication technology, Zigbee® short-range wireless communication technology, etc. The wired transceiver 350 may include a wired transmitter 352 and a wired receiver 354 configured for wired communication, e.g., a network interface that may be utilized to communicate with an NG-RAN to send communications to, and receive communications from, an LMF, for example, and/or one or more other network entities. The wired transmitter 352 may include multiple transmitters that may be discrete components or combined/integrated components, and/or the wired receiver 354 may include multiple receivers that may be discrete components or combined/integrated components. The wired transceiver 350 may be configured, e.g., for optical communication and/or electrical communication.
The configuration of the TRP 300 shown in
Referring also to
The transceiver 415 may include a wireless transceiver 440 and/or a wired transceiver 450 configured to communicate with other devices through wireless connections and wired connections, respectively. For example, the wireless transceiver 440 may include a wireless transmitter 442 and a wireless receiver 444 coupled to one or more antennas 446 for transmitting (e.g., on one or more downlink channels) and/or receiving (e.g., on one or more uplink channels) wireless signals 448 and transducing signals from the wireless signals 448 to guided (e.g., wired electrical and/or optical) signals and from guided (e.g., wired electrical and/or optical) signals to the wireless signals 448. Thus, the wireless transmitter 442 may include multiple transmitters that may be discrete components or combined/integrated components, and/or the wireless receiver 444 may include multiple receivers that may be discrete components or combined/integrated components. The wireless transceiver 440 may be configured to communicate signals (e.g., with the device 200, one or more other UEs, and/or one or more other devices) according to a variety of radio access technologies (RATs) such as 5G New Radio (NR), GSM (Global System for Mobiles), UMTS (Universal Mobile Telecommunications System), AMPS (Advanced Mobile Phone System), CDMA (Code Division Multiple Access), WCDMA (Wideband CDMA), LTE (Long Term Evolution), LTE Direct (LTE-D), 3GPP LTE-V2X (PC5), IEEE 802.11 (including IEEE 802.11p), WiFi® short-range wireless communication technology, WiFi® Direct (WiFi®-D), Bluetooth® short-range wireless communication technology, Zigbee® short-range wireless communication technology, etc. The wired transceiver 450 may include a wired transmitter 452 and a wired receiver 454 configured for wired communication, e.g., a network interface that may be utilized to communicate with an NG-RAN to send communications to, and receive communications from, the TRP 300, for example, and/or one or more other network entities. The wired transmitter 452 may include multiple transmitters that may be discrete components or combined/integrated components, and/or the wired receiver 454 may include multiple receivers that may be discrete components or combined/integrated components. The wired transceiver 450 may be configured, e.g., for optical communication and/or electrical communication.
The description herein may refer to the processor 410 performing a function, but this includes other implementations such as where the processor 410 executes software (stored in the memory 411) and/or firmware. The description herein may refer to the server 400 performing a function as shorthand for one or more appropriate components (e.g., the processor 410 and the memory 411) of the server 400 performing the function.
The configuration of the server 400 shown in
Referring to
The description herein may refer to the processor 510 performing a function, but this includes other implementations such as where the processor 510 executes software (stored in the memory 530) and/or firmware. The description herein may refer to the device 500 performing a function as shorthand for one or more appropriate components (e.g., the processor 510 and the memory 530) of the device 500 performing the function. The processor 510 (possibly in conjunction with the memory 530 and, as appropriate, the transceiver 520) may include an occupancy grid unit 560 (which may include an ADAS (Advanced Driver Assistance System) for a VUE). The occupancy grid unit 560 is discussed further herein, and the description herein may refer to the occupancy grid unit 560 performing one or more functions, and/or may refer to the processor 510 generally, or the device 500 generally, as performing any of the functions of the occupancy grid unit 560, with the device 500 being configured to perform the functions.
One or more functions performed by the device 500 (e.g., the occupancy grid unit 560) may be performed by another entity. For example, sensor measurements (e.g., radar measurements, camera measurements (e.g., pixels, images)) and/or processed sensor measurements (e.g., a camera image converted to a bird's-eye-view image) may be provided to another entity, e.g., the server 400, and the other entity may perform one or more functions discussed herein with respect to the occupancy grid unit 560 (e.g., using machine learning to determine a present occupancy grid and/or applying an observation model, analyzing measurements from different sensors, to determine a present occupancy grid, etc.).
Referring also to
Referring also to
Each of the sub-regions 710 may correspond to a respective cell 810 of the occupancy map and information may be obtained regarding what, if anything, occupies each of the sub-regions 710 and whether an occupying object is static or dynamic in order to populate cells 810 of the occupancy grid 800 with probabilities of the cell being occupied (O) or free (F) (i.e., unoccupied), and probabilities of an object at least partially occupying a cell being static (S) or dynamic (D). Each of the probabilities may be a floating point value. The information as to what, if anything, occupies each of the sub-regions 710 may be obtained from a variety of sources. For example, occupancy information may be obtained from sensor measurements from the sensors 540 of the device 500. As another example, occupancy information may be obtained by one or more other devices and communicated to the device 500. For example, one or more of the vehicles 602-609 may communicate, e.g., via C-V2X communications, occupancy information to the vehicle 601. As another example, the RSU 612 may gather occupancy information (e.g., from one or more sensors of the RSU 612 and/or from communication with one or more of the vehicles 602-609 and/or one or more other devices) and communicate the gathered information to the vehicle 601, e.g., directly and/or through one or more network entities, e.g., TRPs.
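A minimal sketch of how a cell 810 might carry the occupancy and occupier-type probabilities described above is shown below; the field names, the default values, and the simple overwrite-style update are illustrative assumptions rather than a prescribed update rule.

```python
from dataclasses import dataclass


@dataclass
class GridCell:
    """One occupancy grid cell holding floating-point probabilities.

    Convention assumed here: p_occupied + p_free = 1.0, and the occupier-type
    mass satisfies p_static + p_dynamic = p_occupied.
    """
    p_occupied: float = 0.5
    p_free: float = 0.5
    p_static: float = 0.25
    p_dynamic: float = 0.25


def update_cell(p_occupied: float, p_dynamic_given_occupied: float) -> GridCell:
    """Populate a cell from fused occupancy information (not a Bayesian update)."""
    p_occ = min(max(p_occupied, 0.0), 1.0)
    p_dyn = min(max(p_dynamic_given_occupied, 0.0), 1.0) * p_occ
    return GridCell(p_occupied=p_occ, p_free=1.0 - p_occ,
                    p_static=p_occ - p_dyn, p_dynamic=p_dyn)


# A cell observed as very likely occupied by a moving object.
print(update_cell(p_occupied=0.8, p_dynamic_given_occupied=0.9))
```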
As shown in
Building a dynamic occupancy grid (an occupancy grid with a dynamic occupier type) may be helpful, or even essential, for understanding an environment (e.g., the environment 600) of an apparatus to facilitate or even enable further processing. For example, a dynamic occupancy grid may be helpful for predicting occupancy, for motion planning, etc. A dynamic occupancy grid may, at any one time, comprise one or more cells of static occupier type and/or one or more cells of dynamic occupier type. A dynamic object may be represented as a set of one or more velocity vectors. For example, an occupancy grid cell may have some or all of the occupancy probability be dynamic, and within the dynamic occupancy probability, there may be multiple (e.g., four) velocity vectors each with a corresponding probability that together sum to the dynamic occupancy probability for that cell 810. A dynamic occupancy grid may be obtained, e.g., by the occupancy grid unit 560, by processing information from multiple sensors, e.g., of the sensors 540, such as from a radar system. Adding data from one or more cameras to determine the dynamic occupancy grid may provide significant improvements to the grid, e.g., accuracy of probabilities and/or velocities in grid cells.
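The sketch below illustrates the idea of splitting a cell's dynamic occupancy probability across several velocity hypotheses; the number of hypotheses, the velocities, and the probability values are arbitrary assumptions chosen so that the hypothesis probabilities sum to the cell's dynamic mass.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class VelocityHypothesis:
    vx_mps: float
    vy_mps: float
    probability: float  # share of the cell's dynamic occupancy probability


@dataclass
class DynamicCell:
    p_dynamic: float
    hypotheses: List[VelocityHypothesis] = field(default_factory=list)

    def expected_velocity(self) -> Tuple[float, float]:
        """Probability-weighted mean velocity over the hypotheses."""
        if self.p_dynamic <= 0.0 or not self.hypotheses:
            return (0.0, 0.0)
        vx = sum(h.vx_mps * h.probability for h in self.hypotheses) / self.p_dynamic
        vy = sum(h.vy_mps * h.probability for h in self.hypotheses) / self.p_dynamic
        return (vx, vy)


# Four hypotheses whose probabilities sum to the cell's dynamic mass (0.6).
cell = DynamicCell(p_dynamic=0.6, hypotheses=[
    VelocityHypothesis(5.0, 0.0, 0.30),
    VelocityHypothesis(4.0, 1.0, 0.15),
    VelocityHypothesis(6.0, -1.0, 0.10),
    VelocityHypothesis(0.0, 0.0, 0.05),
])
print(cell.expected_velocity())
```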
Referring to
The robust fusion functional block 910 is configured to fuse the parametric and non-parametric representations into one or more object lists for the environment model 912. In an example, the fusion process in the robust fusion functional block 910 may utilize the object coordinate information in the parametric representations received from the LLP functional block 906 and the locations of the cells in the non-parametric representations received from the DoG functional block 908. The robust fusion functional block 910 may be configured to identify clusters within the non-parametric representations (e.g., clusters of dynamic grid cells) with similar properties, e.g., similar object classifications and/or similar velocities, and to use the identified clusters together with the indications of identified objects in the parametric representations (e.g., from the LLP functional block 906) to track objects, e.g., using a Kalman Filter (and/or one or more other algorithms). The robust fusion functional block 910 may be configured to output an object track list indicating tracked objects to the environment model 912. The object track list may include a location, velocity, length, and width (and possibly other information) for each object in the object track list. The object track list may include a shape to represent each object, e.g., a closed polygon or other shape (e.g., an oval (e.g., indicated by values for the major and minor axes)). The robust fusion functional block 910 may be configured to determine static objects (e.g., road boundaries, traffic signs, etc.) based on the parametric and/or non-parametric representations, and provide static object information to the environment model 912.
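A much-simplified sketch of the kind of tracking step the robust fusion functional block 910 might perform is shown below: one predict/update cycle of a constant-velocity Kalman filter driven by a fused position measurement (e.g., a parametric detection associated with a cluster of dynamic grid cells). The state layout, noise values, and the omission of data association are assumptions; a production tracker would be considerably more involved.

```python
import numpy as np


def kalman_track_update(x, P, z, dt=0.1, q=1.0, r=0.5):
    """One predict/update cycle of a 2D constant-velocity Kalman filter.

    x: track state [px, py, vx, vy]; P: 4x4 state covariance;
    z: fused position measurement [px, py].
    """
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)   # constant-velocity motion model
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)   # position-only measurement model
    Q = q * np.eye(4)                           # process noise (assumed)
    R = r * np.eye(2)                           # measurement noise (assumed)

    # Predict the track forward by one frame.
    x = F @ x
    P = F @ P @ F.T + Q

    # Update with the fused position measurement.
    y = z - H @ x
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    return x, P


x = np.array([10.0, -2.0, 8.0, 0.0])   # existing track: position and velocity
P = np.eye(4)
x, P = kalman_track_update(x, P, z=np.array([10.9, -2.1]))
print(x)
```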
Referring to
implementing object detection signal redundancy is shown. The system 900 may be configured to utilize one or more features in the architecture 1000. In an example, the architecture 1000 may be implemented in one or more software modules 1006 configured to receive signals from the radar module 902 and the camera 904. The software module 1006 may include a first signal path 1008 configured to generate parametric representations, and a second signal path 1010 configured to generate non-parametric representations. The first and second signal paths 1008, 1010 may each include one or more modules including machine learning models configured to output the parametric and non-parametric representations, respectively. The machine learning models may be based on deep learning techniques. For example, the first signal path 1008 may include a camera deep learning (DL) detection module 1012, a low-level (LL) fusion objects module 1014, and a radar detections module 1016. One or more of the modules 1012, 1014, 1016 may be trained to output parametric representations based on the inputs received from the radar module 902 and/or the camera 904. The second signal path 1010 may include a camera drivable space module 1018, a camera-based semantic segmentation (Camera SemSeg) module 1020, a radar point cloud module 1022, and a low-level bird's eye view (BEV) segmentation and occupancy flow module 1024. One or more of the modules 1018, 1020, 1022, 1024 may be trained to output non-parametric representations based on the inputs received from the radar module 902 and/or the camera 904. The number of modules and types of machine learning models shown are examples, and not limitations, as other modules and machine learning techniques may also be used in the signal paths to generate the respective parametric and non-parametric representations.
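For illustration only, the sketch below composes two independent signal paths over the same camera and radar inputs, in the spirit of the first signal path 1008 and the second signal path 1010. The functions are hand-written placeholders standing in for the trained modules named above; their names and return values are assumptions, not the actual module interfaces.

```python
import numpy as np


def camera_dl_detections(image):
    """Stand-in for the camera DL detection module (parametric boxes)."""
    return [{"x_m": 12.0, "y_m": -1.5, "length_m": 4.4, "width_m": 1.8}]


def radar_detections(targets):
    """Stand-in for the radar detections module (parametric points)."""
    return [{"x_m": 12.3, "y_m": -1.6, "velocity_mps": 8.1}]


def low_level_fusion_objects(camera_dets, radar_dets):
    """Stand-in for low-level fusion of the camera and radar parametric outputs."""
    return [{**cam, **rad} for cam, rad in zip(camera_dets, radar_dets)]


def bev_segmentation_occupancy(image, targets, shape=(200, 200)):
    """Stand-in for BEV segmentation / occupancy flow (non-parametric grid)."""
    return np.full(shape, 0.5, dtype=np.float32)


def first_signal_path(image, targets):
    """Parametric path: camera DL detection, radar detections, low-level fusion."""
    return low_level_fusion_objects(camera_dl_detections(image), radar_detections(targets))


def second_signal_path(image, targets):
    """Non-parametric path: BEV segmentation and occupancy over the same inputs."""
    return bev_segmentation_occupancy(image, targets)


image = np.zeros((480, 640, 3), dtype=np.uint8)   # hypothetical camera frame
targets = [(12.3, -0.13, 8.1)]                    # hypothetical (range, bearing, velocity)
parametric = first_signal_path(image, targets)
non_parametric = second_signal_path(image, targets)
print(parametric, non_parametric.shape)
```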
The architecture 1000 may include a fusion functional block 1026 including an object tracking module 1030 and an occupancy grid module 1032. The object tracking module 1030 may be configured to use the parametric representations received via the first signal path 1008 with non-parametric representations received from the second signal path 1010 to track objects, e.g., using a Kalman Filter (and/or one or more other algorithms) and output an object track list indicating tracked objects to an environment model. The object track list may include a location, velocity, length, and width (and possibly other information) for each object in the object track list. The object track list may include a shape to represent each object, e.g., a closed polygon or other shape (e.g., an oval (e.g., indicated by values for the major and minor axes)). The occupancy grid module 1032 may be configured to determine static objects (e.g., road boundaries, traffic signs, etc.) in the parametric and/or non-parametric representations provided by the respective first signal path 1008 and the second signal path 1010. The static object information may be provided to an environment model.
Referring to
Referring to
In a first DL architecture 1200, a common backbone 1202 may be configured to receive signals from the radar module 902 and the camera 904. A first set of neck models 1204 and a first set of head models 1206 may be configured to generate the parametric and/or non-parametric representations associated with one or more of the modules 1012, 1014, 1016, 1018, 1020, 1022, 1024. In an example, separate training for each head model may be enforced for handling redundancy. In a second DL architecture 1250, the backbone, neck and head models may be separated based on the first and second signal paths. For example, a first backbone 1252 may be configured to receive signal inputs from the radar module 902 and the camera 904, and a second set of neck models 1254 and a second set of head models 1256 may be trained to generate the parametric representations. A second backbone 1258 may also be configured to receive signal inputs from the radar module 902 and the camera 904, and a third set of neck models 1260 and a third set of head models 1262 may be trained to generate the non-parametric representations. Other deep learning architectures may also be used to generate redundancy in the signal flows and improve the robustness of the object detection functions.
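The sketch below contrasts the two deep learning layouts described above using PyTorch-style modules: a single shared backbone feeding separate heads (as in the first DL architecture 1200), and fully separate backbone/head stacks per signal path (as in the second DL architecture 1250). The layer sizes, output dimensions, and module names are placeholders rather than the actual networks, and the neck stages are omitted for brevity.

```python
import torch
import torch.nn as nn


def make_backbone():
    # Tiny placeholder feature extractor standing in for a real backbone.
    return nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten())


class SharedBackboneTwoHeads(nn.Module):
    """First layout: common backbone, separate head per signal path.

    Each head could be trained with its own loss (separate training enforced
    per head) while the backbone parameters are shared.
    """
    def __init__(self):
        super().__init__()
        self.backbone = make_backbone()
        self.parametric_head = nn.Linear(16, 6)        # e.g., box parameters
        self.nonparametric_head = nn.Linear(16, 200)   # e.g., flattened occupancy logits

    def forward(self, x):
        features = self.backbone(x)
        return self.parametric_head(features), self.nonparametric_head(features)


class SeparateBackbones(nn.Module):
    """Second layout: each signal path has its own backbone and head."""
    def __init__(self):
        super().__init__()
        self.path1 = nn.Sequential(make_backbone(), nn.Linear(16, 6))
        self.path2 = nn.Sequential(make_backbone(), nn.Linear(16, 200))

    def forward(self, x):
        return self.path1(x), self.path2(x)


x = torch.zeros(1, 3, 64, 64)   # placeholder fused camera/radar input tensor
print([t.shape for t in SharedBackboneTwoHeads()(x)])
print([t.shape for t in SeparateBackbones()(x)])
```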
Referring to
At stage 1302, the method 1300 includes obtaining image information from at least one camera module disposed on a vehicle. The device 500, including the processor 510 and the sensors 540, is a means for obtaining the image information. In an example, the cameras 544 may obtain images of the environment proximate to the vehicle. The images may include static objects, such as road signs, trees, barriers and other non-moving objects, and dynamic objects such as other vehicles, bicycles, pedestrians, and other moving objects. In an example, the image information may be obtained at a frame interval of approximately 40 ms (i.e., approximately 25 frames per second). Other frame rates may be used.
At stage 1304, the method 1300 includes obtaining target information from at least one radar module disposed on the vehicle. The device 500, including the processor 510 and the sensors 540, is a means for obtaining the target information. In an example, the one or more radar sensors 542 may be configured to provide range, bearing and velocity information for objects generating a return radar signal (e.g., radar echo). In an example, the target information may be a scope plot based on echo signals. The radar target information may be obtained at a frame interval of approximately 40 ms (i.e., approximately 25 frames per second). Other frame rates may be used.
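As a small illustration of the target information described at stage 1304, the sketch below models a single radar return and projects it into vehicle-frame coordinates; the field names and axis convention are assumptions for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class RadarTarget:
    """One radar return; field names are illustrative."""
    range_m: float        # radial distance to the target
    bearing_rad: float    # azimuth relative to the vehicle's forward axis
    velocity_mps: float   # radial (Doppler) velocity

    def to_cartesian(self):
        """Project the target into vehicle-frame x (forward) / y (left) coordinates."""
        return (self.range_m * math.cos(self.bearing_rad),
                self.range_m * math.sin(self.bearing_rad))


target = RadarTarget(range_m=25.0, bearing_rad=math.radians(10.0), velocity_mps=-3.2)
print(target.to_cartesian())
```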
At stage 1306, the method 1300 includes generating a first detection representation with a first signal path based on the image information and the target information. The device 500, including the processor 510, the sensors 540 and the architecture 1000, is a means for generating the first detection representation. In an example, the first detection representation includes one or more of the parametric representations generated via the first signal path 1008. The first signal path 1008 may include one or more machine learning models, such as described in the modules 1012, 1014, 1016 to generate the parametric representations. These modules and the corresponding parametric representations are examples, and not limitations, as the first signal path 1008 may utilize other modules to generate other detection representations.
At stage 1308, the method 1300 includes generating a second detection representation with a second signal path based on the image information and the target information, wherein the second signal path is different than the first signal path. The device 500, including the processor 510, the sensors 540 and the architecture 1000, is a means for generating the second detection representation. In an example, the second detection representation includes one or more of the non-parametric representations generated via the second signal path 1010. The second signal path 1010 may include one or more machine learning models, such as described in the modules 1018, 1020, 1022, 1024 to generate the non-parametric representations. These modules and the corresponding non-parametric representations are examples, and not limitations, as the second signal path 1010 may utilize other modules to generate other detection representations.
At stage 1310, the method 1300 includes outputting the first detection representation and the second detection representation. The device 500, including the processor 510, the sensors 540 and the architecture 1000, is a means for outputting the detection representations. In an example, the first detection representation and the second detection representation may be output to a fusion module configured to generate object lists based on the first and second representations. Other modules in an autonomous vehicle perception architecture may be configured to receive the first detection representation and the second detection representation (e.g., prior to fusing).
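Putting stages 1302 through 1310 together, a minimal end-to-end sketch of one cycle of the method 1300 might look like the following. The sensor-reading callables, the placeholder path functions, and the output queue are assumptions standing in for the actual modules and downstream consumers.

```python
import queue
import time


def run_perception_cycle(get_image, get_targets, path1, path2, out_queue,
                         frame_period_s=0.04):
    """One cycle of stages 1302-1310: acquire inputs, run both paths, output both results."""
    image = get_image()                # stage 1302: image information from the camera
    targets = get_targets()            # stage 1304: target information from the radar
    first = path1(image, targets)      # stage 1306: e.g., parametric representation
    second = path2(image, targets)     # stage 1308: e.g., non-parametric representation
    out_queue.put((first, second))     # stage 1310: output both to downstream consumers
    time.sleep(frame_period_s)         # assumed ~40 ms frame interval


# Trivial placeholder inputs and paths, for illustration only.
q = queue.Queue()
run_perception_cycle(get_image=lambda: "frame-0",
                     get_targets=lambda: [(25.0, 0.17, -3.2)],
                     path1=lambda img, tgt: {"objects": 1},
                     path2=lambda img, tgt: {"occupied_cells": 42},
                     out_queue=q)
print(q.get())
```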
Referring to
At stage 1402, the method 1400 includes receiving a first detection representation via a first signal path and a second detection representation via a second signal path. The device 500, including the processor 510, the sensors 540 and the architecture 1000, is a means for receiving the first and second detection representations. In an example, the fusion functional block 1026, including the object tracking module 1030 and the occupancy grid module 1032, is configured to receive the parametric and non-parametric representations as the respective first and second detection representations.
At stage 1404, the method 1400 includes generating one or more object lists based at least in part on the first detection representation and the second detection representation. The device 500, including the processor 510, the sensors 540 and the architecture 1000, is a means for generating the one or more object lists. The object tracking module 1030 may be configured to use the parametric representations received via the first signal path 1008 with non-parametric representations received from the second signal path 1010 to track objects, e.g., using a Kalman Filter (and/or one or more other algorithms) and generate an object track list indicating tracked objects. The object track list may include a location, velocity, length, and width (and possibly other information) for each object in the object track list. The object track list may include a shape to represent each object, e.g., a closed polygon or other shape (e.g., an oval (e.g., indicated by values for the major and minor axes)). The occupancy grid module 1032 may be configured to determine static objects (e.g., road boundaries, traffic signs, etc.) in the parametric and/or non-parametric representations provided by the respective first signal path 1008 and second signal path 1010 and generate static object information.
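To illustrate the grouping of dynamic grid cells with similar properties mentioned above (before they are associated with parametric detections and tracked), the sketch below clusters cells whose dynamic probability exceeds a threshold using a simple 4-connected flood fill. The grid contents and the threshold are assumed values.

```python
import numpy as np


def cluster_dynamic_cells(p_dynamic, threshold=0.5):
    """Group neighboring cells whose dynamic probability exceeds a threshold.

    Returns a list of clusters, each a list of (row, col) indices, found by a
    4-connected flood fill over the thresholded grid.
    """
    mask = p_dynamic > threshold
    visited = np.zeros_like(mask, dtype=bool)
    clusters = []
    rows, cols = mask.shape
    for r in range(rows):
        for c in range(cols):
            if mask[r, c] and not visited[r, c]:
                stack, cluster = [(r, c)], []
                visited[r, c] = True
                while stack:
                    cr, cc = stack.pop()
                    cluster.append((cr, cc))
                    for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                        if 0 <= nr < rows and 0 <= nc < cols and mask[nr, nc] and not visited[nr, nc]:
                            visited[nr, nc] = True
                            stack.append((nr, nc))
                clusters.append(cluster)
    return clusters


grid = np.zeros((6, 6))
grid[1:3, 1:3] = 0.9   # one cluster of moving cells
grid[4, 4] = 0.8       # an isolated dynamic cell
print(cluster_dynamic_cells(grid))
```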
At stage 1406, the method 1400 includes outputting the one or more object lists. The device 500, including the processor 510, the sensors 540 and the architecture 1000, is a means for outputting the one or more object lists. In an example, the fusion functional block 1026 may be configured to output the object track list and the static object information to an environment model. Other modules in an autonomous vehicle perception architecture may be configured to receive the one or more object lists (e.g., based on the fusion of the parametric and non-parametric representations of the sensor information).
Other examples and implementations are within the scope of the disclosure and appended claims. For example, due to the nature of software and computers, functions described above can be implemented using software executed by a processor, hardware, firmware, hardwiring, or a combination of any of these. Features implementing functions may also be physically located at various positions, including being distributed such that portions of functions are implemented at different physical locations.
As used herein, the singular forms “a,” “an,” and “the” include the plural forms as well, unless the context clearly indicates otherwise. Thus, reference to a device in the singular (e.g., “a device,” “the device”), including in the claims, includes at least one, i.e., one or more, of such devices (e.g., “a processor” includes at least one processor (e.g., one processor, two processors, etc.), “the processor” includes at least one processor, “a memory” includes at least one memory, “the memory” includes at least one memory, etc.). The phrases “at least one” and “one or more” are used interchangeably and such that “at least one” referred-to object and “one or more” referred-to objects include implementations that have one referred-to object and implementations that have multiple referred-to objects. For example, “at least one processor” and “one or more processors” each includes implementations that have one processor and implementations that have multiple processors. Also, a “set” as used herein includes one or more members, and a “subset” contains fewer than all members of the set to which the subset refers.
The terms “comprises,” “comprising,” “includes,” and/or “including,” as used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Also, as used herein, a list of items prefaced by “at least one of” or prefaced by “one or more of” indicates a disjunctive list such that, for example, a list of “at least one of A, B, or C,” or a list of “at least one of A, B, and C,” or a list of “one or more of A, B, or C”, or a list of “one or more of A, B, and C,” or a list of “A or B or C” means A, or B, or C, or AB (A and B), or AC (A and C), or BC (B and C), or ABC (i.e., A and B and C), or combinations with more than one feature (e.g., AA, AAB, ABBC, etc.). Thus, a recitation that an item, e.g., a processor, is configured to perform a function regarding at least one of A or B, or a recitation that an item is configured to perform a function A or a function B, means that the item may be configured to perform the function regarding A, or may be configured to perform the function regarding B, or may be configured to perform the function regarding A and B. For example, a phrase of “a processor configured to measure at least one of A or B” or “a processor configured to measure A or measure B” means that the processor may be configured to measure A (and may or may not be configured to measure B), or may be configured to measure B (and may or may not be configured to measure A), or may be configured to measure A and measure B (and may be configured to select which, or both, of A and B to measure). Similarly, a recitation of a means for measuring at least one of A or B includes means for measuring A (which may or may not be able to measure B), or means for measuring B (and may or may not be configured to measure A), or means for measuring A and B (which may be able to select which, or both, of A and B to measure). As another example, a recitation that an item, e.g., a processor, is configured to at least one of perform function X or perform function Y means that the item may be configured to perform the function X, or may be configured to perform the function Y, or may be configured to perform the function X and to perform the function Y. For example, a phrase of “a processor configured to at least one of measure X or measure Y” means that the processor may be configured to measure X (and may or may not be configured to measure Y), or may be configured to measure Y (and may or may not be configured to measure X), or may be configured to measure X and to measure Y (and may be configured to select which, or both, of X and Y to measure).
As used herein, unless otherwise stated, a statement that a function or operation is “based on” an item or condition means that the function or operation is based on the stated item or condition and may be based on one or more items and/or conditions in addition to the stated item or condition.
Substantial variations may be made in accordance with specific requirements. For example, customized hardware might also be used, and/or particular elements might be implemented in hardware, software (including portable software, such as applets, etc.) executed by a processor, or both. Further, connection to other computing devices such as network input/output devices may be employed. Components, functional or otherwise, shown in the figures and/or discussed herein as being connected or communicating with each other are communicatively coupled unless otherwise noted. That is, they may be directly or indirectly connected to enable communication between them.
The systems and devices discussed above are examples. Various configurations may omit, substitute, or add various procedures or components as appropriate. For instance, features described with respect to certain configurations may be combined in various other configurations. Different aspects and elements of the configurations may be combined in a similar manner. Also, technology evolves and, thus, many of the elements are examples and do not limit the scope of the disclosure or claims.
Specific details are given in the description herein to provide a thorough understanding of example configurations (including implementations). However, configurations may be practiced without these specific details. For example, well-known circuits, processes, algorithms, structures, and techniques have been shown without unnecessary detail in order to avoid obscuring the configurations. The description herein provides example configurations, and does not limit the scope, applicability, or configurations of the claims. Rather, the preceding description of the configurations provides a description for implementing described techniques. Various changes may be made in the function and arrangement of elements.
The terms “processor-readable medium,” “machine-readable medium,” and “computer-readable medium,” as used herein, refer to any medium that participates in providing data that causes a machine to operate in a specific fashion. Using a computing platform, various processor-readable media might be involved in providing instructions/code to processor(s) for execution and/or might be used to store and/or carry such instructions/code (e.g., as signals). In many implementations, a processor-readable medium is a physical and/or tangible storage medium. Such a medium may take many forms, including but not limited to, non-volatile media and volatile media. Non-volatile media include, for example, optical and/or magnetic disks. Volatile media include, without limitation, dynamic memory.
Having described several example configurations, various modifications, alternative constructions, and equivalents may be used. For example, the above elements may be components of a larger system, wherein other rules may take precedence over or otherwise modify the application of the disclosure. Also, a number of operations may be undertaken before, during, or after the above elements are considered. Accordingly, the above description does not bound the scope of the claims.
Unless otherwise indicated, “about” and/or “approximately” as used herein when referring to a measurable value such as an amount, a temporal duration, and the like, encompasses variations of ±20% or ±10%, ±5%, or ±0.1% from the specified value, as appropriate in the context of the systems, devices, circuits, methods, and other implementations described herein. Unless otherwise indicated, “substantially” as used herein when referring to a measurable value such as an amount, a temporal duration, a physical attribute (such as frequency), and the like, also encompasses variations of ±20% or ±10%, ±5%, or ±0.1% from the specified value, as appropriate in the context of the systems, devices, circuits, methods, and other implementations described herein.
A statement that a value exceeds (or is more than or above) a first threshold value is equivalent to a statement that the value meets or exceeds a second threshold value that is slightly greater than the first threshold value, e.g., the second threshold value being one value higher than the first threshold value in the resolution of a computing system. A statement that a value is less than (or is within or below) a first threshold value is equivalent to a statement that the value is less than or equal to a second threshold value that is slightly lower than the first threshold value, e.g., the second threshold value being one value lower than the first threshold value in the resolution of a computing system.
Implementation examples are described in the following numbered clauses:
Clause 1. A method for generating object representations with multiple signal paths, comprising: obtaining image information from at least one camera module disposed on a vehicle; obtaining target information from at least one radar module disposed on the vehicle; generating a first detection representation with a first signal path based on the image information and the target information; generating a second detection representation with a second signal path based on the image information and the target information, wherein the second signal path is different than the first signal path; and outputting the first detection representation and the second detection representation.
Clause 2. The method of clause 1 wherein the first detection representation includes a parametric representation for a target object, and the second detection representation includes a non-parametric representation for the target object.
Clause 3. The method of clause 2 wherein the parametric representation for the target object includes coordinate information for the target object and dimension information for the target object.
Clause 4. The method of clause 2 wherein the non-parametric representation for the target object is an occupancy map.
Clause 5. The method of clause 2 wherein the first signal path includes at least a first machine learning model configured to generate the parametric representation based at least in part on the image information and the target information, and the second signal path includes at least a second machine learning model configured to generate the non-parametric representation based at least in part on the image information and the target information.
Clause 6. The method of clause 5 wherein the first machine learning model and the second machine learning model utilize a common backbone.
Clause 7. The method of clause 5 wherein the first machine learning model utilizes at least a first backbone, and the second machine learning model utilizes at least a second backbone.
Clause 8. The method of clause 1 further comprising: receiving the first detection representation via the first signal path and the second detection representation via the second signal path; generating one or more object lists based at least in part on the first detection representation and the second detection representation; and outputting the one or more object lists.
Clause 9. The method of clause 8 wherein the one or more object lists includes an object track list indicating a location and velocity of an object.
Clause 10. The method of clause 9 wherein the object track list indicates a shape of the object.
Clause 11. The method of clause 9 wherein the one or more object lists includes static object information.
Clause 12. The method of clause 8 wherein outputting the one or more object lists includes providing the one or more object lists to an environment model.
Clause 13. The method of clause 8 further comprising: receiving target information from a lidar module disposed on the vehicle via a secondary path that is separate from the first signal path and the second signal path; generating object detection information based on the target information; and outputting the object detection information.
Clause 14. The method of clause 13 further comprising: receiving image information from the at least one camera module disposed on the vehicle; generating the object detection information based on the target information and the image information; and outputting the object detection information.
Clause 15. An apparatus, comprising: at least one memory; at least one camera module; at least one radar module; at least one processor communicatively coupled to the at least one memory, the at least one camera module, and the at least one radar module, and configured to: obtain image information from the at least one camera module disposed on a vehicle; obtain target information from the at least one radar module disposed on the vehicle; generate a first detection representation with a first signal path based on the image information and the target information; generate a second detection representation with a second signal path based on the image information and the target information, wherein the second signal path is different than the first signal path; and output the first detection representation and the second detection representation.
Clause 16. The apparatus of clause 15 wherein the first detection representation includes a parametric representation for a target object, and the second detection representation includes a non-parametric representation for the target object.
Clause 17. The apparatus of clause 16 wherein the parametric representation for the target object includes coordinate information for the target object and dimension information for the target object.
Clause 18. The apparatus of clause 16 wherein the non-parametric representation for the target object is an occupancy map.
Clause 19. The apparatus of clause 16 wherein the first signal path includes at least a first machine learning model and the at least one processor is further configured to generate the parametric representation based at least in part on the image information and the target information, and the second signal path includes at least a second machine learning model and the at least one processor is further configured to generate the non-parametric representation based at least in part on the image information and the target information.
Clause 20. The apparatus of clause 19 wherein the first machine learning model and the second machine learning model utilize a common backbone.
Clause 21. The apparatus of clause 19 wherein the first machine learning model utilizes at least a first backbone, and the second machine learning model utilizes at least a second backbone.
Clause 22. The apparatus of clause 15 wherein the at least one processor is further configured to: receive the first detection representation via the first signal path and the second detection representation via the second signal path; generate one or more object lists based at least in part on the first detection representation and the second detection representation; and output the one or more object lists.
Clause 23. The apparatus of clause 22 wherein the one or more object lists includes an object track list indicating a location and velocity of an object.
Clause 24. The apparatus of clause 23 wherein the object track list indicates a shape of the object.
Clause 25. The apparatus of clause 23 wherein the one or more object lists includes static object information.
Clause 26. The apparatus of clause 22 wherein the at least one processor is further configured to output the one or more object lists to an environment model.
Clause 27. The apparatus of clause 22 further comprising at least one lidar module disposed on the vehicle, wherein the at least one processor is further configured to: receive further target information from the at least one lidar module via a secondary path that is separate from the first signal path and the second signal path; generate object detection information based on the further target information; and output the object detection information.
Clause 28. The apparatus of clause 27 wherein the at least one processor is further configured to: generate the object detection information based on the further target information and the image information; and output the object detection information.
Clause 29. An apparatus for generating object representations with multiple signal paths, comprising: means for obtaining image information from at least one camera module disposed on a vehicle; means for obtaining target information from at least one radar module disposed on the vehicle; means for generating a first detection representation with a first signal path based on the image information and the target information; means for generating a second detection representation with a second signal path based on the image information and the target information, wherein the second signal path is different than the first signal path; and means for outputting the first detection representation and the second detection representation.
Clause 30. The apparatus of clause 29 further comprising: means for receiving the first detection representation via the first signal path and the second detection representation via the second signal path; means for generating one or more object lists based at least in part on the first detection representation and the second detection representation; and means for outputting the one or more object lists.
Clause 31. The apparatus of clause 30 further comprising: means for receiving further target information from a lidar module disposed on the vehicle via a secondary path that is separate from the first signal path and the second signal path; means for generating object detection information based on the further target information; and means for outputting the object detection information.
Clause 32. A non-transitory processor-readable storage medium comprising processor-readable instructions configured to cause one or more processors to generate object representations with multiple signal paths, comprising code for: obtaining image information from at least one camera module disposed on a vehicle; obtaining target information from at least one radar module disposed on the vehicle; generating a first detection representation with a first signal path based on the image information and the target information; generating a second detection representation with a second signal path based on the image information and the target information, wherein the second signal path is different than the first signal path; and outputting the first detection representation and the second detection representation.
Clause 33. The non-transitory processor-readable storage medium of clause 32 further comprising: code for receiving the first detection representation via the first signal path and the second detection representation via the second signal path; code for generating one or more object lists based at least in part on the first detection representation and the second detection representation; and code for outputting the one or more object lists.
Clause 34. The non-transitory processor-readable storage medium of clause 33 further comprising: code for receiving further target information from a lidar module disposed on the vehicle via a secondary path that is separate from the first signal path and the second signal path; code for generating object detection information based on the further target information; and code for outputting the object detection information.
Clause 35. A method for generating object representations with multiple signal paths, comprising: receiving a first detection representation via a first signal path and a second detection representation via a second signal path; generating one or more object lists based at least in part on the first detection representation and the second detection representation; and outputting the one or more object lists.
Clause 36. An apparatus, comprising: at least one memory; at least one camera module; at least one radar module; at least one processor communicatively coupled to the at least one memory, the at least one camera module, and the at least one radar module, and configured to: receive a first detection representation via a first signal path and a second detection representation via a second signal path; generate one or more object lists based at least in part on the first detection representation and the second detection representation; and output the one or more object lists.
Clause 37. An apparatus comprising: means for receiving a first detection representation via a first signal path and a second detection representation via a second signal path; means for generating one or more object lists based at least in part on the first detection representation and the second detection representation; and means for outputting the one or more object lists.
Clause 38. A non-transitory processor-readable storage medium comprising processor-readable instructions configured to cause one or more processors to generate object representations with multiple signal paths, comprising code for: receiving a first detection representation via a first signal path and a second detection representation via a second signal path; generating one or more object lists based at least in part on the first detection representation and the second detection representation; and outputting the one or more object lists.
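For illustration only, and not as part of the claimed subject matter, the sketches below suggest one way the concepts recited in the clauses could be realized in code. This first sketch models the two detection representations of clauses 2-4 and 16-18: a parametric representation carrying coordinate and dimension information for a target object, and a non-parametric occupancy map. The class names, fields, and units are assumptions of the sketch, not requirements of the disclosure.

```python
# Illustrative sketch only; field names and units are assumptions.
from dataclasses import dataclass
from typing import Tuple

import numpy as np


@dataclass
class ParametricDetection:
    """Parametric representation (clauses 3 and 17): explicit coordinates and dimensions."""
    x: float           # longitudinal position in the vehicle frame (m)
    y: float           # lateral position in the vehicle frame (m)
    length: float      # object length (m)
    width: float       # object width (m)
    heading: float     # orientation (rad)
    confidence: float  # detection score in [0, 1]


@dataclass
class OccupancyMap:
    """Non-parametric representation (clauses 4 and 18): per-cell occupancy probabilities."""
    probabilities: np.ndarray       # shape (rows, cols), values in [0, 1]
    cell_size_m: float              # edge length of one grid cell (m)
    origin_xy: Tuple[float, float]  # vehicle-frame position of cell (0, 0)

    def is_occupied(self, row: int, col: int, threshold: float = 0.5) -> bool:
        return bool(self.probabilities[row, col] >= threshold)
```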
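The next sketch, also purely illustrative, corresponds to clauses 5-7 and 19-21: a first machine learning head producing the parametric representation and a second head producing the non-parametric occupancy map, both fed by a common backbone (clauses 6 and 20). The layer sizes, the channel-concatenation fusion of camera and radar features, and the use of PyTorch are assumptions of the sketch; a variant per clauses 7 and 21 would instead give each head its own backbone.

```python
# Illustrative PyTorch sketch; the architecture details are assumptions.
import torch
import torch.nn as nn


class DualPathDetector(nn.Module):
    """Two signal paths (heads) over a common backbone (clauses 6 and 20)."""

    def __init__(self, grid_hw=(64, 64), max_objects=16):
        super().__init__()
        self.grid_hw = grid_hw
        self.max_objects = max_objects
        # Common backbone shared by both signal paths.
        self.backbone = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        # First signal path: parametric head -> per-object (x, y, length, width, heading, score).
        self.parametric_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(8), nn.Flatten(),
            nn.Linear(64 * 8 * 8, max_objects * 6),
        )
        # Second signal path: non-parametric head -> per-cell occupancy logits.
        self.occupancy_head = nn.Conv2d(64, 1, kernel_size=1)

    def forward(self, camera_feat, radar_feat):
        # Early fusion by channel concatenation (an assumption, not mandated by
        # the clauses): e.g. 3 camera channels + 1 radar channel -> 4 channels.
        fused = torch.cat([camera_feat, radar_feat], dim=1)
        shared = self.backbone(fused)
        boxes = self.parametric_head(shared).view(-1, self.max_objects, 6)
        occupancy = torch.sigmoid(nn.functional.interpolate(
            self.occupancy_head(shared), size=self.grid_hw,
            mode="bilinear", align_corners=False))
        return boxes, occupancy
```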
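A further illustrative sketch covers the downstream stage recited in clauses 8-12 and 35-38: the first (parametric) and second (non-parametric) detection representations are received via their respective signal paths and combined into one or more object lists, including an object track list with location, velocity, and shape, plus static object information, suitable for output to an environment model. The association and velocity logic here are simplified placeholders, not the claimed method.

```python
# Illustrative fusion-stage sketch; data formats are assumptions.
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

import numpy as np


@dataclass
class ObjectTrack:
    track_id: int
    x: float                        # position (m), clause 9
    y: float
    vx: float                       # velocity (m/s), clause 9
    vy: float
    shape_lw: Tuple[float, float]   # (length, width), clause 10


@dataclass
class ObjectLists:
    tracks: List[ObjectTrack] = field(default_factory=list)            # dynamic objects
    static_cells: List[Tuple[int, int]] = field(default_factory=list)  # static object info, clause 11


def fuse_representations(boxes: np.ndarray,
                         occupancy: np.ndarray,
                         dt: float,
                         previous: Optional[ObjectLists] = None) -> ObjectLists:
    """Combine both signal paths into object lists (clauses 8 and 35).

    boxes:     (N, 5) array of [x, y, length, width, heading] per detected object.
    occupancy: (rows, cols) array of occupancy probabilities in [0, 1].
    """
    prev_by_id = {t.track_id: t for t in (previous.tracks if previous else [])}
    out = ObjectLists()
    for i, (x, y, length, width, _heading) in enumerate(boxes):
        prev = prev_by_id.get(i)
        vx = (x - prev.x) / dt if prev else 0.0
        vy = (y - prev.y) / dt if prev else 0.0
        out.tracks.append(ObjectTrack(i, float(x), float(y), vx, vy,
                                      (float(length), float(width))))
    # Simplification: every sufficiently occupied cell is reported as static
    # object information; a real system would first remove cells already
    # explained by the dynamic tracks above.
    rows, cols = np.nonzero(occupancy >= 0.5)
    out.static_cells = list(zip(rows.tolist(), cols.tolist()))
    return out
```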
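Finally, an illustrative sketch of the secondary path of clauses 13-14, 27-28, 31, and 34: lidar target information, optionally cross-checked against camera-derived boxes, is processed separately from the first and second signal paths to provide a redundant source of object detection information. The grid-based clustering and the box format are assumptions standing in for whatever detector the secondary path actually employs.

```python
# Illustrative secondary-path sketch; clustering and box format are assumptions.
from typing import List, Optional, Tuple

import numpy as np


def secondary_path_detections(lidar_points: np.ndarray,
                              camera_boxes: Optional[List[Tuple[float, float, float, float]]] = None,
                              cell_size_m: float = 0.5,
                              min_points: int = 5) -> List[Tuple[float, float]]:
    """Return (x, y) object detections from lidar, independent of the main paths.

    lidar_points: (N, 3) array of x, y, z returns in the vehicle frame.
    camera_boxes: optional image-derived boxes (x_min, y_min, x_max, y_max) used
                  only to keep detections overlapping a camera box (clause 14).
    """
    # Grid the ground-plane projection and keep cells with enough returns.
    xy = lidar_points[:, :2]
    cells = np.floor(xy / cell_size_m).astype(int)
    uniq, counts = np.unique(cells, axis=0, return_counts=True)
    detections = []
    for cell, count in zip(uniq, counts):
        if count < min_points:
            continue
        cx, cy = (cell + 0.5) * cell_size_m
        if camera_boxes is not None and not any(
                x0 <= cx <= x1 and y0 <= cy <= y1 for x0, y0, x1, y1 in camera_boxes):
            continue
        detections.append((float(cx), float(cy)))
    return detections
```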
This application claims the benefit of U.S. Provisional Application No. 63/590,899, filed Oct. 17, 2023, entitled “AUTOMATED DRIVING SOTIF VIA SIGNAL REPRESENTATION,” which is assigned to the assignee hereof, and the entire contents of which are hereby incorporated herein by reference for all purposes.