The subject disclosure relates to inter-sensor learning.
Vehicles (e.g., automobiles, trucks, construction vehicles, farm equipment) increasingly include sensors that obtain information about the vehicle and its environment. An exemplary type of sensor is a camera that obtains images. Multiple cameras may be arranged to obtain a 360-degree view around the perimeter of the vehicle, for example. Another exemplary type of sensor is an audio detector or microphone that obtains sound (i.e., audio signals) external to the vehicle. Additional exemplary sensors include a radio detection and ranging (radar) system and a light detection and ranging (lidar) system. The information obtained by the sensors may augment or automate vehicle systems. Exemplary vehicle systems include collision avoidance, adaptive cruise control, and autonomous driving systems. While the sensors may provide information individually, information from the sensors may also be considered together according to a scheme referred to as sensor fusion. In either case, the information from one sensor may indicate an issue with the detection algorithm of another sensor. Accordingly, it is desirable to provide inter-sensor learning.
In one exemplary embodiment, a method of performing inter-sensor learning includes obtaining a detection of a target based on a first sensor. The method also includes determining whether a second sensor with an overlapping detection range with the first sensor also detects the target, and performing learning to update a detection algorithm used with the second sensor based on the second sensor failing to detect the target.
In addition to one or more of the features described herein, the performing the learning is offline.
In addition to one or more of the features described herein, the method also includes performing online learning to reduce a threshold of detection by the second sensor prior to the performing the learning offline.
In addition to one or more of the features described herein, the method also includes logging data from the first sensor and the second sensor to execute the performing the learning offline based on the performing online learning failing to cause detection of the target by the second sensor.
In addition to one or more of the features described herein, the method also includes determining a cause of the second sensor failing to detect the target and performing the learning based on determining that the cause is based on the detection algorithm.
In addition to one or more of the features described herein, the performing the learning includes a deep learning.
In addition to one or more of the features described herein, the obtaining the detection of the target based on the first sensor includes a microphone detecting the target.
In addition to one or more of the features described herein, the determining whether the second sensor also detects the target includes determining whether a camera also detects the target.
In addition to one or more of the features described herein, the obtaining the detection of the target based on the first sensor and the determining whether the second sensor also detects the target is based on the first sensor and the second sensor being disposed in a vehicle.
In addition to one or more of the features described herein, the method also includes augmenting or automating operation of the vehicle based on the detection of the target.
In another exemplary embodiment, a system to perform inter-sensor learning includes a first sensor. The first sensor detects a target. The system also includes a second sensor. The second sensor has an overlapping detection range with the first sensor. The system further includes a processor to determine whether the second sensor also detects the target and perform learning to update a detection algorithm used with the second sensor based on the second sensor failing to detect the target.
In addition to one or more of the features described herein, the processor performs the learning offline.
In addition to one or more of the features described herein, the processor performs online learning to reduce a threshold of detection by the second sensor prior to performing the learning offline.
In addition to one or more of the features described herein, the processor logs data from the first sensor and the second sensor to perform the learning offline based on the online learning failing to cause detection of the target by the second sensor.
In addition to one or more of the features described herein, the processor determines a cause of the second sensor failing to detect the target and performs the learning based on determining that the cause is based on the detection algorithm.
In addition to one or more of the features described herein, the learning includes deep learning.
In addition to one or more of the features described herein, the first sensor is a microphone.
In addition to one or more of the features described herein, the second sensor is a camera.
In addition to one or more of the features described herein, the first sensor and the second sensor are disposed in a vehicle.
In addition to one or more of the features described herein, the processor augments or automates operation of the vehicle based on the detection of the target.
The above features and advantages, and other features and advantages of the disclosure are readily apparent from the following detailed description when taken in connection with the accompanying drawings.
Other features, advantages and details appear, by way of example only, in the following detailed description, the detailed description referring to the drawings in which:
The following description is merely exemplary in nature and is not intended to limit the present disclosure, its application or uses. It should be understood that throughout the drawings, corresponding reference numerals indicate like or corresponding parts and features.
As previously noted, various sensors may be located in a vehicle to obtain information about vehicle operation or the environment around the vehicle. Some sensors (e.g., radar, camera, microphone) may be used to detect objects such as other vehicles, pedestrians, and the like in the vicinity of the vehicle. The detection may be performed by implementing a machine learning algorithm, for example. Each sensor may perform the detection individually. In some cases, sensor fusion may be performed to combine the detection information from two or more sensors. Sensor fusion requires that two or more sensors have the same or at least overlapping fields of view. This ensures that the two or more sensors are positioned to detect the same objects and, thus, detection by one sensor may be used to enhance detection by the other sensors. Whether sensor fusion is performed or not, embodiments described herein relate to using the common field of view of sensors to improve their detection algorithms.
Specifically, embodiments of the systems and methods detailed herein relate to inter-sensor learning. As described, the information from one sensor is used to fine-tune the detection algorithm of another sensor. Assuming a common field of view, when one type of sensor indicates that an object has been detected while another type of sensor does not detect the object, a determination must first be made as to why the discrepancy occurred. In one case, the detection may be a false alarm. In another case, the object may not have been detectable within the detection range of the other type of sensor. For example, a microphone may detect an approaching motorcycle but, due to fog, the camera may not detect the same motorcycle. In yet another case, the other type of sensor may have to be retrained.
In accordance with an exemplary embodiment,
Each of the sensors 105 provides data to a controller 110 which performs detection according to an exemplary architecture. As noted, the exemplary sensors 105 and detection architecture discussed with reference to
As previously noted, the controller 110 performs detection based on data from each of the sensors 105, according to the exemplary architecture. The controller 110 includes processing circuitry that may include an application specific integrated circuit (ASIC), an electronic circuit, a processor (shared, dedicated, or group) and memory that executes one or more software or firmware programs, a combinational logic circuit, and/or other suitable components that provide the described functionality. The controller 110 may communicate with an electronic control unit (ECU) 130 that communicates with various vehicle systems 140 or may directly control the vehicle systems 140 based on the detection information obtained from the sensors 105. The controller 110 may also communicate with an infotainment system 145 or other system that facilitates the display of messages to the driver of the automobile 101.
At block 310, obtaining detection based on the microphone 125 refers to the fact that data collected with the microphone 125 indicates the object 150, the other vehicle 100, in lane 220. The detection may be performed by the controller 110 according to an exemplary embodiment. In the scenario shown in
At block 320, a check is done of whether the camera 115 also sees the object 150. Specifically, processing of images obtained by the camera 115 at the controller 110 may be used to determine if the object 150 is detected by the camera 115. If the camera 115 does see the object 150, then augmenting or automating an action, at block 330, refers to alerting the driver to the presence of the object 150 or automatically preventing the lane change. The alert to the driver may be provided on a display of the infotainment system 145 or, alternately or additionally, via other visual (e.g., lights) or audible indicators. If the camera 115 does not also detect the object 150 that was detected by the microphone 125, then performing online learning, at block 340, refers to real-time adjustments to the detection algorithm associated with the camera 115 data. For example, the detection threshold may be reduced by a specified amount.
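By way of illustration only, the online learning at block 340 may resemble the following sketch. The class, function, threshold step, and floor value are assumptions introduced for this example and are not specified by the disclosure; the sketch merely shows a detection threshold being reduced by a specified amount and the detection being re-run.

```python
# Illustrative sketch of the online learning at block 340; all names and values
# are assumptions, not part of the disclosure.

THRESHOLD_STEP = 0.05    # assumed reduction applied per online-learning pass
THRESHOLD_FLOOR = 0.30   # assumed lower bound so the threshold is not reduced indefinitely

class CameraDetector:
    """Hypothetical stand-in for the detection algorithm applied to camera 115 data."""

    def __init__(self, threshold=0.6):
        self.threshold = threshold

    def detect(self, confidence_scores):
        # confidence_scores stands in for per-candidate scores produced for one image.
        return any(score >= self.threshold for score in confidence_scores)

def online_learning_step(detector, confidence_scores):
    """Reduce the detection threshold by a specified amount (block 340) and
    re-check whether the target is now detected (block 350)."""
    detector.threshold = max(THRESHOLD_FLOOR, detector.threshold - THRESHOLD_STEP)
    return detector.detect(confidence_scores)
```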
At block 350, another check is done of whether the camera 115 detects the object 150 that the microphone 125 detected. This check determines whether the online learning, at block 340, changed the result of the check at block 320. If the online learning, at block 340, did change the result such that the check at block 350 determines that the camera 115 detects the object 150, then the process of augmenting or automating the action is performed at block 330.
If the online learning, at block 340, did not change the result such that the check at block 350 indicates that the camera 115 still does not detect the object 150, then logging the current scenario, at block 360, refers to recording the data from the camera 115 and the microphone 125 along with timestamps. The timestamps facilitate analyzing data from different sensors 105 at corresponding times. Other information available to the controller 110 may also be recorded. Once the information is logged, at block 360, the process of augmenting or automating action, at block 330, may optionally be performed. That is, a default may be established for the situation in which the sensors 105 (e.g., camera 115 and microphone 125) do not both detect the object 150 in their common field of detection, even after online learning, at block 340. This default may be to perform the augmentation or automation, at block 330, based on only one sensor 105 or may be to perform no action unless both (or all) sensors 105 detect an object 150.
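By way of illustration only, the logging at block 360 may resemble the sketch below. The JSON-lines format, file paths, and field names are assumptions for this example; the disclosure only requires that the camera 115 data, the microphone 125 data, and timestamps be recorded for later offline analysis.

```python
import json
import time

def log_scenario(camera_frame_ref, microphone_clip_ref, other_info=None,
                 log_path="discrepancy_log.jsonl"):
    """Record one undetected-target scenario (block 360) with a timestamp so that
    camera 115 and microphone 125 data can be aligned during offline analysis."""
    entry = {
        "timestamp": time.time(),                # common timestamp for later alignment
        "camera_frame": camera_frame_ref,        # reference to the logged camera data
        "microphone_clip": microphone_clip_ref,  # reference to the logged microphone data
        "other_info": other_info or {},          # any other information available to the controller 110
    }
    with open(log_path, "a") as log_file:
        log_file.write(json.dumps(entry) + "\n")
```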
At block 370, the processes include performing offline analysis, which is detailed with reference to
While the scenario depicted in
Based on the analysis at block 410, a false alarm indication, at block 420, refers to determining that the sensor 105 that resulted in the detection was wrong. In the exemplary case discussed with reference to
Based on the analysis at block 410, an indication may be provided, at block 430, that the sensor 105 was fully blocked. In the exemplary case discussed with reference to
Based on the analysis at block 410, an indication may be provided, at block 440, that the sensor 105 was partially blocked. In the exemplary case discussed with reference to
Based on the analysis at block 410, an indication may be provided, at block 450, that re-training of the sensor 105 is needed. In the exemplary case discussed with reference to
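By way of illustration only, the four outcomes of the analysis at block 410 may be dispatched as in the sketch below. The enumeration names and the retrain() placeholder are assumptions for this example; the disclosure specifies only that learning is performed when the detection algorithm, rather than a false alarm or a blocked sensor, is determined to be the cause.

```python
from enum import Enum, auto

class Cause(Enum):
    FALSE_ALARM = auto()        # block 420: the detecting sensor 105 was wrong
    FULLY_BLOCKED = auto()      # block 430: the target was not detectable by the other sensor 105
    PARTIALLY_BLOCKED = auto()  # block 440: the target was only partially detectable
    RETRAIN_NEEDED = auto()     # block 450: the detection algorithm must be re-trained

def retrain(detection_algorithm, logged_scenario):
    """Placeholder for the offline learning that updates the detection algorithm."""
    ...

def handle_offline_result(cause, logged_scenario, detection_algorithm):
    """Act on the outcome of the offline analysis at block 410; only a re-training
    outcome changes the detection algorithm."""
    if cause is Cause.RETRAIN_NEEDED:
        retrain(detection_algorithm, logged_scenario)
```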
Producing output 1 and output 2, at blocks 530-1 and 530-2, respectively, refers to obtaining a dot product between each of the three matrices of the image and the corresponding one of the three matrices of each filter. When the image matrices have more elements than the filter matrices, multiple dot product values are obtained using a moving window scheme whereby the filter matrix operates on a portion of the corresponding image matrix at a time. The output matrices indicate classification (e.g., target (1) or no target (0)). This is further discussed with reference to
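By way of illustration only, the per-channel dot product with a moving window may be implemented as in the sketch below, assuming a stride of one and no padding; the array shapes and NumPy usage are illustrative and not part of the disclosure.

```python
import numpy as np

def convolve_channels(image_channels, filter_channels):
    """Slide the filter over each of the three image matrices, take the dot product
    of the filter with the image window at each position, and sum the three
    per-channel results into one element of the output matrix."""
    img_h, img_w = image_channels[0].shape
    flt_h, flt_w = filter_channels[0].shape
    out_h, out_w = img_h - flt_h + 1, img_w - flt_w + 1   # stride of 1, no padding assumed
    output = np.zeros((out_h, out_w))
    for row in range(out_h):
        for col in range(out_w):
            total = 0.0
            for channel, flt in zip(image_channels, filter_channels):
                window = channel[row:row + flt_h, col:col + flt_w]
                total += float(np.sum(window * flt))       # dot product for this channel
            output[row, col] = total
    return output
```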
The processes include comparing output 1, obtained at block 530-1, with ground truth, at block 540-1, and comparing output 2, obtained at block 530-2, with ground truth, at block 540-2. The comparing refers to comparing the classification indicated by output 1, at block 540-1, and the classification indicated by output 2, at block 540-2, with the classification indicated by the fused sensor 105, the microphone 125 in the example discussed herein. That is, according to the exemplary case, obtaining a detection based on the microphone 125, at block 310, refers to the classification obtained by processing data from the microphone 125 indicating a target (1).
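By way of illustration only, the comparisons at blocks 540-1 and 540-2 may resemble the sketch below, in which the classification from the microphone 125 serves as the ground truth; classify() is a hypothetical placeholder for the processing that maps an output matrix to target (1) or no target (0).

```python
def compare_with_ground_truth(output_matrix, classify, microphone_detected):
    """Compare the classification implied by an output matrix (block 540-1 or 540-2)
    with the classification from the fused sensor, the microphone 125 (block 310)."""
    ground_truth = 1 if microphone_detected else 0   # microphone detection as ground truth
    predicted = classify(output_matrix)              # classify() is a hypothetical placeholder
    return predicted == ground_truth
```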
If the comparisons, at blocks 540-1 and 540-2, show that the classifications obtained with the images and current filters match the classification obtained with the microphone 125, then the next logged image is processed according to
The dot product for the fifth position of the filter matrices 620-r, 620-g, and 620-b over the corresponding matrices 610-r, 610-g, and 610-b is indicated and the computation is shown for matrix 610-r and filter matrix 620-r. The fifth element of the output matrix 630 is the sum of the three dot products shown for the three filter matrices 620-r, 620-g, and 620-b (i.e., 2+0+(−4)=−2). Once the three dot products are obtained for each of the nine positions of the filter matrices 620-r, 620-g, and 620-b and the output matrix 630 is filled in, the output matrix 630 is used to obtain the classification (e.g., target (1) or no target (0)) based on additional processes. These additional processes include a known fully connected layer, in addition to the above-discussed convolution and pooling layers.
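By way of illustration only, the additional processes that map the output matrix 630 to a classification may resemble the sketch below. The 2x2 pooling size, the weights, the bias, and the thresholding are assumptions for this example; the disclosure states only that known pooling and fully connected layers are used.

```python
import numpy as np

def max_pool(matrix, size=2):
    """A known pooling layer: 2x2 max pooling; edges are trimmed if the shape is odd."""
    h, w = matrix.shape
    h, w = h - h % size, w - w % size
    trimmed = matrix[:h, :w]
    return trimmed.reshape(h // size, size, w // size, size).max(axis=(1, 3))

def fully_connected_classify(pooled, weights, bias):
    """A known fully connected layer: a weighted sum of the pooled values followed by
    a threshold that yields target (1) or no target (0)."""
    score = float(np.dot(pooled.ravel(), weights) + bias)
    return 1 if score > 0 else 0
```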
As previously noted, filter 1 may be associated with detection of a vehicle 100 (e.g., 150a,
As
While the above disclosure has been described with reference to exemplary embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from its scope. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the disclosure without departing from the essential scope thereof. Therefore, it is intended that the present disclosure not be limited to the particular embodiments disclosed, but will include all embodiments falling within the scope thereof.