Autonomously controlled vehicles rely heavily on computer vision capabilities developed through machine learning methods. For example, an onboard controller of an autonomous vehicle may use computer vision capabilities to accurately estimate the roadway and its surrounding drive environment. Using a specialized suite of onboard sensors, the vehicle controller is able to estimate the road surface for path planning and route execution, as well as potential obstacles such as other vehicles, pedestrians, curbs, sidewalks, trees, and buildings. The controller, upon receiving image data and other information from the onboard sensor suite, may apply machine learning techniques to estimate the roadway and drive environs, with this information thereafter used to control a drive event.
In general, the image data collected by the onboard sensor suite includes pixel data corresponding to drivable surface area or free space. Free space in a given image is typically estimated as a binary segmentation of the collected image, with image segmentation techniques being performed to separate the drivable surface area from the surface area of non-drivable surfaces. The use of color video alone for the purpose of detecting free space is suboptimal for various reasons. For instance, a paved road surface will often use similar paving materials and colors as other structures or features in the image, such as a curb or a sidewalk. As a result, one surface is often easily confused for another, which in turn may adversely affect performance of onboard free space estimation and path planning functions.
The solutions described herein are collectively directed toward improving the overall drive experience of a host vehicle using combined red, green, blue (“RGB”)-polarimetric data, with the host vehicle exemplified herein as an autonomously controlled motor vehicle. Other ground-based mobile platforms requiring the accurate detection of free space may also benefit from the present teachings, and therefore the disclosure is not limited to motor vehicles.
Free space detection functions performed as described herein involve locating and identifying drivable surface areas within an image frame in relation to the host vehicle, and thus are an essential input to automated path planning and decision making. As noted above, RGB data alone is suboptimal when attempting to identify free space in a collected image. While lidar is a possible data source for acquiring geometric information for the purpose of distinguishing a drivable surface from other surfaces or objects in an image, the incorporation of lidar sensors into the architecture of the host vehicle is a relatively expensive proposition. The present approach addresses this potential problem using combined RGB-polarimetric data as set forth in detail below.
In particular, an aspect of the disclosure includes a free space estimation and visualization system for use with a host vehicle. An embodiment of the system includes a camera and an electronic control unit (“ECU”). The camera is configured to collect RGB-polarimetric image data of drive environs of the host vehicle, including a potential driving path thereof. The ECU, which is in communication with the camera, is configured to receive the RGB-polarimetric image data from the camera and estimate an amount of free space in the potential driving path as estimated free space. This action includes processing the RGB-polarimetric image data via a run-time neural network. The ECU then executes a control action aboard the host vehicle in response to the estimated free space.
In one or more embodiments, the ECU calculates a feature set using the RGB-polarimetric image data, and then communicates the feature set to the run-time neural network as an input data set. The input data set in turn is characterized by an absence of lidar data.
The feature set may have six set elements determined as a concatenation of the RGB data, angle of linear polarization (“AoLP”) data, and degree of linear polarization (“DoLP”) data from the camera. For instance, the six set elements could include sin(2·AoLP), cos(2·AoLP), 2·DoLP−1, 2·R−1, 2·G−1, and 2·B−1.
The host vehicle in a possible implementation is a motor vehicle having a vehicle body, in which case the camera is connected to the vehicle body.
The ECU in one or more embodiments of the disclosure is in communication with a path planning control module of the host vehicle. In such an implementation, the path planning control module is configured to plan a drive path of the host vehicle as at least part of the control action.
An aspect of the disclosure includes the ECU being in communication with a display screen and configured to display a graphical representation of the estimated free space on the display screen.
A method is also disclosed herein for estimating free space aboard a host vehicle. The method in accordance with one or more embodiments includes collecting RGB data and lidar data of a target drive scene using an RGB camera and a lidar camera, respectively, and then generating, via a first neural network of a training computer, pseudo-labels as a ground truth of the target drive scene. The method also includes collecting RGB-polarimetric data via a camera, training a second neural network via the training computer using the RGB-polarimetric data and the pseudo-labels, and using the second neural network in an ECU of the host vehicle as a run-time neural network to estimate an amount of free space in a potential driving path of the host vehicle as estimated free space.
A host vehicle is also disclosed herein having a vehicle body, road wheels, a camera, and an ECU. The camera is configured to collect RGB-polarimetric image data of drive environs of the host vehicle, including a potential driving path thereof. The ECU is configured to receive the RGB-polarimetric image data from the camera, estimate an amount of free space in the potential driving path as estimated free space, including processing the RGB-polarimetric image data via a run-time neural network, and execute a control action aboard the host vehicle in response to the estimated free space.
The above features and advantages, and other features and advantages, of the present teachings are readily apparent from the following detailed description of some of the best modes and other embodiments for carrying out the present teachings, as defined in the appended claims, when taken in connection with the accompanying drawings.
The accompanying drawings, which are incorporated into and constitute a part of this specification, illustrate implementations of the disclosure and together with the description, serve to explain the principles of the disclosure.
The appended drawings are not necessarily to scale, and may present a simplified representation of various preferred features of the present disclosure as disclosed herein, including, for example, specific dimensions, orientations, locations, and shapes. Details associated with such features will be determined in part by the particular intended application and use environment.
The components of the disclosed embodiments may be arranged in a variety of configurations. Thus, the following detailed description is not intended to limit the scope of the disclosure as claimed, but is merely representative of possible embodiments thereof. In addition, while numerous specific details are set forth in the following description to provide a thorough understanding of various representative embodiments, embodiments may be capable of being practiced without some of the disclosed details. Moreover, in order to improve clarity, certain technical material understood in the related art has not been described in detail. Furthermore, the disclosure as illustrated and described herein may be practiced in the absence of an element that is not specifically disclosed herein.
The present automated solutions are operable for detecting free space in the drive environs or drive scene of an automated host vehicle. The scene is ascertained using combined visible spectrum/red-green-blue (“RGB”)-polarimetric image data, deep learning, and multi-modal data as described in detail herein. In general, the technical solutions presented herein utilize two neural networks in different capacities: (1) an offline/training model, and (2) an online/run-time model. The training neural network, referred to herein as first neural network NN1 for clarity, is used offline to generate pseudo-labels on a training dataset, with the training dataset including lidar data, RGB data, and polarimetric data. Inputs to the first neural network (NN1), however, are limited to RGB data and lidar data.
The pseudo-labels generated by the training model, i.e., neural network NN1, are then used, still offline, to train a second neural network (NN2). The second neural network (NN2) for its part receives RGB-polarimetric data as its data input. That is, unlike the first neural network (NN1) used solely offline for training purposes, the second neural network (NN2) is also used online, i.e., aboard the host vehicle during a drive operation. In doing so, the second neural network (NN2) does not receive or use lidar data to infer free space as part of the present strategy. As a result, the host vehicle as contemplated herein is characterized by an absence of a lidar sensor.
Referring to
The host vehicle 10 is equipped with an electronic control unit (“ECU”) 50. The ECU 50 in turn is used as part of a free space estimation and visualization (“FSEV”) system 11, a representative example embodiment of which is depicted in
Drive environs that are estimated and visualized as set forth herein may encompass drivable surfaces in proximity to the host vehicle 10, including paved, semi-paved, or unpaved roads, driveways, and parking lots, as well as excluded/non-drivable surfaces such as sidewalks, curbs, buildings, bodies of water, forests, and fields. More specifically, the ECU 50 is configured to use RGB-polarimetric data for the purpose of identifying free space in such drive environs, with an ultimate goal of improving the accuracy of drive path planning processes while reducing hardware costs associated with this task.
Further with respect to the exemplary host vehicle 10, the vehicle body 12 is connected to one or more road wheels 16, with a typical four wheel configuration shown in
The vehicle interior 14 of the host vehicle 10 may be equipped with one or more rows of vehicle seats 19, with two of the vehicle seats 19 illustrated in
The vehicle interior 14 is also equipped with various driver input devices, such as a steering wheel 25 and brake and accelerator pedals (not shown), etc. For the purpose of facilitating interaction of occupants of the host vehicle 10, the instrument panel 24 may be equipped with a center stack 26 having a display screen 260. In one or more embodiments, the host vehicle 10 may also be equipped with a heads-up display (“HUD”) 28, with the HUD 28 being configured for projecting information onto the windshield 22 as shown, or via a separate display screen (not shown) situated on the instrument panel 24. Either or both of the HUD 28 and the display screen 260 may ultimately display a graphical representation of the estimated free space, e.g., as a color view of the drive scene ahead of the host vehicle 10, with identified free space in the drive scene incorporated into the drive path planning function of the ECU 50.
Referring to
This multi-mode capability of the camera 20 is represented as a color pixel block 21 constructed of red (“R”), green (“G”), and blue (“B”) image pixels 210. Each image pixel 210 may have four or more constituent sub-pixels 210P, for a total of sixteen or more pixel calculation units, as appreciated in the art. Arrow IRGB represents the RGB color information contained in the 2D image data 23 as provided to the ECU 50 as part of the present strategy.
As noted above, the camera 20 is configured herein as an RGB-polarimetric imaging device. As such, the camera 20 also collects polarization data of the imaged drive scene, with the polarization data synchronized in time with the RGB data. In
As will be appreciated by those of ordinary skill in the art, polarimetry pertains to the measurement and interpretation of a polarization state of transverse waves, such as the light waves considered in the present application. Polarimetry is often used to study properties of interest in different materials, as well as the presence or absence of certain substances therein. For instance, ambient sunlight falling incident upon a road surface will reflect off of the surface to some extent. The ECU 50 thus determines the polarization state of the reflected portion of the incident sunlight and uses this information to inform decisions and control actions aboard the host vehicle 10 of
For example, the ECU 50 may use the polarization state to ascertain scene information, including the orientation and material properties of the surface, the viewing direction of the camera 20, the illumination direction of incident sunlight, etc. The polarization state in turn is measured by the camera 20 by passing reflected light through a set of polarizing filters (present on top of each of the sub-pixels 210P) of the camera 20, and thereafter measuring light intensity as the light is transmitted through the polarizing filter. The amount of transmitted light depends on the angle between the polarizing filter and the oscillation plane of the electric field of incident light, and thus can be measured and used by associated processing hardware of the camera 20 to determine the polarization state.
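By way of a non-limiting illustration, the recovery of a linear polarization state from filtered intensities may be sketched as follows. The sketch assumes four sub-pixel polarizing filters oriented at 0°, 45°, 90°, and 135°, which is a common arrangement but is not specified in the present disclosure, and it uses the standard linear Stokes parameters. The function name linear_polarization and its arguments are hypothetical.

```python
import math


def linear_polarization(i0, i45, i90, i135):
    """Estimate (AoLP in degrees, DoLP) from four filtered intensities.

    Assumption: the four intensities were measured behind linear polarizers
    oriented at 0, 45, 90, and 135 degrees (not stated in the source text).
    """
    # Linear Stokes parameters.
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # 0 deg versus 90 deg preference
    s2 = i45 - i135                     # 45 deg versus 135 deg preference

    if s0 <= 0.0:
        # No light reached the pixel; the polarization state is undefined.
        return 0.0, 0.0

    # Degree of linear polarization, clipped to 1 to absorb sensor noise.
    dolp = min(1.0, math.hypot(s1, s2) / s0)

    # Angle of linear polarization, wrapped into the 0 to 180 degree range.
    aolp = (0.5 * math.degrees(math.atan2(s2, s1))) % 180.0
    return aolp, dolp
```

For fully polarized light oscillating at 45°, the transmitted intensities would be proportional to 0.5, 1.0, 0.5, and 0.0, for which the sketch returns an AoLP of 45° and a DoLP of 1.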
In order to perform the disclosed estimation and visualization functions when identifying free space in the 2D image data 23 of
Various other hardware in communication with the ECU 50 may include, e.g., input/output circuit(s) and devices, including analog/digital converters and related devices that monitor inputs from sensors, with such inputs monitored at a preset sampling frequency or in response to a triggering event. Software, firmware, programs, instructions, control routines, code, algorithms, and similar terms mean controller-executable instruction sets including calibrations and look-up tables. Each controller executes control routine(s) to provide desired functions. Non-transitory components of the memory 54 are capable of storing machine-readable instructions in the form of one or more software or firmware programs or routines, combinational logic circuit(s), input/output circuit(s) and devices, signal conditioning and buffer circuitry and other components that can be accessed by one or more processors 52 to provide a described functionality.
The FSEV system 11 for the host vehicle 10 therefore includes the camera 20 in addition to the ECU 50 and its processor 52 and memory 54. The camera 20 is configured to collect RGB-polarimetric image data of drive environs of the host vehicle 10 as noted above, including a potential driving path of the host vehicle 10. The ECU 50 as described above is configured to receive the RGB-polarimetric image data from the camera 20, and to estimate an amount of free space in the potential driving path as estimated free space, including processing the RGB-polarimetric image data via a run-time neural network. The ECU 50 then executes a control action aboard the host vehicle 10 in response to the estimated free space.
In terms of possible control actions, the integral or connected computer architecture of the ECU 50 shown in
The ECU 50 thus transmits output signals (arrow CCO) to the HMI 60 and the PPM 62 as part of this process, with the output signals (arrow CCO) including the identified free space in the 2D image data 23 of
OFFLINE TRAINING: turning now to
In general, the training block 40T of
This estimated free space information is provided as an input to the training process of the second neural network 42 as pseudo-labels (P-L), with the second neural network 42 receiving only color image data (arrow IRGB) and polarimetric data (IP) as inputs within the training block 40T. Such data could be provided by a single integrated sensor, e.g., the camera 20 of
The training stage of
Block B102 (“Model Training”) of
Block B103 (“Pseudo-Label Generation”) includes generating pseudo-labels (P-L) for use in training the second neural network 42 of
Pseudo-labeling as performed in block B103 starts by estimating free space on the data set D2a. This occurs via the first neural network 41. A small set of these estimations is then manually refined, and the first neural network 41 is fine-tuned on its own estimations and the refined labels to improve accuracy. This process is iterated several times until the automatic estimations of the first neural network 41 are sufficiently accurate, e.g., relative to a predetermined objective standard. The estimations are then regarded as the pseudo-labels (P-L).
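The iterative refinement described above may be sketched, purely for illustration, as the following loop. The function generate_pseudo_labels and the callables refine_subset, fine_tune, and is_good_enough are hypothetical placeholders for the manual refinement step, the fine-tuning step, and the predetermined objective standard, respectively; none of these names, nor the max_rounds limit, appears in the present disclosure.

```python
def generate_pseudo_labels(model, frames, refine_subset, fine_tune,
                           is_good_enough, max_rounds=10):
    """Iterate estimate -> refine -> fine-tune until the estimates suffice.

    All callables are hypothetical stand-ins:
      model(frame)                    -> free space estimate for one frame
      refine_subset(frames, est)      -> manually corrected labels for a
                                         small subset of the frames
      fine_tune(model, frames, est,
                refined)              -> updated model
      is_good_enough(est)             -> True once the objective standard
                                         is met
    """
    # Initial automatic estimations of the first network on the data set.
    estimates = [model(frame) for frame in frames]

    for _ in range(max_rounds):
        if is_good_enough(estimates):
            break
        # A small set of the estimations is manually refined.
        refined = refine_subset(frames, estimates)
        # The network is fine-tuned on its own estimations plus the
        # refined labels, then re-run on the full data set.
        model = fine_tune(model, frames, estimates, refined)
        estimates = [model(frame) for frame in frames]

    # The final estimations are regarded as the pseudo-labels.
    return model, estimates
```

The max_rounds bound is a design choice of the sketch to guarantee termination when the stopping criterion is never met.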
Block B104 (“Model Training”) of
RUN-TIME:
As a representation of the polarimetric data, the ECU 50 may use the AoLP, which is in the range of 0° to 180°, and the DoLP, which is in the range of 0 to 1. The feature representation used as an input to the second neural network 42 may be a concatenation of AoLP, DoLP, and the RGB data. Thus, a representative six-element feature set (F)=[F1, F2, F3, F4, F5, F6] could be calculated, where F1=sin(2·AoLP), F2=cos(2·AoLP), F3=2·DoLP−1, F4=2·R−1, F5=2·G−1, and F6=2·B−1. The data in such a feature set (F) would fall in the range [−1, 1]. Moreover, because the AoLP enters the feature set (F) only through sin(2·AoLP) and cos(2·AoLP), the circular ambiguity in the AoLP, in which 0° is equivalent to 180°, is surmounted.
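A minimal per-pixel sketch of the above six-element feature set (F) is provided below for illustration. The sketch assumes that the R, G, and B values have already been normalized to the range of 0 to 1, consistent with the stated output range, and that the AoLP is supplied in degrees. The function name feature_set is hypothetical.

```python
import math


def feature_set(r, g, b, aolp_deg, dolp):
    """Return [F1, ..., F6] for one pixel.

    Assumptions: r, g, b, and dolp lie in [0, 1]; aolp_deg lies in
    [0, 180] degrees.
    """
    two_aolp = math.radians(2.0 * aolp_deg)
    return [
        math.sin(two_aolp),  # F1 = sin(2*AoLP)
        math.cos(two_aolp),  # F2 = cos(2*AoLP)
        2.0 * dolp - 1.0,    # F3 = 2*DoLP - 1
        2.0 * r - 1.0,       # F4 = 2*R - 1
        2.0 * g - 1.0,       # F5 = 2*G - 1
        2.0 * b - 1.0,       # F6 = 2*B - 1
    ]
```

In this sketch, AoLP values of 0° and 180° yield the same F1 and F2 to within floating-point rounding, which illustrates how the doubled angle removes the circular ambiguity.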
The feature set (F) is communicated to the run-time neural network, i.e., the second neural network 42, as an input data set that is characterized by an absence of lidar data. That is, the second neural network 42 shown in
Referring now to
As set forth above, when incident light from the sun 71 reflects off of a surface, the reflected light has an identifiable polarized state that the ECU 50 of
When communicating with a driver or other passengers of the host vehicle 10 of
In general, therefore, a method for use with the FSEV system 11 for the host vehicle 10 includes collecting RGB and lidar data of a target drive scene using an RGB camera and a lidar camera, respectively, generating, via the first neural network 41, pseudo-labels (P-L) as a ground truth of the target drive scene, and collecting RGB-polarimetric data via the camera 20. The method also includes training the second neural network 42 using the RGB-polarimetric data and the pseudo-labels, using the second neural network 42 in the ECU 50 of the host vehicle 10 as a run-time neural network to estimate an amount of free space in a potential driving path of the host vehicle 10 as estimated free space (FS), including processing additional RGB-polarimetric image data via the second neural network 42, and executing a control action aboard the host vehicle 10 in response to the estimated free space (FS).
As will be appreciated by those skilled in the art having the benefit of the foregoing disclosure, in order to perform the method, the host vehicle 10 of
The ECU 50 thereafter estimates free space via the trained second neural network 42, with the RGB-polarimetric data as its input. Thus, the ECU 50 and the accompanying methodology as described above are characterized by an absence of the use of lidar data during run-time execution. As a result, production models of the host vehicle 10 of
The detailed description and the drawings or figures are supportive and descriptive of the present teachings, but the scope of the present teachings is defined solely by the claims. While some of the best modes and other embodiments for carrying out the present teachings have been described in detail, various alternative designs and embodiments exist for practicing the present teachings defined in the appended claims.