This disclosure generally relates to a road-model-definition system, and more particularly relates to a system that determines a transformation used to map a lane-marking present in an image from a camera onto a travel-surface model that is based on lidar data to obtain a 3D marking-model of the lane-marking.
An accurate model of the upcoming travel-surface (e.g. a roadway) in front of a host-vehicle is needed for good performance of various systems used in automated vehicles including, for example, an autonomous vehicle. It is known to model lane-markings of a travel-surface under the assumption that the travel-surface is planar, i.e. flat and level. However, the travel-surface is frequently crowned, meaning that the elevation of the travel-surface decreases toward the road-edge, which is good for drainage during rainy conditions. Also, there is frequently a vertical curvature component which is related to the change in pitch angle of the travel-surface (e.g. bending up-hill or down-hill) as the host-vehicle moves along the travel-surface. Under these non-planar conditions, lane-marking estimates from a vision system that assumes a planar travel-surface lead to an inadequately accurate road-model. The travel-surface may also be banked, or inclined, for higher speed turns such as freeway exits; such a surface, while planar, is not level, and its inclination is important for vehicle control.
Described herein is a road-model-definition system that uses an improved technique for obtaining a three-dimensional (3D) model of a travel-lane using a lidar and a camera. The 3D road-model incorporates the components of crown and vertical curvature of a travel-surface, along with vertical and/or horizontal curvature of a lane-marking detected in an image from the camera. The 3D model permits more accurate estimation of pertinent features of the environment, e.g. the position of preceding vehicles relative to the travel-lane, and more accurate control of a host-vehicle in an automated driving setting, e.g. better informing the steering controller of the 3D shape of the travel-lane.
In accordance with one embodiment, a road-model-definition system suitable for an automated-vehicle is provided. The system includes a camera, a lidar-unit, and a controller. The camera is used to provide an image of an area proximate to a host-vehicle. The lidar-unit is used to provide a point-cloud descriptive of the area. The controller is in communication with the camera and the lidar-unit. The controller is configured to determine an image-position of a lane-marking in the image, select ground-points from the point-cloud indicative of a travel-surface, determine coefficients of a three-dimensional (3D) road-model based on the ground-points, and determine a transformation to map the lane-marking in the image onto the travel-surface based on the image-position of the lane-marking and the 3D road-model, and thereby obtain a 3D marking-model.
Further features and advantages will appear more clearly on a reading of the following detailed description of the preferred embodiment, which is given by way of non-limiting example only and with reference to the accompanying drawings.
The present invention will now be described, by way of example with reference to the accompanying drawings, in which:
The system 10 includes, but is not limited to, a camera 14 used to provide an image 16 of an area 18 proximate to a host-vehicle 12, and a lidar-unit 20 used to provide a point-cloud 22 (i.e. a collection of coordinates of lidar detected points, as will be recognized by those in the art) descriptive of the area 18. While the camera 14 and the lidar-unit 20 are illustrated in a way that suggests that they are co-located, possibly in a single integrated unit, this is not a requirement. It is recognized that co-location would simplify aligning the image 16 and the point-cloud 22. However, several techniques are known for making such an alignment of data when the camera 14 and the lidar-unit 20 are located on the host-vehicle 12 at spaced-apart locations. It is also not a requirement that the fields-of-view of the camera 14 and the lidar-unit 20 be identical. That is, for example, the camera 14 may have a wider field-of-view than the lidar-unit 20, provided that both fields-of-view include or cover the area 18, which may be generally characterized as forward of the host-vehicle 12.
The system 10 includes a controller 24 in communication with the camera 14 and the lidar-unit 20. The controller 24 may include a processor (not specifically shown) such as a microprocessor and/or other control circuitry such as analog and/or digital control circuitry including an application specific integrated circuit (ASIC) for processing data as should be evident to those in the art. The controller 24 may include memory (not specifically shown), including non-volatile memory, such as electrically erasable programmable read-only memory (EEPROM) for storing one or more routines, thresholds, and captured data. The one or more routines may be executed by the processor to perform steps for determining a 3D model of the area 18 about the host-vehicle 12 based on signals received by the controller 24 from the camera 14 and the lidar-unit 20, as described in more detail elsewhere herein.
The controller 24 is configured to determine an image-position 28 of a lane-marking 26 in the image 16. By way of example and not limitation, it may be convenient to indicate the image-position 28 of the lane-marking 26 by forming a list of pixels or coordinates where the lane-marking 26 is detected in the image 16. The list may be organized in terms of the i-th instance where, for each lane marker, the lane-marking 26 is detected and that list may be described by
L(i)=[u(i), v(i)] for i=1:N Eq. 1,
where u(i) is the vertical-coordinate and v(i) is the horizontal-coordinate for the i-th instance of coordinates indicative of the image-position 28 of the lane-marking 26 being detected in the image 16. If the camera 14 has a resolution of 1024 by 768 for a total of 786,432 pixels, the number of entries in the list may be unnecessarily large, i.e. the resolution may be unnecessarily fine. As one alternative, a common approach is to list the pixel position (u, v) of the center or mid-line of the detected lane-marking 26, so the list would determine or form a line along the center or middle of the lane-marking 26. As another alternative, the pixels may be grouped into, for example, twelve pixels per pixel-group (e.g. four×three pixels), so the number of possible entries is 65,536, which may be more manageable for reduced capability instances of the controller 24. It should be apparent that typically the lane-marking 26 occupies only a small fraction of the image 16. As such, the actual number of pixels or pixel-groups where the lane-marking 26 is detected will be much less than the total number of pixels or pixel-groups in the image 16. An i-th pixel or pixel-group may be designated as indicative of, or overlying, or corresponding to the lane-marking 26 when, for example, half or more than half of the pixels in a pixel-group indicate the presence of the lane-marking 26.
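The pixel-group list described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name lane_pixel_groups, the boolean detection mask used as input, and the reading of "four×three" as four pixels wide by three pixels tall are all hypothetical and are not taken from this disclosure.

```python
import numpy as np

def lane_pixel_groups(mask, group_rows=3, group_cols=4, min_fraction=0.5):
    """Return an (N, 2) array of [u, v] pixel-group indices, in the spirit of Eq. 1.

    mask is a 2D boolean array (rows = vertical, columns = horizontal) that is
    True where a hypothetical detector flagged a lane-marking pixel.
    """
    rows, cols = mask.shape
    # Assumption: drop any partial groups at the bottom/right image border.
    rows -= rows % group_rows
    cols -= cols % group_cols
    blocks = mask[:rows, :cols].reshape(
        rows // group_rows, group_rows, cols // group_cols, group_cols
    )
    # Fraction of lane-marking pixels inside each pixel-group.
    fraction = blocks.mean(axis=(1, 3))
    # A group is listed when half or more of its pixels indicate the marking.
    u_idx, v_idx = np.nonzero(fraction >= min_fraction)
    return np.column_stack([u_idx, v_idx])
```

For a 1024 by 768 mask, this grouping yields a 256 by 256 grid, i.e. the 65,536 possible entries mentioned above.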
It has been observed that the difference between the true 3-D positions of lane-markers on a roadway and an assumed position that is based on a flat, zero height, ground plane can be significant in practice due to vertical-curvature (e.g. the roadway bending up-hill or down-hill), and/or horizontal curvature or inclination (road crowning, high speed exit ramps, etc.), which can lead to compromises with respect to precise control of the host-vehicle 12. For example, note that the lines of the projected-marker 34 are illustrated as diverging as the longitude value increases. This is because the actual roadway where the lane-marker actually resides is bending upward, i.e. has positive vertical-curvature. As such, when the imaged-marker 32 is projected onto the zero-height-plane 36 because a flat road is assumed, the lack of compensation for vertical-curvature causes the projected-marker 34 to diverge.
Projecting the imaged-marker 32 onto a 3D model of the roadway produces the modeled-marker 38. Without loss of generality, this projection may be performed by assuming an idealized pin-hole model for the camera 14. Assuming that the camera 14 is located at Cartesian coordinates (x, y, z) of [0, 0, hc], where hc is the height of the camera 14 above the travel-surface 42, and the camera 14 has a focal length of f, the pin-hole camera model projects an i-th pixel or pixel-group from (x, y, z) in relative world coordinates to (u, v) in image coordinates using
u(i)=f*{z(i)−hc}/x(i) Eq. 2,
and
v(i)=f*y(i)/x(i) Eq. 3.
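As an illustration of Eq. 2 and Eq. 3, the following Python sketch projects a world point into the image and, for comparison, back-projects a pixel under the flat, zero-height assumption discussed earlier. The function names and the numeric values of f and hc used in the note below are hypothetical, chosen only for the example.

```python
import numpy as np

def project_to_image(x, y, z, f, hc):
    """Pin-hole projection of Eq. 2 and Eq. 3 (camera at [0, 0, hc])."""
    x = np.asarray(x, dtype=float)
    u = f * (np.asarray(z, dtype=float) - hc) / x   # Eq. 2, vertical coordinate
    v = f * np.asarray(y, dtype=float) / x          # Eq. 3, horizontal coordinate
    return u, v

def back_project_flat(u, v, f, hc):
    """Invert Eq. 2 and Eq. 3 assuming the point lies on the plane z = 0.

    u must be non-zero (a pixel on the horizon row has no ground intersection).
    """
    x = -f * hc / np.asarray(u, dtype=float)
    y = np.asarray(v, dtype=float) * x / f
    return x, y
```

With the assumed values f = 1000 and hc = 1.5, a marker point at (20, 1.8, 0.5), i.e. on a surface that has risen by 0.5, projects to (u, v) = (−50, 90), and the flat-plane back-projection places it at (30, 2.7). The lateral offset is overstated, which is the divergence of the projected-marker 34 described above.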
Once the ground-points 44 that define the travel-surface 42 are selected, the controller 24 is configured to determine the 3D road-model 40 of the travel-surface 42 based on the ground-points 44. Given a set of M lidar ground measurements, where r(k), φ(k), and θ(k) are the range, elevation angle and azimuth angle respectively of the k-th lidar measurement, a road surface model can be fit to the measurements. Typical models include: plane, bi-linear, quadratic, bi-quadratic, cubic, and bi-cubic, and may further be tessellated patches of such models. While a number of surface-functions are available to base the 3D road-model 40 upon, analysis suggests that it may be preferable if the 3D road-model 40 corresponds to the biquadratic model 50 of the ground-points 44, which may be represented by
x(k)=r(k)*cos [φ(k)]*cos [θ(k)]
y(k)=r(k)*cos [φ(k)]*sin [θ(k)]
z(k)=r(k)*sin [φ(k)]+hl Eq. 4,
where ‘hl’ is the height of the lidar-unit 20 above the zero-height-plane 36. z(k) is determined using a biquadratic model
z(k)=a1+a2*x(k)+a3*y(k)+a4*x(k)^2+a5*y(k)^2+a6*x(k)*y(k)+a7*x(k)^2*y(k)+a8*x(k)*y(k)^2+a9*x(k)^2*y(k)^2 Eq. 5,
where a9 is assumed to be zero (a9=0) in order to simplify the model. The 3D road-model 40 is then determined by an estimated set of coefficients, a, for the model that best fits the measured data. A direct least squares solution is then given by:
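The coordinate conversion of Eq. 4 and a least-squares fit of the Eq. 5 coefficients (with a9=0) might be sketched in Python as follows. This is a minimal sketch and not necessarily the exact formulation intended here: the function names are hypothetical, and NumPy's general-purpose np.linalg.lstsq solver is used as a stand-in for the direct solution.

```python
import numpy as np

def lidar_to_cartesian(r, phi, theta, hl):
    """Eq. 4: range r, elevation phi, azimuth theta (radians) to x, y, z."""
    r, phi, theta = (np.asarray(q, dtype=float) for q in (r, phi, theta))
    x = r * np.cos(phi) * np.cos(theta)
    y = r * np.cos(phi) * np.sin(theta)
    z = r * np.sin(phi) + hl          # hl: lidar height above the zero-height plane
    return x, y, z

def design_matrix(x, y):
    """One row per ground-point holding the eight monomials of Eq. 5 (a9 = 0)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.column_stack(
        [np.ones_like(x), x, y, x**2, y**2, x * y, x**2 * y, x * y**2]
    )

def fit_road_model(x, y, z):
    """Least-squares estimate of [a1, ..., a8] from ground-points (x, y, z)."""
    a_hat, *_ = np.linalg.lstsq(design_matrix(x, y), np.asarray(z, dtype=float), rcond=None)
    return a_hat
```

Because Eq. 5 is linear in the coefficients, the fit reduces to an ordinary linear least-squares problem, and at least eight well-spread ground-points are needed for it to be determined.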
Now, given the lane marker position of the imaged-marker 32 in the camera image plane, and the 3D road-model 40 of the travel-surface 42, the two can be fused together to get the estimated 3-D positions of the lane marker points. This can be done by solving a non-linear equation of the camera projection model constrained by the 3D road-model 40. The preferred embodiment of the road model is the biquadratic model 50, given by Eq. 4, Eq. 5, and Eq. 6. In Eq. 7 below, the coefficients âi are the coefficients that were estimated by solving Eq. 6. The points (uk, vk) are the image-plane (pixel) positions of detected lane markers. The corresponding world coordinates of the lane marker detections, i.e. (xk, yk, zk), are then solved for, where
Eliminating xk and yk from this system of equations yields a cubic-polynomial equation in zk (Eq. 8). This equation can be solved with the closed-form solution for cubic polynomials, by iterative root-finding methods such as the secant method, or by optimization techniques:
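(a7*f^2*vk+a8*f*vk^2)*zk^3+(−3*a7*hc*f^2*vk+a4*uk*f^2−3*a8*hc*f*vk^2+a6*uk*f*vk+a5*uk*vk^2)*zk^2+(3*a7*f^2*hc^2*vk+3*a8*f*hc^2*vk^2−2*a4*f^2*hc*uk−2*a6*f*hc*uk*vk+a2*f*uk^2−2*a5*hc*uk*vk^2−uk^3+a3*uk^2*vk)*zk+(−a7*f^2*hc^3*vk+a4*f^2*hc^2*uk−a8*f*hc^3*vk^2+a6*f*hc^2*uk*vk−a2*f*hc*uk^2+a5*hc^2*uk*vk^2−a3*hc*uk^2*vk+a1*uk^3)=0 Eq. 8.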
Once Eq. 8 has been solved for zk, the values of xk and yk can then be computed (by inverting Eq. 2 and Eq. 3) to provide the fused/reconstructed 3-D positions of the lane markers.
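The fusion step can be illustrated with the Python sketch below. It performs the same elimination that leads to Eq. 8, but writes the cubic in the shifted variable w = z − hc, obtained by substituting x = f*w/u and y = v*w/u (the inverses of Eq. 2 and Eq. 3) into Eq. 5 with a9=0 and multiplying through by u^3. The function name, the use of np.roots, and the rule for choosing among multiple real roots (the nearest intersection in front of the camera) are assumptions made for this example, not details stated in this disclosure.

```python
import numpy as np

def lane_point_on_road(u, v, a, f, hc):
    """Intersect the camera ray through pixel (u, v) with the road-model.

    a holds the eight fitted coefficients [a1, ..., a8] of Eq. 5 (a9 = 0).
    Returns the reconstructed world point (x, y, z).
    """
    if u == 0:
        raise ValueError("u must be non-zero (horizon row has no solution)")
    a1, a2, a3, a4, a5, a6, a7, a8 = a
    # Cubic in w = z - hc: c3*w^3 + c2*w^2 + c1*w + c0 = 0.
    c3 = a7 * f**2 * v + a8 * f * v**2
    c2 = u * (a4 * f**2 + a6 * f * v + a5 * v**2)
    c1 = u**2 * (a2 * f + a3 * v) - u**3
    c0 = u**3 * (a1 - hc)
    roots = np.roots([c3, c2, c1, c0])          # leading zeros are handled
    w = roots[np.abs(roots.imag) < 1e-9].real   # keep real roots only
    x = f * w / u                               # inverse of Eq. 2
    valid = x > 0                               # in front of the camera
    if not np.any(valid):
        raise ValueError("no real intersection in front of the camera")
    # Assumption: the camera sees the nearest intersection along the ray.
    i = np.argmin(np.where(valid, x, np.inf))
    y = v * w[i] / u                            # inverse of Eq. 3
    z = w[i] + hc
    return x[i], y, z
```

When all coefficients are zero the cubic degenerates to a linear equation and the result coincides with the flat, zero-height projection.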
Referring again to
Given the positions of the projected-marker 34 from Eq. 8 and Eq. 9, the points for each lane marker can then be converted to a more compact representation of the curves. In a preferred embodiment, each lane marker is represented with two 2-D cubic-polynomials 52 that independently model the horizontal and vertical curvatures. That is, the 3D road-model 40 characterizes the lane-marking 26 using two 2D cubic-polynomials 52 that are based on a horizontal-curvature 54 and a vertical-curvature 56 of the lane-marking 26. For example, given the 3-D reconstructed points of the left lane marker {xk, yk, zk}left, the horizontal-curvature is represented by
and the vertical-curvature is represented by
That is, the controller 24 is configured to determine a transformation 60 that maps the lane-marking 26 in the image 16 onto the travel-surface 42 based on the image-position 28 of a lane-marking 26 and the 3D road-model 40, and thereby obtain the 3D lane model 58.
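The compact two-polynomial representation of a lane marker described above might be computed as in the following Python sketch. The parameterization is an assumption made for illustration: both cubics are taken as functions of the longitudinal distance x, with y(x) capturing the horizontal-curvature and z(x) the vertical-curvature, and the function name is hypothetical.

```python
import numpy as np

def fit_marker_curves(x, y, z):
    """Fit two independent cubic polynomials to reconstructed marker points.

    Returns (horizontal, vertical), each an array of four coefficients
    ordered from the cubic term down to the constant term, for
    y(x) (horizontal-curvature) and z(x) (vertical-curvature).
    """
    x = np.asarray(x, dtype=float)
    horizontal = np.polyfit(x, np.asarray(y, dtype=float), 3)
    vertical = np.polyfit(x, np.asarray(z, dtype=float), 3)
    return horizontal, vertical
```

The fitted curves can be evaluated at any longitudinal distance with np.polyval, e.g. np.polyval(horizontal, 25.0) for the lateral position of the marker 25 units ahead. At least four reconstructed points per marker are needed for each cubic to be determined.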
Accordingly, a road-model-definition system (the system 10), a controller 24 for the system 10, and a method of operating the system 10 are provided. The system 10 provides for the fusing of an image 16 of a lane-marking 26 with a 3D road-model 40 of a travel-surface 42 to provide a 3D lane-model of the lane-marking 26, so that any substantive error caused by a horizontal-curvature 54 and/or a vertical-curvature 56 of the travel-surface 42 is accounted for rather than assuming that the travel-surface 42 is flat.
While this invention has been described in terms of the preferred embodiments thereof, it is not intended to be so limited, but rather only to the extent set forth in the claims that follow.