An embodiment relates to augmented curb and bump detection.
Advanced Driving Assistance Systems (ADAS) are often viewed as an intermediate stage before reaching full autonomous driving. ADAS functionality integrates various active safety features. A goal is to alert the driver of possible danger and to prevent a collision, whether with a pedestrian, another vehicle, or an object (e.g., pedestrian/vehicle detection, lane departure warning, and lane keeping assist). Many advanced systems of the vehicle (e.g., automatic cruise control, lane change demand, parking assist) take partial control of the vehicle, such as autonomously modifying the speed and/or steering while taking into account the surrounding environment.
Curb detection contributes to accurate vehicle positioning in urban areas. The detection of curbs in front/rear of the vehicle is crucial for applications such as parking assist/autonomous parking. The accurate localization and range estimation of the curb is passed to the vehicle control system, which in turn smoothly maneuvers the vehicle so as to avoid possible shock with the front/rear curb or to avoid curbs around a curvature in the road.
The challenge in extracting curbs and bumps from images lies in their small size (e.g., approximately 10-20 cm high). While the curb's three dimensional (3D) shape is fairly standard and can be modeled as a two dimensional (2D) step-function, its small size makes it difficult to extract from a single image. Consequently, most current approaches rely on an active sensor (Lidar) or a stereo-camera, which makes it possible to directly extract 3D information and thereby simplifies the processing. Such techniques often use various road marks (e.g., soft shoulder, curbs, and guardrails), based on strong prior knowledge about road scenes.
While some systems rely on various techniques such as deep learning, hierarchical probabilistic graphical models, integration of prior knowledge, or multi-sensor fusion, none of these general frameworks deals explicitly with curb detection, because the small size and non-distinctive pattern of curbs require a dedicated approach to be successful.
An advantage of an embodiment is the use of a single monocular image capture device to capture an image for detecting curbs and other similar structures. The first stage is directed at detecting and localizing candidate curb regions in the image using a machine learning classifier combined with a robust image appearance descriptor and a sliding window strategy. A Histogram of Gradient is selected as the image descriptor. The sliding window classifier returns a score per window, indicating the region where the curb is most likely to appear in the image. Parallel edge lines are then extracted to localize the curb borders within the candidate region. The result is a pair of curves that delineate the curb in the image. The second stage computes the geometry of the candidate curb. Given a few assumptions about the scene and the camera, such as the road surface being flat, a camera-to-curb or camera-to-bump distance can be estimated from the image. For curb detection, it is further assumed that the curb is substantially orthogonal to the ground plane, so that the curb's height can be estimated. Lastly, temporal information is applied to filter out false detections. Candidate curb lines detected in the current frame will appear in the next frame at a position determined only by the camera motion. A Kalman filter (or any other tracking method) can then be defined to track the pair of lines over time and potentially to remove pairs of lines that are inconsistent over time.
An embodiment contemplates a method of detecting a curb. An image of a path of travel is captured by a monocular image capture device mounted to a vehicle. A feature extraction technique is applied by a processor to the captured image. A classifier is applied to the extracted features to identify a candidate region in the image. Curb edges are localized by the processor in the candidate region of the image by extracting edge points. Candidate curbs are identified as a function of the extracted edge points. A pair of parallel curves is selected representing the candidate curb. A range from image capture device to the candidate curb is determined. A height of the candidate curb is determined. A vehicle application is enabled to assist a driver in maneuvering a vehicle utilizing the determined range and depth of the candidate curb.
The following detailed description is meant to be illustrative in understanding the subject matter of the embodiments and is not intended to limit the embodiments of the subject matter or the application and the uses of such embodiments. Any use of the word “exemplary” is intended to be interpreted as “serving as an example, instance, or illustration.” Implementations set forth herein are exemplary and are not meant to be construed as preferred or advantageous over other implementations. The descriptions herein are not meant to be bound by any expressed or implied theory presented in the preceding background, detailed description or descriptions, brief summary or the following detailed description.
Techniques and technologies may be described herein in terms of functional and/or logical block components, and with reference to symbolic representations of operations, processing tasks, and functions that may be performed by various computing components or devices. Such operations, tasks, and functions are sometimes referred to as being computer-executed, computerized, software-implemented, or computer-implemented. It should be appreciated that the various block components shown in the figures may be realized by any number of hardware, software, and/or firmware components configured to perform the specified functions. For example, an embodiment of a system or a component may employ various integrated circuit components (e.g., memory elements, digital signal processing elements, logic elements, look-up tables, or the like), which may carry out a variety of functions under the control of one or more microprocessors or other control devices.
When implemented in software, various elements of the systems described herein are essentially the code segments or computer-executable instructions that perform the various tasks. In certain embodiments, the program or code segments are stored in a tangible processor-readable medium, which may include any medium that can store or transfer information. Examples of a non-transitory and processor-readable medium include an electronic circuit, a microcontroller, an application-specific integrated circuit (ASIC), a semiconductor memory device, a ROM, a flash memory, an erasable ROM (EROM), a floppy diskette, a CD-ROM, an optical disk, a hard disk, or the like.
The system and methodology described herein can be utilized to identify curbs and other like structures for driver awareness assist systems, semi-autonomous driving systems, or autonomous driving systems. While the approach and methodology are described below with respect to vehicle applications, one of ordinary skill in the art appreciates that an automotive application is merely exemplary, and that the concepts disclosed herein may also be applied to any other suitable systems and boundary detections that include, but are not limited to, manufacturing facilities with autonomously driven vehicles or robots, or general industrial applications.
The term “vehicle” as described herein can be construed broadly to include not only a passenger automobile, but any other vehicle including, but not limited to, wheelchairs, rail systems, planes, off-road sport vehicles, robotic vehicles, motorcycles, trucks, sports utility vehicles (SUVs), recreational vehicles (RVs), marine vessels, aircraft, farming vehicles, and construction vehicles.
There is shown in
The curb detection system is equipped with at least one monocular image capture device mounted to the vehicle 12. A first monocular image capture device 14 may be mounted to a front of the vehicle 12. A second monocular image capture device 16 may be mounted to a rear of the vehicle 12. The first image capture device 14 and the second image capture device 16 are in communication with a processing unit 18 for receiving and processing images captured by the image capture devices for curb detection.
The processor 18 is coupled to the first monocular image capture device 14 and the second monocular image capture device 16. Alternatively, the processor 18 may be a shared processor of another device. The processor 18 identifies the curbs based on the techniques described herein. A memory 20 may be used to store data obtained by the monocular image capture devices. Moreover, the memory 20 may store other types of data that is used by the processor 18 during curb detection analysis.
The vision based curb detection system may include output devices 22 that include vehicle applications including, but not limited to, collision avoidance applications, clear path detection applications, object detection applications, vehicle motion applications, and autonomous vehicle navigation systems. The vision based curb detection system may further include display devices 24 for assisting the driver by enhancing such curbs as displayed on the display device 24.
The respective technique utilizes a model that identifies a road curb using both visual cues and geometric characteristics of images obtained from the monocular camera. In addition, temporal information may be utilized by exploiting the redundancy of observations between consecutive frames. The main underlying assumption of the model is that the road surface is flat and that the curb is approximately orthogonal to the road plane.
In step 30, an image is captured at a first instance of time by one of the monocular image capture devices. The monocular image capture device is focused in a direction that the vehicle is traveling.
In step 31, the input image is input to a processor for processing. Feature extraction is performed to detect a curb in the image. An exemplary process includes a Histogram of Gradient (HOG) which is used as a feature descriptor to process the image for the purpose of object detection. The image is divided into small connected regions (e.g., patches), and for those pixels within each of the regions, a histogram of gradient directions is compiled. An advantage of utilizing HOG is that the descriptor operates on local patches and is invariant to photometric transformations. HOG is extracted at two scales using a two-level pyramid representation of each image. Examples of curbs and non-curb patches of HOG features are depicted in
In step 32, the extracted descriptor is input to a classifier including, but not limited to, a binary linear support vector machine (SVM), to classify each patch. The SVM model is learned from a set of positive and negative patch samples (i.e., patches including or not including curbs) extracted from a training dataset. At the testing stage, the SVM is applied and curb candidate patches are detected. Overlapping candidate patches accumulate votes for a curb/non-curb category. A single connected candidate region in each frame is then extracted by applying a threshold on the voting score of each patch.
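As an illustration of steps 31 and 32, the following is a minimal sketch and not the embodiment's implementation. It assumes a simplified single-cell, single-scale gradient-orientation histogram in place of the full two-scale HOG descriptor, and a linear classifier whose `weights` and `bias` have already been learned elsewhere. The function names, window size, stride, and vote threshold are hypothetical.

```python
import numpy as np

def hog_descriptor(patch, n_bins=9):
    """Simplified descriptor: one magnitude-weighted histogram of unsigned
    gradient orientations over the whole patch, L2-normalized."""
    gy, gx = np.gradient(patch.astype(float))       # derivatives along rows, columns
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)         # unsigned orientation in [0, pi)
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
    return hist / (np.linalg.norm(hist) + 1e-9)

def candidate_region(image, weights, bias, win=16, stride=8, vote_threshold=2):
    """Slide a window over the image, score each patch with a linear
    classifier, let positive patches vote for the pixels they cover, and
    threshold the vote map to obtain a candidate mask."""
    votes = np.zeros(image.shape, dtype=int)
    for r in range(0, image.shape[0] - win + 1, stride):
        for c in range(0, image.shape[1] - win + 1, stride):
            score = hog_descriptor(image[r:r + win, c:c + win]) @ weights + bias
            if score > 0:                           # patch classified as curb
                votes[r:r + win, c:c + win] += 1
    return votes >= vote_threshold

# Synthetic example: a horizontal intensity step stands in for a curb edge.
image = np.zeros((32, 64))
image[16:, :] = 1.0
weights = np.zeros(9)
weights[4] = 1.0          # hand-set stand-in for a learned SVM weight vector
mask = candidate_region(image, weights, bias=-0.5)
```

Overlapping windows make the vote map more tolerant of a single misclassified patch. Extracting one connected region from `mask` would need an additional connected-component step that is omitted here.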
Once candidate regions are determined, the curb edges are localized. Borders of the curb (e.g., the top and bottom discontinuities of the 2D step function) enable the routine to delineate the exact area of the curb. To localize these borders, in step 33, a standard edge point/line detection algorithm is applied within the candidate region after the image is smoothed with a Gaussian filter. An example of extracting edge points is shown in
In step 34, based on the strong edge points, up to five curves c1, c2, c3, c4, c5 (i.e., second order polynomials) are fit using a sequential RANSAC algorithm as shown in
In step 35, pairs of parallel curves whose spacing falls within a predetermined range (e.g., 10-25 cm) are retained. Among the possible extracted curves, a most probable pair is selected based on a simple heuristic, and each curve of the pair is labeled as either a bottom edge or a top edge, so as to form couples of points $\{(p_i^b, p_i^t)\}_{i=1 \ldots N}$. An example is shown in
Once the parallel curves that represent the upper and lower boundaries of the curb are retained, the geometry of the curb is determined.
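Before the geometry is computed, the curve fitting of step 34 and the pair selection of step 35 can be illustrated with a minimal sketch. The description does not specify the selection heuristic, so the code assumes one: among pairs whose mean vertical gap lies in a given range, the pair whose gap varies least along the curve is kept. The gap is measured in pixels here rather than centimeters. The function names, tolerances, and iteration counts are hypothetical.

```python
import numpy as np

def sequential_ransac(points, max_curves=5, n_iter=200, tol=1.0,
                      min_inliers=20, seed=0):
    """Fit up to max_curves parabolas y = a*x^2 + b*x + c, removing the
    inliers of each accepted curve before searching for the next one."""
    rng = np.random.default_rng(seed)
    remaining = np.asarray(points, dtype=float)
    curves = []
    while len(curves) < max_curves and len(remaining) >= max(3, min_inliers):
        best_mask = None
        for _ in range(n_iter):
            sample = remaining[rng.choice(len(remaining), 3, replace=False)]
            if len(np.unique(sample[:, 0])) < 3:
                continue                      # degenerate sample, skip it
            coeffs = np.polyfit(sample[:, 0], sample[:, 1], 2)
            resid = np.abs(np.polyval(coeffs, remaining[:, 0]) - remaining[:, 1])
            mask = resid < tol
            if best_mask is None or mask.sum() > best_mask.sum():
                best_mask = mask
        if best_mask is None or best_mask.sum() < min_inliers:
            break
        inliers = remaining[best_mask]
        curves.append(np.polyfit(inliers[:, 0], inliers[:, 1], 2))  # refit on inliers
        remaining = remaining[~best_mask]
    return curves

def select_parallel_pair(curves, x_range, min_gap, max_gap):
    """Assumed heuristic: return (bottom, top) for the pair with the most
    constant vertical gap whose mean gap lies in [min_gap, max_gap]."""
    xs = np.linspace(x_range[0], x_range[1], 50)
    best, best_spread = None, np.inf
    for i in range(len(curves)):
        for j in range(i + 1, len(curves)):
            gap = np.polyval(curves[i], xs) - np.polyval(curves[j], xs)
            if min_gap <= abs(gap.mean()) <= max_gap and gap.std() < best_spread:
                # Image rows grow downward, so the top edge has the smaller y.
                top, bottom = ((curves[i], curves[j]) if gap.mean() < 0
                               else (curves[j], curves[i]))
                best, best_spread = (bottom, top), gap.std()
    return best

# Synthetic edge points: two parallel parabolas 12 px apart plus a few outliers.
x = np.arange(100.0)
top_pts = np.column_stack([x, 0.001 * x**2 + 50.0])
bottom_pts = np.column_stack([x, 0.001 * x**2 + 62.0])
outliers = np.column_stack([np.arange(5.0, 100.0, 10.0),
                            [10, 25, 5, 30, 15, 20, 8, 28, 12, 22]])
curves = sequential_ransac(np.vstack([top_pts, bottom_pts, outliers]))
pair = select_parallel_pair(curves, (0, 99), min_gap=8, max_gap=20)
```

Removing the inliers after each accepted fit is what makes the procedure sequential: each later curve is fitted only to points that earlier curves did not explain.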
In step 36, a range from the vehicle to the curb is estimated. It should be understood that the following description regarding the height detection applies to curbs and not to bumps, as different assumptions are made for bumps. If the camera is equipped with a fisheye lens, and given the camera intrinsic parameters estimated from camera calibration, a point on the distorted image p=(u,v) is mapped onto its unit-norm bearing vector $b=(\dot{u},\dot{v},\dot{w})$, where $\dot{u}$, $\dot{v}$ and $\dot{w}$ are the coordinates of the bearing vector along the X, Y, Z camera reference axes as depicted in
where $\dot{w}_g$ is the coordinate along axis Z of the bearing vector in the camera-center reference system and h is the camera height. Eq. 1 can be extended to more general cases (e.g., non-parallelism between the camera axis and the road plane).
If the camera is equipped with a rectilinear lens, the camera-to-curb distance can be expressed as:
where f is the camera focal length provided by camera intrinsic parameter calibration.
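The range computation under the flat-road assumption can be sketched as follows. This is a hedged illustration, not a reproduction of the equations referenced above. It assumes a camera frame with X pointing down toward the road, Z pointing forward along the optical axis, and the optical axis parallel to the road plane. The function names and the pixel convention (`y_px` is the vertical offset of the point below the principal point, in pixels) are hypothetical.

```python
import math

def range_from_bearing(bearing, cam_height):
    """Forward distance to a ground point seen along a unit bearing vector
    (x_down, y, z_forward). The ray t*b meets the ground plane X = cam_height
    at t = cam_height / x_down, so the forward distance is t * z_forward."""
    x_down, _, z_forward = bearing
    if x_down <= 0:
        raise ValueError("ray does not intersect the ground plane")
    return cam_height * z_forward / x_down

def range_rectilinear(y_px, focal_px, cam_height):
    """Pinhole equivalent: y_px = focal_px * X / Z with X = cam_height."""
    if y_px <= 0:
        raise ValueError("point must lie below the principal point")
    return focal_px * cam_height / y_px

# A ground point 4 m ahead of a camera mounted 1 m above the road.
norm = math.sqrt(1.0**2 + 4.0**2)
d_fisheye = range_from_bearing((1.0 / norm, 0.0, 4.0 / norm), cam_height=1.0)
d_pinhole = range_rectilinear(y_px=200.0, focal_px=800.0, cam_height=1.0)
```

Both functions recover the same distance for the same scene point because the rectilinear case is the special case in which the bearing vector is proportional to (y_px, x_px, focal_px) under the assumed axes.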
In step 37, the curb plane is assumed to be orthogonal to the road surface. The points of the curb's bottom border detected in the image, $\{p_i^b\}$, are then also located on the ground plane in the scene, so the distance $D_i^b$ is first computed from these points. Using assumption v) ($D_i = D_i^b = D_i^t$), a height of the curb (Δh) for each pair of points $(p_i^b, p_i^t)$ can be determined using the following equation, in the case of the fisheye model:
where $D_i$ is the estimated depth from Eq. 1, $\dot{w}_i^t$ and $\dot{u}_i^t$ are the coordinates of the bearing vector of $p_i^t$, and h is the camera height. Similarly, in the rectilinear model, the expression becomes:
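As a hedged illustration of the height computation for both lens models, the sketch below assumes a camera frame with X pointing down and Z pointing forward, an optical axis parallel to the road, and a top-edge point at the same depth as the corresponding bottom-edge point. These conventions, the function names, and the pixel offset `y_top_px` (measured downward from the principal point) are assumptions of the sketch, not statements of the embodiment's equations.

```python
import math

def curb_height_from_bearing(bearing_top, depth, cam_height):
    """Height of the curb's top edge above the road. The top point lies at
    forward distance `depth`, so its drop below the camera is
    depth * x_down / z_forward, and its height is cam_height minus that drop."""
    x_down, _, z_forward = bearing_top
    return cam_height - depth * x_down / z_forward

def curb_height_rectilinear(y_top_px, depth, focal_px, cam_height):
    """Pinhole equivalent, using x_down / z_forward = y_top_px / focal_px."""
    return cam_height - depth * y_top_px / focal_px

# Curb top 0.15 m above the road, 4 m ahead, camera mounted 1 m high.
drop, depth = 1.0 - 0.15, 4.0
norm = math.sqrt(drop**2 + depth**2)
h_fisheye = curb_height_from_bearing((drop / norm, 0.0, depth / norm),
                                     depth, cam_height=1.0)
h_pinhole = curb_height_rectilinear(y_top_px=800.0 * drop / depth, depth=depth,
                                    focal_px=800.0, cam_height=1.0)
```

Since one height estimate is obtained per pair of points, a robust statistic such as the median over all pairs could be used to summarize them, although the description does not state how the estimates are combined.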
In step 38, temporal filtering is applied as a post-processing step to eliminate possible false positives or to recover false negatives caused by erroneous detection or localization. An image captured at a next time frame, in step 39, is input to the temporal filter. A tracking filter including, but not limited to, a Kalman filter, may be applied to the edge points to correct false detections such as intermittent failures.
In step 40, detected frames are provided to an output device. Annotated frames indicating the presence of curbs and their attributes may be displayed on a display device to the driver. In addition, vehicle applications such as autonomous parking or lane centering may utilize the information to autonomously guide the vehicle in relation to the detected curb.
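The temporal filtering idea can be sketched with a scalar Kalman filter that tracks one quantity of the detected curb over frames, for example the estimated range or one coefficient of a fitted curve. This is a minimal sketch under assumptions: the state is a single scalar, the expected frame-to-frame change from camera motion is supplied by the caller, and measurements far from the prediction are rejected by a simple gate. The class name, noise variances, and gate width are hypothetical.

```python
class ScalarKalman:
    """One-dimensional Kalman filter with innovation gating."""

    def __init__(self, x0, p0=1.0, process_var=1e-2, meas_var=1e-1):
        self.x, self.p = x0, p0            # state estimate and its variance
        self.q, self.r = process_var, meas_var

    def predict(self, motion=0.0):
        """Shift the state by the change expected from camera motion."""
        self.x += motion
        self.p += self.q

    def update(self, z, gate=3.0):
        """Fuse measurement z; return False if it is rejected as inconsistent."""
        s = self.p + self.r                # innovation variance
        innovation = z - self.x
        if innovation * innovation > gate * gate * s:
            return False                   # likely false detection, state unchanged
        k = self.p / s                     # Kalman gain
        self.x += k * innovation
        self.p *= (1.0 - k)
        return True

# Track a curb range near 4 m; the third measurement is a spurious detection.
kf = ScalarKalman(x0=4.0)
accepted = []
for z in [4.1, 3.9, 9.0, 4.0]:
    kf.predict()
    accepted.append(kf.update(z))
```

A rejected measurement leaves the estimate at its predicted value, which is how a briefly missed or spurious detection can be bridged without losing the track.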
While certain embodiments of the present invention have been described in detail, those familiar with the art to which this invention relates will recognize various alternative designs and embodiments for practicing the invention as defined by the following claims.
Number | Date | Country | |
---|---|---|---|
20170344836 A1 | Nov 2017 | US |