The present disclosure relates to a dimension estimating system and a method for estimating a dimension of a target vehicle.
Vehicles have on-board sensors to detect surrounding moving objects (e.g., vehicles, bicycles, trucks, etc.) in order to obtain positional, dimensional, and motion information regarding the detected objects. Such information may be shared with other remote vehicles via V2V and/or V2X communication.
However, it may be difficult to obtain an accurate dimension and center position of a moving object. For example, when the host vehicle visually captures a preceding vehicle with an on-board camera, the host vehicle cannot obtain the length of the preceding vehicle because the camera recognizes only the rear view of the preceding vehicle. As a result, it is difficult to obtain accurate dimensional information of an object surrounding the host vehicle.
In view of the above, it is an objective to provide a system and a method to obtain accurate dimensional information of a target vehicle surrounding a host vehicle.
This section provides a general summary of the disclosure, and is not a comprehensive disclosure of its full scope or all of its features.
A first aspect of the present disclosure provides a dimension estimating system for a host vehicle. The dimension estimating system includes an image sensor and a dimension estimator. The image sensor obtains image data of at least one side of a target vehicle around the host vehicle. The dimension estimator estimates, through a machine learning algorithm, a dimension of the target vehicle based on the image data of the one side of the target vehicle obtained by the image sensor.
A second aspect of the present disclosure provides a method for estimating a dimension of a target vehicle around a host vehicle. The method includes obtaining, with an image sensor, image data of at least one side of the target vehicle, and estimating, with a dimension estimator, a dimension of the target vehicle based on the image data of the one side of the target vehicle using a machine learning algorithm.
Further areas of applicability will become apparent from the description provided herein. The description and specific examples in this summary are intended for purposes of illustration only and are not intended to limit the scope of the present disclosure.
The drawings described herein are for illustrative purposes only of selected embodiments and not all possible implementations, and are not intended to limit the scope of the present disclosure. In the drawings:
In the following description, a dimension estimating system and a method for estimating a dimension of a target vehicle will be described. In the following embodiments, the dimension estimating system and the method will be employed together with a dedicated short range communications (DSRC) system. The system and method will provide dimensional, positional, and motion information to the DSRC system, and the DSRC system will share the information with other vehicles. It should be noted that any type of Vehicle-to-Vehicle (V2V) and Vehicle-to-Everything (V2X) communication that allows the host vehicle HV to communicate with other remote vehicles RV and road infrastructure may be used for the present disclosure.
The DSRC antenna 22 and the DSRC radio 16 are mounted on, for example, a windshield or roof of the host vehicle HV. The DSRC radio 16 is configured to transmit/receive, through the DSRC antenna 22, messages to/from remote vehicles RV and infrastructure (Road Side Units (RSUs)) around the host vehicle HV. More specifically, the DSRC radio 16 successively transmits/receives basic safety messages (BSMs) to/from remote vehicles RV equipped with similar DSRC systems over V2V (Vehicle-to-Vehicle) and/or V2X (Vehicle-to-Everything) communications. In this embodiment, messages transmitted from the DSRC radio 16 include positional, dimensional, and motion information of target vehicles TV, as will be described in detail later.
The GPS antenna 24 is mounted on, for example, the windshield or roof of the host vehicle HV. The GPS 18 is connected to the GPS antenna 24 to receive positional information of the host vehicle HV from a GPS satellite (not shown). More specifically, the positional information includes a latitude and a longitude of the host vehicle HV. Furthermore, the positional information may contain the altitude and the time. The GPS 18 is connected to the processing unit 20 and transmits the positional information to the processing unit 20.
The first camera 12 is an on-board camera. In this embodiment, the first camera 12 is mounted on the windshield of the host vehicle HV to optically capture an image of a scene ahead of the host vehicle HV. Alternatively, the first camera 12 may be a rearview camera that optically captures an image of a rear scene. The first camera 12 is connected to the processing unit 20 using serial communication and transmits image data to the processing unit 20. The image data is used by the processing unit 20 to calculate i) distances to objects, such as remote vehicles (hereinafter referred to as "target vehicles") TV ahead of the host vehicle HV, and ii) motion information of the target vehicles TV, such as velocities, accelerations, and headings.
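By way of a non-limiting illustration, one common way to derive such a distance from monocular image data is a flat-road pinhole-camera approximation. The sketch below is only an example of such an approach, not the specific method prescribed by this disclosure; the focal length, camera height, and pixel coordinates are assumed values chosen for the illustration.

```python
# Minimal sketch of monocular distance estimation under a flat-road
# assumption: the distance to a preceding vehicle is inferred from the
# image row of its rear-bumper/road contact point. All parameter values
# are hypothetical and for illustration only.

def estimate_distance_m(v_bottom_px: float,
                        v_horizon_px: float,
                        focal_length_px: float = 1200.0,
                        camera_height_m: float = 1.3) -> float:
    """Distance ~ f * h / (v_bottom - v_horizon) for a pinhole camera
    on a flat road (v measured downward from the top of the image)."""
    dv = v_bottom_px - v_horizon_px
    if dv <= 0:
        raise ValueError("contact point must lie below the horizon line")
    return focal_length_px * camera_height_m / dv

# Example: a contact point 160 px below the horizon -> roughly 9.75 m ahead.
print(estimate_distance_m(v_bottom_px=700.0, v_horizon_px=540.0))
```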
Similar to the first camera 12, the second camera 14 is an on-board camera. In this embodiment, the second camera 14 is mounted on the windshield of the host vehicle HV adjacent to the first camera 12 and optically captures an image of a scene ahead of the host vehicle HV. Alternatively, the second camera 14 may be a rearview camera that optically captures an image of a rear scene when a rearview camera is also used as the first camera 12. The second camera 14 in this embodiment is dedicated to obtaining image data of target vehicles TV ahead of the host vehicle HV. More specifically, the second camera 14 obtains image data of at least one side (the rear side in this embodiment) of each target vehicle TV. The second camera 14 is connected to the processing unit 20 using serial communication, and image data of target vehicles TV captured by the second camera 14 are transmitted to the processing unit 20.
In the present embodiment, the processing unit 20 may be formed of a memory 34 and a microprocessor (a dimension estimator) 36. Although the processing unit 20 is described and depicted as one component in this embodiment, it is merely shown as a block of main functions of the system 10, and actual processors performing these functions may be physically separated in the system 10.
The memory 34 may include a random access memory (RAM) and read-only memory (ROM) and store programs therein. The programs in the memory 34 may be computer-readable, computer-executable software code containing instructions that are executed by the microprocessor 36. That is, the microprocessor 36 carries out functions by performing programs stored in the memory 34.
The memory 34 also stores dimensional data (dimensional information) of a variety of vehicles, which has been collected in advance. As shown in
The microprocessor 36 is configured to calculate the distance to the target vehicle TV and the motion information of the target vehicle TV as described above. Furthermore, the microprocessor 36 is configured to i) estimate a dimension of a target vehicle TV ahead of the host vehicle HV through a machine learning algorithm, ii) calculate a center position of the target vehicle TV, and iii) output the center position along with the dimension of the target vehicle TV to the DSRC radio 16.
In the present embodiment, the microprocessor 36 may be formed of a classifying portion 38, a determining portion 40, and a calculating portion 42. The classifying portion 38 is configured to identify the vehicle class (as described above) of the target vehicle TV through a classification process, as will be described later. The determining portion 40 is configured to determine the dimension of the target vehicle TV through a dimension extraction process based on the vehicle class identified by the classifying portion 38. The calculating portion 42 is configured to calculate the center position (i.e., latitude, longitude, and altitude) of the target vehicle TV through a center position calculating process based on the dimension determined by the determining portion 40 and the distance input from the first camera 12.
Next, the classification process, the dimension extraction process, and the center position calculating process will be described in more detail. The following description is based on an exemplary scenario illustrated in
The classification process is fundamentally based on Machine Learning (ML), more specifically, a Deep Learning method.
In this embodiment, You Only Look Once (YOLO) and DenseNet are used as the ML architectures. The transfer learning concept is used to train the Convolutional Network (ConvNet), where a pre-trained ConvNet is used as an initialization for the current specific dataset. The earlier layers of the ConvNet are kept fixed, whereas the higher layers of the ConvNet are fine-tuned to fit the current needs. This is motivated by the fact that the earlier layers of a ConvNet contain more generic features (e.g., edge detectors, color blob detectors, corners, etc.), whereas the higher layers contain features specific to the dataset of current interest. As shown in
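A minimal sketch of this transfer-learning setup is given below, assuming a PyTorch/torchvision DenseNet backbone; the exact layer split, class count, and optimizer settings are illustrative assumptions rather than a verbatim description of the trained network.

```python
# Minimal transfer-learning sketch (assumes PyTorch + torchvision).
# The earlier (generic) layers are frozen; the classifier head is replaced
# and fine-tuned on the vehicle classes of current interest.
import torch
import torch.nn as nn
from torchvision import models

NUM_VEHICLE_CLASSES = 12  # number of vehicle classes in this embodiment

model = models.densenet121(weights=models.DenseNet121_Weights.DEFAULT)

# Freeze the pre-trained feature extractor (generic edge/color/corner features).
for param in model.features.parameters():
    param.requires_grad = False

# Replace the classifier so the higher layers fit the current dataset.
model.classifier = nn.Linear(model.classifier.in_features, NUM_VEHICLE_CLASSES)

# Only the new head is optimized during fine-tuning.
optimizer = torch.optim.SGD(model.classifier.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()
```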
For Proof of Concept (PoC), 12 classes representing common vehicles in the Midwest are considered. After deciding on the vehicles, thousands of images were downloaded for each class, and additional pictures were taken to build a dedicated dataset. The downloaded images include front, rear, and side views of the vehicles, whereas the manually taken photos were all rear views. The training process with DenseNet starts by training the network with the CIFAR-10 dataset to obtain the initial weights of the network. Then, using the derived initial weights, the network is trained on the 196 classes of the Stanford dataset. Finally, the network is fine-tuned using the 12 classes, and the classification performance is tested using the Confusion Matrix method. The results show that accuracy varies between 92% and 97%. The detection and classification are very fast, processing 9-12 frames per second on average.
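As an illustration of the Confusion Matrix evaluation mentioned above, per-class and overall accuracy can be summarized as sketched below; scikit-learn is assumed, and the label arrays are placeholders rather than the actual test data.

```python
# Sketch of the Confusion Matrix evaluation (assumes scikit-learn).
# y_true / y_pred would hold ground-truth and predicted class indices
# for the 12-class test set; the arrays below are placeholders.
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([0, 1, 2, 1, 0, 2])   # placeholder ground-truth labels
y_pred = np.array([0, 1, 2, 1, 1, 2])   # placeholder predictions

cm = confusion_matrix(y_true, y_pred, labels=list(range(12)))

# Per-class accuracy (recall): correct predictions divided by class support.
support = cm.sum(axis=1)
per_class_acc = np.divide(cm.diagonal(), support,
                          out=np.zeros(len(support)),
                          where=support > 0)
overall_acc = cm.diagonal().sum() / cm.sum()
```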
In the dimension extraction process, the system (more specifically, the determining portion 40) calls the look-up table stored in the memory 34 to determine the dimension of the target vehicle TV. By referring to the look-up table using the vehicle class identified by the classifying portion 38, the dimension of the target vehicle TV is determined (extracted) and output to the calculating portion 42 for the center position calculation.
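As an illustration, the look-up table can be thought of as a mapping from the identified vehicle class to representative dimensions; the class names and numeric values in the sketch below are placeholders and not the dimensional data actually stored in the memory 34.

```python
# Sketch of the dimension look-up (class names and values are illustrative
# placeholders, not the actual dimensional data stored in the memory 34).
DIMENSION_TABLE = {
    # vehicle class: (length_m, width_m, height_m)
    "sedan":  (4.8, 1.8, 1.45),
    "pickup": (5.9, 2.0, 1.90),
    "suv":    (4.9, 1.9, 1.75),
}

def extract_dimension(vehicle_class: str) -> tuple[float, float, float]:
    """Return (length, width, height) for the class identified by the
    classifying portion; raises KeyError if the class is unknown."""
    return DIMENSION_TABLE[vehicle_class]

length_m, width_m, height_m = extract_dimension("sedan")
```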
In the center position calculating process, calculation of the center position of the target vehicle TV may be performed as follows (refer to
The width (Δy,RV) and length (Δx,RV) of the target vehicle TV are contained in the dimension extracted by the determining portion 40. It should be noted that this scenario does not show the offset in the y axis because it is assumed to be zero. It should also be understood that the center position calculation method applies to the scenario in which the host vehicle HV is right behind the target vehicle TV as shown in
The center position (x,y) of the target vehicle TV is calculated with respect to the center position of the host vehicle HV. Then, the calculated position (x,y) of the target vehicle TV is transformed into the coordinates (Lat, Long) and shared with other vehicles via V2V and/or V2X communication.
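One way this calculation could be carried out is sketched below: the target-vehicle center is placed half a vehicle length beyond the measured range to its rear face, expressed in the host-vehicle frame, and then converted to (Lat, Long) with a small-offset flat-Earth approximation. The axis and heading conventions, the assumption that the measured range is to the rear face, and the Earth-radius constant are all assumptions of this sketch rather than requirements of the disclosure.

```python
# Sketch of the center position calculation (flat-Earth approximation;
# axis conventions, heading convention, and Earth radius are assumptions).
import math

EARTH_RADIUS_M = 6_371_000.0

def target_center_latlong(host_lat_deg: float,
                          host_long_deg: float,
                          host_heading_deg: float,   # 0 = north, clockwise
                          range_to_rear_m: float,    # from the first camera 12
                          target_length_m: float,    # from the look-up table
                          lateral_offset_m: float = 0.0):  # zero in this scenario
    # Center of the target vehicle in the host-vehicle frame
    # (x forward, y to the right of the host vehicle).
    x = range_to_rear_m + target_length_m / 2.0
    y = lateral_offset_m

    # Rotate into north/east components using the host heading.
    heading = math.radians(host_heading_deg)
    north_m = x * math.cos(heading) - y * math.sin(heading)
    east_m = x * math.sin(heading) + y * math.cos(heading)

    # Small-offset conversion of the (north, east) displacement to (Lat, Long).
    dlat = math.degrees(north_m / EARTH_RADIUS_M)
    dlong = math.degrees(east_m / (EARTH_RADIUS_M * math.cos(math.radians(host_lat_deg))))
    return host_lat_deg + dlat, host_long_deg + dlong
```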
Next, the message structure for information sharing according to the present embodiment will be described below. The structure of the DSRC message used for the present disclosure is schematically shown in
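The exact field layout is shown in the referenced figure. Purely as a rough illustration, the per-target payload described in this embodiment (the dimension, the center position, the positional information, and the motion information appended to the original BSM data) might be organized as in the sketch below; the field names and grouping are assumptions of the sketch, not the standardized BSM encoding.

```python
# Rough sketch of the per-target information appended to a BSM
# (field names and grouping are illustrative, not a standardized encoding).
from dataclasses import dataclass

@dataclass
class TargetVehicleInfo:
    # Dimension extracted from the look-up table
    length_m: float
    width_m: float
    height_m: float
    # Center position calculated by the calculating portion 42
    center_lat_deg: float
    center_long_deg: float
    center_alt_m: float
    # Motion information of the target vehicle TV
    speed_mps: float
    acceleration_mps2: float
    heading_deg: float

@dataclass
class ExtendedBsm:
    original_bsm: bytes                # host vehicle's original BSM payload
    targets: list[TargetVehicleInfo]   # one entry per detected target vehicle TV
```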
Next, operation of the system 10 according to the present embodiment will be described below with reference to
When the first and second cameras 12, 14 detect the target vehicles TV (i.e., when the target vehicles TV are within the maximum recognition ranges of the first and second cameras 12, 14) at Step 100, the microprocessor 36 calculates a distance to each target vehicle TV at Step 110 based on the image data captured by the first camera 12. The calculated distances to the target vehicles TV are held for the later center position calculation. At Step 120, the microprocessor 36 also calculates motion information regarding each target vehicle TV based on the captured image data. That is, the microprocessor 36 calculates a velocity, an acceleration, and a heading of each target vehicle TV. More specifically, the microprocessor 36 maintains an update cycle (every T sec) in which the motion information is continuously updated. This update continues until the dimension and the center position of the target vehicle TV are obtained.
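As an illustration of this update cycle, the motion information could be approximated by finite differences of successive position measurements taken every T seconds; the sketch below assumes target positions already expressed in a common ground frame, which is an assumption made for the illustration.

```python
# Sketch of the periodic motion-information update (finite differences
# over the update period T; (x, y) positions in meters are assumed to be
# expressed in a common ground frame).
import math

def update_motion(prev_pos, curr_pos, prev_speed_mps, period_s):
    """Return (speed m/s, acceleration m/s^2, heading deg) of a target
    vehicle from two successive (x, y) positions T seconds apart."""
    dx = curr_pos[0] - prev_pos[0]
    dy = curr_pos[1] - prev_pos[1]
    speed = math.hypot(dx, dy) / period_s
    accel = (speed - prev_speed_mps) / period_s
    heading_deg = math.degrees(math.atan2(dy, dx)) % 360.0
    return speed, accel, heading_deg
```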
At Step 130, the classifying portion 38 performs the classification process using the image data of the target vehicles TV through the machine learning algorithm as described with
At Step 170, the calculating portion 42 performs the center position calculating process using the mathematical computation described above. When the calculating portion 42 has calculated the center position at Step 180, the center position of each target vehicle TV is output to the DSRC radio 16 together with the other information (i.e., the dimension, the motion information, and the positional information) at the end of the cycle. Then, the DSRC radio 16 creates BSMs containing the dimension, the center position, the positional information, and the motion information for each target vehicle TV as well as the original BSM data (see
As described above, the system 10 according to the present embodiment is able to estimate the dimension of the target vehicle TV based on image data of one side (the rear side in this embodiment) of the target vehicle. Therefore, the system 10 can obtain the accurate dimension of the target vehicle TV even if only the rear side of the target vehicle TV is visible to the host vehicle HV.
Furthermore, the system 10 is able to obtain the center position of the target vehicle TV based on the dimension estimated. Therefore, the host vehicle HV is able to share the dimension and the center position of the target vehicle TV with the remote vehicle RV via the V2V communication and/or V2X communication. Accordingly, the remote vehicle RV can obtain the accurate dimensional information and the center position of the target vehicle TV even if the remote vehicle RV does not recognize (capture) the target vehicle.
With reference to
In the above-described embodiment, the second camera 14 obtains image data of the rear side of the target vehicle TV, and the first camera 12 obtains image data used to calculate a distance to the target vehicle TV ahead of the host vehicle HV. However, the dimension of the target vehicle TV may be estimated from any side of the target vehicle TV (i.e., the front side, or one of the right and left sides). For example, the second camera 14 may capture an image of a front side of the target vehicle TV, and the system 10 may then obtain the dimension of the target vehicle TV based on the image of the front side. In this case, the first camera 12 also obtains image data of a rear view, and the system 10 may calculate a distance to the target vehicle TV behind the host vehicle HV based on the image data obtained by the first camera 12.
In the above-described embodiment, a camera (the first camera 12) is used as a distance measuring sensor to obtain a distance to the target vehicle TV. Alternatively, other sensors, such as LiDAR, LADAR, and so on, or a combination thereof, may be used to measure a distance to the target vehicle TV (and to obtain its motion information). Furthermore, the second camera 14 may be used as the distance measuring sensor. That is, the second camera 14 obtains image data of one side of the target vehicle TV, and a distance to the target vehicle TV may then be calculated based on that image data, as with the first camera 12 in the embodiment. In this case, the first camera 12 can be eliminated, and the second camera 14 serves both as the image sensor and the distance measuring sensor.
The foregoing description of the embodiments has been provided for purposes of illustration and description. It is not intended to be exhaustive or to limit the disclosure. Individual elements or features of a particular embodiment are generally not limited to that particular embodiment, but, where applicable, are interchangeable and can be used in a selected embodiment, even if not specifically shown or described. The same may also be varied in many ways. Such variations are not to be regarded as a departure from the disclosure, and all such modifications are intended to be included within the scope of the disclosure.
Example embodiments are provided so that this disclosure will be thorough, and will convey the scope to those who are skilled in the art. Numerous specific details are set forth such as examples of specific components, devices, and methods, to provide a thorough understanding of embodiments of the present disclosure. It will be apparent to those skilled in the art that specific details need not be employed, that example embodiments may be embodied in many different forms and that neither should be construed to limit the scope of the disclosure. In some example embodiments, well-known processes, well-known device structures, and well-known technologies are not described in detail.
The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting. As used herein, the singular forms “a,” “an,” and “the” may be intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms “comprises,” “comprising,” “including,” and “having,” are inclusive and therefore specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The method steps, processes, and operations described herein are not to be construed as necessarily requiring their performance in the particular order discussed or illustrated, unless specifically identified as an order of performance. It is also to be understood that additional or alternative steps may be employed. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.
This application claims the benefit of U.S. Provisional Application No. 62/697,660, filed on Jul. 13, 2018. The entire disclosure of this provisional application is incorporated herein by reference.