The technology of the present disclosure relates to a dimension estimation device, a dimension estimation method, and a dimension estimation program.
There are technologies for detecting objects appearing in video from an in-vehicle camera such as a drive recorder. For example, object detection systems typified by YOLO are known.
In the field of automatic driving, for example, it is necessary to take measures such as timing a lane change on the basis of the dimensions of other vehicles around an observation vehicle. One method of obtaining such a dimension is to identify the vehicle name of the vehicle shown in the video and refer to its catalog specifications (see Non Patent Literature 1).
Non Patent Literature 1: "Shamei ninshiki system powered by Zinrai (Vehicle name recognition system powered by Zinrai)", URL: https://www.fujitsu.com/jp/products/network/managed-services-network/transport/name-recognition/
However, since new vehicle names with unique dimensions appear frequently, the learning content of a typical object detection system would have to be updated each time such a vehicle appears, which increases the running cost. In addition, object detection systems commonly identify objects only at the vehicle-type level, and dedicated systems such as that of Non Patent Literature 1 described above are not common. The vehicle type referred to here is a rough classification such as passenger car, truck, and bus. Within one type the dimensions vary widely: for example, a light truck has an entire length specified as 3.4 m or less, whereas a large truck has an entire length specified as 12 m or less. Therefore, enabling dimensions to be obtained within such a classification contributes to improved versatility.
The disclosed technology has been made in view of the above points, and an object thereof is to provide a dimension estimation device, a dimension estimation method, and a dimension estimation program capable of accurately estimating a dimension of another object on a road.
A first aspect of the present disclosure is a dimension estimation device including: an acquisition unit that acquires an image acquired in a moving body; an image information calculation unit that calculates a vanishing point and a predetermined decision line in the image; a detection unit that detects an object in the image and detects a detection rectangle that is a rectangle including a first surface that is a surface close to a photographing position, the first surface being in proximity to the decision line for the object; and a dimension estimation unit that estimates a dimension of the object on the basis of a ratio between a height of the first surface and a height of a second surface that is a surface far from the photographing position, obtainable from a relationship between the detection rectangle and the vanishing point.
A second aspect of the present disclosure is a dimension estimation method including causing a computer to execute processing of: acquiring an image acquired in a moving body; calculating a vanishing point and a predetermined decision line in the image; detecting an object in the image and detecting a detection rectangle that is a rectangle including a first surface that is a surface close to a photographing position, the first surface being in proximity to the decision line for the object; and estimating a dimension of the object on the basis of a ratio between a height of the first surface and a height of a second surface that is a surface far from the photographing position, obtainable from a relationship between the detection rectangle and the vanishing point.
A third aspect of the present disclosure is a dimension estimation program that causes a computer to execute processing of: acquiring an image acquired in a moving body; calculating a vanishing point and a predetermined decision line in the image; detecting an object in the image and detecting a detection rectangle that is a rectangle including a first surface that is a surface close to a photographing position, the first surface being in proximity to the decision line for the object; and estimating a dimension of the object on the basis of a ratio between a height of the first surface and a height of a second surface that is a surface far from the photographing position, obtainable from a relationship between the detection rectangle and the vanishing point.
According to the disclosed technology, it is possible to accurately estimate a dimension of another object on a road.
Hereinafter, examples of an embodiment of the disclosed technique will be described with reference to the drawings. Note that, in the drawings, the same or equivalent components and portions are denoted by the same reference numerals. Further, dimensional ratios in the drawings are exaggerated for convenience of description and thus may be different from actual ratios.
First, an overview of the present disclosure will be described. In the example described in the present embodiment, the entire-length dimension of another vehicle captured from an observation vehicle, which is a moving body, is estimated. A case is also assumed where the entire length does not appear in the image, such as when the body of the other vehicle is rounded; the method of the present embodiment can be applied to such a case as well. The estimation result obtained by the method of the present embodiment can be evaluated, for example, as a distribution of the dimensions of other vehicles present on a road on which the observation vehicle has traveled. If the evaluation shows that vehicles with a long entire length, such as trucks and buses, frequently pass through, the estimation result can serve as a reference for a road or intersection expansion plan. For example, at an intersection where large trucks of about 12 m frequently turn right or left, the estimation result can be utilized in considering an increase in the number of lanes or a change in the signal cycle.
In addition, crowding and congestion in units of lanes (congestion for each lane) can be detected from the number of passenger cars, trucks, and buses that overtake the observation vehicle per unit time.
With the estimation result obtained by the method of the present embodiment, for example, the density of vehicles on the road and the presence or absence of a vacant area can be recognized by recognizing the dimensions of vehicles parked on the road that the traveling observation vehicle is about to overtake, or the dimensions of surrounding traveling vehicles. Such recognition can assist automatic driving in deciding whether the observation vehicle may perform an operation such as on-road parking or a lane change.
An existing technology utilized in the present embodiment will be described. The present embodiment utilizes the camera principle described in Reference Document 1.
When a monocular camera such as a typical in-vehicle camera is used, features such as other vehicles and buildings are drawn so as to converge toward a vanishing point. When an image unaffected by lens distortion is used, such as one taken with a non-wide-angle lens or one cut out from a low-distortion region of a wide-angle lens, the size of an object appearing in the image can be approximated as in a perspective view, and it changes lawfully with the distance from a reference point. Here, the size of an object shown in the image is expressed by its side length; in the case of a square, the area changes with the square of the side length.
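As a minimal illustration of this law (a sketch, not part of the disclosure), the following shows the pinhole-camera approximation in which an object of real height H at distance D appears with height f × H / D; the focal length and sizes are hypothetical values.

```python
# Minimal sketch of the perspective law above (pinhole-camera
# approximation; the focal length and sizes are hypothetical values).
def apparent_height_px(real_height_m: float, distance_m: float,
                       focal_length_px: float = 1000.0) -> float:
    """Apparent height in pixels of an object of real_height_m at distance_m."""
    return focal_length_px * real_height_m / distance_m

# A 1.5 m-tall vehicle face appears twice as tall at 10 m as at 20 m.
print(apparent_height_px(1.5, 10.0))  # 150.0
print(apparent_height_px(1.5, 20.0))  # 75.0
```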
Hereinafter, a configuration of the present embodiment will be described.
As illustrated in the drawings, the dimension estimation device 100 includes a CPU 11, a ROM 12, a RAM 13, a storage 14, an input unit 15, a display unit 16, and a communication interface 17.
The CPU 11 is a central processing unit and executes various programs and controls each unit. That is, the CPU 11 reads a program from the ROM 12 or the storage 14, and executes the program by using the RAM 13 as a work area. The CPU 11 controls each component described above and performs various types of calculation processing according to the program stored in the ROM 12 or the storage 14. In the present embodiment, a dimension estimation program is stored in the ROM 12 or the storage 14.
The ROM 12 stores various programs and various types of data. The RAM 13 temporarily stores programs or data as a work area. The storage 14 includes a storage device such as a hard disk drive (HDD) or a solid state drive (SSD) and stores various programs including an operating system and various types of data.
The input unit 15 includes a pointing device such as a mouse, and a keyboard, and is used to perform various inputs.
The display unit 16 is, for example, a liquid crystal display and displays various types of information. The display unit 16 may function as the input unit 15 by adopting a touch panel system.
The communication interface 17 is an interface for communicating with other devices such as terminals. For the communication, for example, a wired communication standard such as Ethernet (registered trademark) or FDDI, or a wireless communication standard such as 4G, 5G, or Wi-Fi (registered trademark) is used.
Next, each functional configuration of the dimension estimation device 100 will be described.
As illustrated in the drawings, the dimension estimation device 100 includes, as functional configurations, an acquisition unit 110, an image information calculation unit 112, a detection unit 114, a dimension estimation unit 116, and a storage unit 118. The image information calculation unit 112 includes a vanishing point calculation unit 120 and a decision line calculation unit 122.
The acquisition unit 110 acquires time-series images from the video captured by the in-vehicle camera of the observation vehicle. The time-series images are the frame-by-frame images of the video. The acquisition unit 110 also acquires camera information of the in-vehicle camera that captured the video. The camera information includes the coordinates of the installation position of the in-vehicle camera.
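For illustration, frame-by-frame images can be obtained from a video file, for example, as in the following sketch (a minimal example using OpenCV; the sampling stride is a hypothetical parameter):

```python
import cv2

def frames_from_video(path: str, stride: int = 1):
    """Yield every stride-th frame of the video as an image (BGR array)."""
    cap = cv2.VideoCapture(path)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break  # end of video
        if index % stride == 0:
            yield frame
        index += 1
    cap.release()
```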
The image information calculation unit 112 calculates a vanishing point and a decision line of the image through the processing of its constituent units.
The vanishing point calculation unit 120 calculates a vanishing point from the image acquired by the acquisition unit 110. Specifically, it detects straight lines in the image by using the Hough transform, the probabilistic Hough transform, or the like to extract straight-line components, performs extension processing as necessary, and calculates the vanishing point as the intersection point of the straight lines. In calculating the vanishing point, for example, when the road being traveled is known to curve left and right, the processing may be limited so as to use only straight-line components detected in the lower region of the image, which is less affected by the curve.
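A minimal sketch of such a vanishing point calculation, assuming OpenCV's probabilistic Hough transform (the Canny and Hough thresholds are illustrative assumptions, and averaging the pairwise intersections is one simple aggregation choice):

```python
import itertools
import cv2
import numpy as np

def estimate_vanishing_point(image: np.ndarray):
    """Detect straight-line segments and average their pairwise intersections."""
    edges = cv2.Canny(cv2.cvtColor(image, cv2.COLOR_BGR2GRAY), 50, 150)
    segments = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                               minLineLength=40, maxLineGap=10)
    if segments is None:
        return None
    points = []
    # Cap the number of segments so the all-pairs loop stays small.
    for (x1, y1, x2, y2), (x3, y3, x4, y4) in itertools.combinations(
            (s[0] for s in segments[:50]), 2):
        # Extend both segments to full lines and compute their intersection.
        denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
        if abs(denom) < 1e-6:
            continue  # near-parallel lines: no usable intersection
        t = ((x1 - x3) * (y3 - y4) - (y1 - y3) * (x3 - x4)) / denom
        points.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return tuple(np.mean(points, axis=0)) if points else None
```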
The decision line calculation unit 122 calculates the photographable range from the position and the angle of view of the camera in the camera information acquired by the acquisition unit 110, and calculates the decision line by drawing a line in the image at the position corresponding to half the distance of the lower part of the photographable range.
In calculating the decision line, for example, when the road surface is not flat but inclined, the inclination may be taken into account. On the other hand, if the inclination is about 10 m ahead, the change in height may be regarded as minor and its influence may be ignored. Instead of calculating the decision line on the basis of the height of the in-vehicle camera above the road surface, markings may be placed on the road surface in front of the in-vehicle camera at regular intervals, and the image coordinates at which the distance from the in-vehicle camera is constant may be set as the decision line regardless of the camera height. The vanishing point and the decision line may be calculated by extracting a part of the images in the entire video, and for video from the same vehicle, coordinate information calculated once may be reused.
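As one concrete way to place such a decision line under the stated assumptions (flat road, known camera height), the sketch below uses a pinhole model with a horizontal optical axis; the camera height, focal length, and principal-point row are hypothetical parameters:

```python
def decision_line_row(camera_height_m: float, distance_m: float,
                      focal_length_px: float, principal_row_px: float) -> float:
    """Image row of the flat-road point at distance_m ahead of the camera.

    With a horizontal optical axis, a ground point at distance D projects
    focal_length * camera_height / D pixels below the principal point.
    """
    return principal_row_px + focal_length_px * camera_height_m / distance_m

# Hypothetical example: camera 1.2 m above the road, f = 1000 px,
# principal row 540 px; the decision line for a distance of 10 m ahead.
print(decision_line_row(1.2, 10.0, 1000.0, 540.0))  # 660.0
```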
Next, an outline of the processing of the detection unit 114 and the dimension estimation unit 116 will be described with reference to the drawings.
The detection unit 114 detects an object in the image and detects a detection rectangle, that is, a rectangle including the first surface in proximity to the decision line. The first surface is the surface close to the photographing position and, conversely, far from the vanishing point; in this example, the first surface is the rear of the vehicle. The second surface is the surface far from the photographing position and, conversely, close to the vanishing point; in this example, the second surface is the front of the vehicle. In the subsequent processing, an extracted image in which the first surface is in proximity to the decision line is used, as illustrated in the drawings.
The dimension estimation unit 116 estimates a dimension of another object on the basis of a reduction ratio, which is the ratio between the height of the first surface and the height of the second surface obtained from the relationship between the detection rectangle and the vanishing point. The inputs to the dimension estimation unit 116 are the extracted image, the vanishing point, the decision line, and the coordinate information of the detection rectangle.
The storage unit 118 stores a linear proportional equation corresponding to the distance D.
The detection unit 114 detects another vehicle as an object by an object detection technology and detects its detection rectangle. In the present embodiment, since the detection rectangle of the vehicle is calculated on the image, the coordinate information of the rectangular shape (K1) on the image plane is obtained.
The detection unit 114 further detects, as an object, the "rear surface (rear)" of the vehicle. As illustrated in the drawings, a detection rectangle (R1) corresponding to the rear surface is thereby obtained; the front surface of the circumscribed rectangular parallelepiped of the vehicle is denoted by F1.
R1 and F1 are positioned on straight lines extending toward the vanishing point. Here, the case of obtaining the entire length of a preceding vehicle traveling in the same direction as the observation vehicle, using video from the front-side in-vehicle camera, is taken as an example. In practice, however, oncoming vehicles also appear in the video, and their front may appear as the first surface. When a rear camera is used, the front of a following vehicle traveling in the same direction as the observation vehicle likewise appears as the first surface. Therefore, the detection unit 114 may detect not only the rear but also the front.
The dimension estimation unit 116 estimates a dimension using the detection rectangle. As illustrated in the drawings, line segments connecting the four corners of the detection rectangle (R1) to the vanishing point are obtained, and their intersections with the second surface are calculated.
The height (H1) of the detection rectangle R1 and the length (H2) of the line segment corresponding to the height of F1 of the circumscribed rectangular parallelepiped are thereby obtained, and their ratio is obtained as the reduction ratio = H2/H1. As described above, the apparent height of the circumscribed rectangular parallelepiped changes with the distance D, and the change is inversely proportional to the distance. The dimension estimation unit 116 obtains the entire length of the circumscribed rectangular parallelepiped of the target from the expression stored in the storage unit 118 and the obtained reduction ratio. The smaller H2/H1 is, the longer the entire length is.
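In image coordinates, the reduction ratio can be computed, for example, as in the following sketch. It assumes that the near-surface rectangle R1 and the image column of the far edge of the vehicle (taken here from the overall detection rectangle K1) are known; these inputs and the single-column simplification are assumptions of the sketch, not part of the disclosure:

```python
def reduction_ratio(r1_top: float, r1_bottom: float, r1_x: float,
                    far_x: float, vp: tuple) -> float:
    """H2/H1 from the lines connecting R1's corners to the vanishing point.

    r1_top, r1_bottom: image rows of the top/bottom edges of R1 (H1 is
    their difference); r1_x: column of the R1 corners used; far_x: column
    of the vehicle's far edge; vp: (vx, vy) vanishing point.
    """
    vx, vy = vp
    t = (far_x - r1_x) / (vx - r1_x)  # parameter where the lines cross far_x

    def row_at(y0: float) -> float:   # row of the corner->vp line at far_x
        return y0 + t * (vy - y0)

    h1 = r1_bottom - r1_top
    h2 = row_at(r1_bottom) - row_at(r1_top)
    return h2 / h1
```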
A method of estimating the entire-length dimension will be described. As illustrated in the drawings, let D1 denote the distance from the photographing position to the first surface, which is in proximity to the decision line, and let D2 denote the distance to the second surface. Since the apparent height is inversely proportional to the distance as described above, the reduction ratio satisfies Expression (1): H2/H1 = D1/D2.
Here, since D2 is D1 plus the entire length of the target, the entire length (dimension) of the target can be obtained by modifying Expression (1) into the following Expression (2): entire length = D2 − D1 = D1 × (H1/H2 − 1).
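With hypothetical numbers, this works out as in the following sketch (D1 = 10 m and H2/H1 = 0.625 are illustrative values only):

```python
def entire_length_m(d1_m: float, reduction_ratio: float) -> float:
    """Expression (2) as reconstructed above: L = D2 - D1 = D1 * (1/r - 1)."""
    return d1_m * (1.0 / reduction_ratio - 1.0)

# With the decision line 10 m ahead and H2/H1 = 0.625, D2 = 16 m,
# so the entire length of the target is 6 m.
print(entire_length_m(10.0, 0.625))  # 6.0
```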
Next, the action of the dimension estimation device 100 will be described.
In step S100, the CPU 11, as the acquisition unit 110, acquires time-series images and camera information of the in-vehicle camera from the video captured by the in-vehicle camera of the observation vehicle.
In step S102, the CPU 11, as the image information calculation unit 112, calculates a vanishing point of the image. Details of the action of the vanishing point calculation by the vanishing point calculation unit 120 will be described later.
In step S104, the CPU 11, as the image information calculation unit 112, calculates a decision line of the image. Details of the action of the decision line calculation by the decision line calculation unit 122 will be described later.
In step S106, the CPU 11, as the detection unit 114, detects an object in the image by using an object detection technology.
In step S108, the CPU 11, as the detection unit 114, detects, from each of the images, a detection rectangle that includes the first surface in proximity to the decision line for the object. As a result, an extracted image including the first surface in proximity to the decision line is obtained and used in the subsequent processing.
In step S110, the CPU 11, as the dimension estimation unit 116, obtains line segments connecting the four corners of the first surface of the detection rectangle and the vanishing point, and calculates the height of the first surface.
In step S112, the CPU 11, as the dimension estimation unit 116, calculates the intersections of those line segments with the second surface of the detection rectangle, and calculates the line connecting the intersections as the height of the second surface, which corresponds to the height of the front.
In step S114, the CPU 11, as the dimension estimation unit 116, compares the height of the first surface with the height of the second surface to calculate the reduction ratio.
In step S116, the CPU 11, as the dimension estimation unit 116, acquires the decision formula from the storage unit 118 according to the distance to the decision line, estimates the entire length according to the reduction ratio, and outputs the estimation result.
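Putting steps S100 to S116 together, the flow might look like the following sketch. It reuses the illustrative helpers above; detect_vehicles is a stand-in detector assumed to yield, for each vehicle, the first-surface rectangle and the image column of the vehicle's far edge, and the proximity tolerance is a hypothetical parameter:

```python
def estimate_entire_lengths(image, camera_info, detect_vehicles,
                            d1_m=10.0, tolerance_px=20.0):
    """Illustrative end-to-end flow of steps S100 to S116."""
    vp = estimate_vanishing_point(image)                                    # S102
    line_row = decision_line_row(camera_info["height_m"], d1_m,             # S104
                                 camera_info["f_px"], camera_info["cy_px"])
    lengths = []
    for (x0, y0, x1, y1), far_x in detect_vehicles(image):                  # S106
        if abs(y1 - line_row) > tolerance_px:                               # S108
            continue  # first surface not in proximity to the decision line
        r = reduction_ratio(y0, y1, x0, far_x, vp)                          # S110-S114
        lengths.append(entire_length_m(d1_m, r))                            # S116
    return lengths
```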
The calculation of the vanishing point will be described with reference to the drawings.
In step S200, the CPU 11, as the vanishing point calculation unit 120, detects a plurality of straight lines, such as partition lines on the road and lines formed by structures such as buildings, by using the Hough transform, the probabilistic Hough transform, or the like.
In step S202, the CPU 11 performs correction processing on the straight line as necessary to extend the straight line.
In step S204, the CPU 11 calculates a point at which the straight lines intersect as a vanishing point.
The calculation of the decision line will be described with reference to the drawings.
In step S300, the CPU 11, as the decision line calculation unit 122, calculates the photographable range from the position and the angle of view of the camera in the camera information.
In step S302, the CPU 11, as the decision line calculation unit 122, calculates a decision line by drawing a line at the lower half position of the photographable range.
As described above, according to the dimension estimation device 100 of the present embodiment, a dimension of another object on a road can be accurately estimated.
When the present embodiment is applied, the entire length of each of the other vehicles appearing in the video can be estimated, as illustrated in the example of the drawings.
Evaluation of the method of the present embodiment will be described with reference to the drawings.
For convenience of description, in the above example, the feature is described as a circumscribed rectangular parallelepiped, but the present invention is not limited thereto.
The method using the circumscribed rectangular parallelepiped of the target has been exemplified, but another rectangular parallelepiped may be used.
Although the case of targeting passenger cars, trucks, and buses has been described as an example, the present embodiment is not limited thereto. Any feature that can be expressed as a cube or a rectangular parallelepiped facing the vanishing point in the video of the in-vehicle camera can be targeted; for example, a motorcycle or a bicycle may be used. Similarly, when an in-vehicle camera is used, it is also possible to obtain the dimensions of a building or a structure built parallel to the road, such as a roadside transformer. For a target not on a road, the present embodiment can also be used, for example, to calculate an optimal route by obtaining the entire length of a cardboard box ahead while an automatic delivery robot moves along a corridor of a building.
An example has been described in which the in-vehicle camera is installed at the front of the observation vehicle and directed forward, but the present embodiment is not limited thereto. For example, a rear camera installed at the rear and directed rearward may be used. Similarly, with a side camera, the "width" dimension of a feature such as a surrounding traveling vehicle can be obtained by the same method.
The entire length of the target can also be obtained by a method of determining a decision line, drawing a horizontal line from each intersection coordinate, and obtaining how many meters ahead of the in-vehicle camera that line lies. The entire length can likewise be obtained from the difference in the lateral width of the side surface. However, the appearance of the side surface varies with the lateral distance in real space between the camera and the other vehicle. It is therefore necessary to estimate the traveling lane of each other vehicle from, for example, the inclination of the line segment connecting the detection rectangle (front or rear) and the vanishing point, acquire the width information of each traveling lane from map data or the like, and evaluate each case under its conditions. Furthermore, by utilizing the decision line calculation method, dimensions other than the entire length of the vehicle, such as the width and the height, can also be estimated.
The decision line calculation unit 122 may calculate a plurality of decision lines. In this case, for example, if two decision lines at different distances D1 and D2 are calculated, different decision formulas corresponding to the respective distances to the decision lines are acquired from the storage unit 118. The detection unit 114 is only required to detect a detection rectangle including the first surface in proximity to each of the plurality of decision lines.
Note that the dimension estimation processing that the CPU executes by reading software (a program) in the above embodiment may be executed by various processors other than the CPU. Examples of such processors include a programmable logic device (PLD) whose circuit configuration can be changed after manufacture, such as a field-programmable gate array (FPGA), and a dedicated electric circuit that is a processor having a circuit configuration designed exclusively for executing specific processing, such as an application specific integrated circuit (ASIC). The dimension estimation processing may be executed by one of these various processors, or by a combination of two or more processors of the same type or different types (for example, a plurality of FPGAs, or a combination of a CPU and an FPGA). More specifically, the hardware structure of these various processors is an electric circuit in which circuit elements such as semiconductor elements are combined.
In the above embodiment, an aspect in which the dimension estimation program is stored (installed) in advance in the storage 14 has been described, but the embodiment is not limited thereto. The program may be provided in a form stored in a non-transitory storage medium such as a compact disc read only memory (CD-ROM), a digital versatile disc read only memory (DVD-ROM), or a Universal Serial Bus (USB) memory. The program may also be downloaded from an external device via a network.
With regard to the above embodiments, the following supplementary notes are further disclosed.
A dimension estimation device including:
A non-transitory storage medium storing a program executable by a computer to execute dimension estimation processing, the dimension estimation processing including:
Filing Document: PCT/JP2021/028218
Filing Date: 7/29/2021
Country: WO