This application claims priority to Japanese Patent Application No. 2023-99794 filed on Jun. 19, 2023, which is incorporated herein by reference in its entirety.
The present disclosure relates to a detection device, a position calculation system, and a detection method.
Vehicles that run by unmanned driving have hitherto been known (Japanese Laid-Open Patent Application (PCT Application) Publication No. 2017-538619).
In some cases, in order to cause a moving object such as a vehicle to move by unmanned driving, images of the moving object are taken from the outside. However, the inventors have found that the accuracy of detection of the moving object in captured images may decrease due to differences in the exterior shape of the moving object depending on its body type.
The present disclosure may be realized by the following aspects.
(1) According to a first aspect of the present disclosure, a detection device that detects a moving object included in a captured image is provided. The detection device comprises: an image acquisition unit that acquires the captured image; a type acquisition unit that acquires type information indicating one body type of the moving object included in the captured image, the one body type being classified according to an exterior shape of the moving object; a model acquisition unit that acquires, from among a plurality of first detection models, each of which is a machine learning model prepared for a corresponding one body type, the first detection model selected according to the one body type identified by the type information acquired by the type acquisition unit; and a detection unit that detects the moving object included in the captured image by inputting the captured image to the first detection model acquired by the model acquisition unit and identifying a target region indicating the moving object in the captured image. According to the above aspect, it is possible to acquire the type information indicating one body type of the moving object included in the captured image, and to acquire, from among a plurality of first detection models prepared for the respective body types, the first detection model selected according to the one body type identified by the acquired type information. Then, by inputting the captured image to the first detection model that accords with the body type of the moving object included in the captured image, the moving object included in the captured image can be accurately detected. That is, by using the first detection model suitable for the detection of the moving object classified into the one body type, the respective regions that constitute the captured image can be accurately classified into target regions and regions other than target regions. This can suppress a decrease in accuracy in the detection of the moving object included in the captured image depending on the body type of the moving object.
(2) In the above-described aspect, each of the plurality of first detection models is trained in advance to identify the target region by inputting a plurality of training images including M (M is an integer of 2 or more) first training images each containing the moving object classified into the one body type and N (N is an integer of 0 or more and less than M) second training images each containing the moving object classified into another body type different from the one body type. According to the above aspect, the first detection model suitable for the detection of the moving object classified into the one body type can be prepared for each body type, such a first detection model being trained with more first training images containing moving objects classified into the one body type than second training images containing moving objects classified into body types other than the one body type.
(3) According to a second aspect of the present disclosure, a detection device that detects a moving object included in a captured image is provided. The detection device comprises: an image acquisition unit that acquires the captured image; a type acquisition unit that acquires type information indicating one body type of the moving object included in the captured image, the one body type being classified according to an exterior shape of the moving object; and a detection unit that detects the moving object included in the captured image by inputting the captured image and the one body type identified by the type information acquired by the type acquisition unit to a second detection model being a machine learning model and identifying a target region indicating the moving object in the captured image. According to the above aspect, it is possible to accurately detect the moving object included in the captured image by inputting the captured image associated with the one body type to the second detection model. That is, the respective regions that constitute the captured image can be accurately classified into target regions and regions other than target regions. This can suppress a decrease in accuracy in the detection of the moving object included in the captured image depending on the body type of the moving object.
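The following is a minimal, non-limiting sketch of how the captured image and the one body type might be supplied together to the second detection model; the extra-channel encoding, the list of body types, and the callable standing in for the trained model are assumptions for illustration and not a disclosed implementation.

```python
import numpy as np

BODY_TYPES = ["SUV", "sedan", "station wagon", "minivan", "one-box", "compact car", "light vehicle"]

def encode_type_channel(image: np.ndarray, body_type: str) -> np.ndarray:
    """Append a constant channel whose value encodes the body type (illustrative assumption).

    `image` is assumed to be an H x W x 3 color image.
    """
    type_id = BODY_TYPES.index(body_type)
    channel = np.full(image.shape[:2] + (1,), type_id / (len(BODY_TYPES) - 1), dtype=np.float32)
    return np.concatenate([image.astype(np.float32) / 255.0, channel], axis=-1)

def detect(second_detection_model, image: np.ndarray, body_type: str) -> np.ndarray:
    """Return a per-pixel mask (1 = target region, 0 = out-of-target region)."""
    x = encode_type_channel(image, body_type)
    return second_detection_model(x)  # the model identifies the target region in the captured image
```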
(4) In the above-described aspect, the second detection model is trained in advance to identify the target region by inputting a plurality of training images containing the moving object and a type correct answer label, the type correct answer label being associated with each of the plurality of training images and indicating the body type of the moving object included in the training images. According to the above aspect, it is possible to prepare the second detection model that is trained by associating a plurality of training images containing the moving objects and type correct answer labels indicating the body types of the moving objects included in the respective training images.
(5) In the above-described aspect, a region correct answer label indicating either the target region or an out-of-target region indicating other than the moving object is associated with each region in the training images. According to the above aspect, the respective regions that constitute the captured image can be accurately classified into target regions and regions other than target regions. This can improve the accuracy in the detection of the moving object included in the captured image.
(6) In the above-described aspect, the body type indicates a type of the moving object. According to the above aspect, it is possible to detect the moving object included in the captured image using either the first detection model or the second detection model that has learned the difference in the exterior shape of the moving object that varies depending on the type of the moving object. This can improve the accuracy in the detection of the moving object included in the captured image.
(7) In the above-described aspect, the moving object moves by receiving a movement condition that defines a movement operation of the moving object from outside the moving object, and the type acquisition unit acquires the type information by identifying the one body type of the moving object included in the captured image using schedule information, the schedule information being associated with each imaging range of a plurality of imaging devices that acquire the captured image and indicating a scheduled time at which the moving object classified into the one body type moves within the imaging range. According to the above aspect, it is possible to acquire the type information regarding the moving object included in the captured image by identifying the one body type of the moving object included in the captured image using the schedule information.
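The following is a minimal, non-limiting sketch of identifying the one body type from schedule information; the schedule layout (camera identification information, scheduled start and end times, and a body type per entry) is an assumption for illustration.

```python
from datetime import datetime
from typing import Optional

# Illustrative schedule: which body type is scheduled to move in each camera's imaging range and when.
SCHEDULE = [
    {"camera_id": "cam01", "start": datetime(2024, 1, 1, 9, 0), "end": datetime(2024, 1, 1, 9, 5), "body_type": "SUV"},
    {"camera_id": "cam01", "start": datetime(2024, 1, 1, 9, 5), "end": datetime(2024, 1, 1, 9, 10), "body_type": "sedan"},
]

def body_type_from_schedule(camera_id: str, acquisition_time: datetime) -> Optional[str]:
    """Return the body type scheduled in the given camera's imaging range at the acquisition time."""
    for entry in SCHEDULE:
        if entry["camera_id"] == camera_id and entry["start"] <= acquisition_time < entry["end"]:
            return entry["body_type"]
    return None
```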
(8) In the above-described aspect, the type acquisition unit acquires the type information by inputting the captured image to an identification model being a machine learning model that has been trained to output the type information when the captured image is input. According to the above aspect, it is possible to acquire the type information regarding the moving object included in the captured image by inputting the captured image to an identification model that is trained to output the type information regarding the moving object included in the captured image.
(9) In the above-described aspect, the detection device may further comprise a rotation processing unit that rotates the captured image such that a movement direction of the moving object faces a predetermined direction. According to the above aspect, the captured image can be rotated such that the direction of the vector indicating the movement direction of the moving object faces a predetermined direction. In this way, the moving object included in the captured image can be detected with the direction of the vector indicating the movement direction of the moving object unified. This can improve the accuracy in the detection of the moving object included in the captured image.
(10) In the above-described aspect, the detection device may further comprise a distortion correction unit that corrects distortion of the captured image. According to the above aspect, the distortion of the captured image can be corrected. This can improve the accuracy in the detection of the moving object included in the captured image.
(11) According to a third aspect of the present disclosure, a position calculation system for calculating a position of a moving object included in a captured image is provided. The position calculation system comprises: the detection device according to the above aspect; and a position calculation device that calculates the position of the moving object, wherein the detection device further masks a target region being a region indicating the moving object in the captured image, thereby generating a first mask image in which a mask region is added to the target region, and the position calculation device comprises: a perspective transformation unit that generates a second mask image by performing perspective transformation of the first mask image; a coordinate point calculation unit that calculates a local coordinate point indicating a position of the moving object in a local coordinate system by correcting a first coordinate point using a second coordinate point, the first coordinate point being a specified vertex of a first bounding rectangle set in the mask region in the first mask image, the second coordinate point being a vertex indicating the same position as the first coordinate point in a second bounding rectangle set in the mask region in the second mask image; and a position transformation unit that transforms the local coordinate point into a moving object coordinate point indicating a position of the moving object in a global coordinate system using a distance of an imaging device from a reference point calculated based on a position of the imaging device in the global coordinate system and a distance of a predetermined positioning point of the moving object from the reference point. According to the above aspect, it is possible to generate the first mask image in which the target region indicating the moving object is masked in each region constituting the captured image, and the second mask image with the perspective-transformed first mask image. In this way, it is possible to calculate the local coordinate point by extracting the first coordinate point from the first mask image and the second coordinate point from the second mask image. This allows for more accurate calculation of local coordinate points. This can improve the accuracy in the calculation of the position of the moving object.
(12) According to a fourth aspect of the present disclosure, a detection method of detecting a moving object included in a captured image is provided. The detection method comprises: an image acquisition step of acquiring the captured image; a type acquisition step of acquiring type information indicating one body type of the moving object included in the captured image, the one body type being classified according to an exterior shape of the moving object; a model acquisition step of acquiring, from among a plurality of first detection models, each of which is a machine learning model prepared for a corresponding one body type, the first detection model selected according to the one body type identified by the type information acquired in the type acquisition step; and a detection step of detecting the moving object included in the captured image by inputting the captured image to the first detection model acquired in the model acquisition step and identifying a target region indicating the moving object in the captured image. According to the above aspect, it is possible to acquire the type information indicating one body type of the moving object included in the captured image, and to acquire, from among a plurality of first detection models prepared for the respective body types, the first detection model selected according to the one body type identified by the acquired type information. Then, by inputting the captured image to the first detection model that accords with the body type of the moving object included in the captured image, the moving object included in the captured image can be accurately detected. That is, by using the first detection model suitable for the detection of the moving object classified into the one body type, the respective regions that constitute the captured image can be accurately classified into target regions and regions other than target regions. This can suppress a decrease in accuracy in the detection of the moving object included in the captured image depending on the body type of the moving object.
(13) According to a fifth aspect of the present disclosure, a detection method of detecting a moving object included in a captured image is provided. This detection method comprises: an image acquisition step of acquiring the captured image; a type acquisition step of acquiring type information indicating one body type of the moving object included in the captured image, the one body type being classified according to an exterior shape of the moving object; and a detection step of detecting the moving object included in the captured image by inputting the captured image and the one body type identified by the type information acquired in the type acquisition step to a second detection model being a machine learning model and identifying a target region indicating the moving object in the captured image. According to the above aspect, it is possible to accurately detect the moving object included in the captured image by inputting the captured image associated with the one body type to the second detection model. That is, the respective regions that constitute the captured image can be accurately classified into target regions and regions other than target regions. This can suppress a decrease in accuracy in the detection of the moving object included in the captured image depending on the body type of the moving object.
The present disclosure can be realized in various aspects other than the detection device, the position calculation system, and the detection method described above. For example, the present disclosure may be embodied as methods for producing a moving object, a detection device, and a position calculation system, a position calculation method, methods for controlling a detection device and a position calculation system, computer programs that execute such control methods, non-transitory storage media storing the computer programs, and the like.
The vehicle 10 is an example of a moving object. In the present disclosure, the “moving object” means an object capable of moving, and is, for example, a vehicle or an electric vertical takeoff and landing aircraft (a so-called flying car). The vehicle may be a vehicle that runs on wheels or a vehicle that runs on continuous tracks, and may be, for example, a passenger car, a truck, a bus, a two-wheel vehicle, a four-wheel vehicle, a construction vehicle, or a combat vehicle. The vehicle includes a battery electric vehicle (BEV), a gasoline automobile, a hybrid automobile, and a fuel cell automobile. When the moving object is other than a vehicle, the term “vehicle” or “car” in the present disclosure is replaceable with “moving object” as appropriate, and the term “run” is replaceable with “move” as appropriate.
The vehicle 10 is configured to be capable of running by unmanned driving. The “unmanned driving” means driving independent of running operation by a passenger. The running operation means operation relating to at least one of “run,” “turn,” and “stop” of the vehicle 10. The unmanned driving is realized by automatic remote control or manual remote control using a device provided outside the vehicle 10, or by autonomous control by the vehicle 10. A passenger not involved in running operation may be on board a vehicle running by the unmanned driving. The passenger not involved in running operation includes a person simply sitting in a seat of the vehicle 10 and a person doing work different from running operation, such as assembly, inspection, or operation of switches, while on board the vehicle 10. Driving by running operation by a passenger may also be called “manned driving.”
In the present specification, the “remote control” includes “complete remote control” by which all motions of the vehicle 10 are completely determined from outside the vehicle 10, and “partial remote control” by which some of the motions of the vehicle 10 are determined from outside the vehicle 10. The “autonomous control” includes “complete autonomous control” by which the vehicle 10 controls a motion of the vehicle 10 autonomously without receiving any information from a device outside the vehicle 10, and “partial autonomous control” by which the vehicle 10 controls a motion of the vehicle 10 autonomously using information received from a device outside the vehicle 10.
The imaging device 9 acquires the original image by capturing an imaging range RG including the vehicle 10 subjected to position calculation from outside of the vehicle 10. In the present embodiment, the imaging device 9 transmits the acquired original image to the detection device 5, together with camera identification information and the acquisition time of the original image. The camera identification information is information, such as a camera ID, that identifies each of a plurality of imaging devices 9. The original image is a two-dimensional image made up of pixels arranged in the XcYc plane of the camera coordinate system. The camera coordinate system is a coordinate system that uses the focal point of the imaging device 9 as the origin and has coordinate axes defined by the Xc axis and the Yc axis orthogonal to the Xc axis. The original image contains at least two-dimensional data of the vehicle 10 subjected to the position calculation. The original image is preferably a color image, but may also be a gray image. The imaging device 9 is, for example, a camera equipped with an imaging element such as a CCD image sensor or a CMOS image sensor, and an optical system.
In the present embodiment, the imaging device 9 acquires original images of a track 2 and the vehicle 10 running on the track 2 from an overhead view. The location and number of the imaging devices 9 are determined in consideration of the imaging range RG (angle of view) or the like of each imaging device 9 so that the entire track 2 is captured by the one or more imaging devices 9. Specifically, the imaging devices 9 are installed so that a first imaging range RG1, which is the imaging range RG of a first imaging device 901, and a second imaging range RG2, which is the imaging range RG of a second imaging device 902, overlap. The first imaging device 901 and the second imaging device 902 are adjacent to each other. Further, each imaging device 9 is installed at a position where an image of a predetermined positioning point 10e, which is a specific portion of the vehicle 10 running on the track 2, can be captured. In the present embodiment, the positioning point 10e is the rear end of the left side surface (hereinafter referred to as the left rear end) of the vehicle 10. The positioning point 10e may be a portion other than the left rear end of the vehicle 10. The imaging device 9 may also acquire information from the front, rear, side, etc. of the vehicle 10, in addition to the information from above the vehicle 10.
The vehicle 10 has a manned driving mode and a remote unmanned driving mode. In the manned driving mode, the running of the vehicle 10 is achieved by the manned driving described above. Specifically, in the manned driving mode, the running conditions of the vehicle 10 are generated by the driver on the vehicle 10 operating the steering wheel, the accelerator, and other input devices provided in the vehicle 10. By such operations, the vehicle 10 runs according to the generated running conditions. The running conditions mean conditions that specify the running operation of the vehicle 10. The running conditions include, for example, the running route, position/location, running speed, acceleration, and steering angle of the wheels of the vehicle 10. The remote unmanned driving mode has a remote manual driving mode and a remote automatic driving mode. In the remote manual driving mode, the running conditions of the vehicle 10 are generated by an operator operating an operator input device provided at a location different from that of the vehicle 10. The vehicle 10 thereby receives the running conditions generated by the operator input device and runs according to the received running conditions. The operator here corresponds to an external operator, which is described later. In the remote automatic driving mode, a remote control device 7 provided at a location different from that of the vehicle 10 creates control values that specify running operation of the vehicle 10 and transmits the control values to the vehicle 10. The vehicle 10 then receives the control values and performs automatic running according to the received control values. Specifically, in the remote automatic driving mode in the present embodiment, the remote control device 7 generates running control signals, which are described later, and transmits the running control signals to the vehicle 10. The running control signals contain control values that specify running operation, and represent running conditions.
The vehicle 10 runs in the remote automatic driving mode, for example, in a factory where the vehicle 10 is produced through a plurality of production steps. The factory is not limited to one present in a single building or at one property or one address. The factory may extend across multiple buildings, multiple properties, multiple addresses, and the like. At this time, the vehicle 10 may run not only on private roads but also on public roads. The vehicle 10 may also run outside the factory in the remote unmanned driving mode.
The vehicle 10 is, for example, an electric vehicle, a hybrid vehicle, a fuel cell vehicle, a gasoline vehicle, or a diesel vehicle. The vehicle 10 may be a private vehicle such as a passenger car, and may also be a business-purpose vehicle such as a truck, a bus, a construction vehicle, and the like. The vehicle 10 may be at least one of a completed product, a semi-finished product, and an in-process product.
The vehicle 10 includes a driving device 110 for accelerating the vehicle 10, a steering device 120 for changing the traveling direction of the vehicle 10, and a braking device 130 for decelerating the vehicle 10. The driving device 110, the steering device 120, and the braking device 130 each include various actuators for causing the vehicle 10 to run. The vehicle 10 also includes a vehicle communication unit 140 for enabling communication with external devices via wireless communication or the like, a vehicle control device 150, and an on-vehicle sensor group 160. The external devices refer to devices other than the own vehicle 10, such as the detection device 5, the position calculation device 6, the remote control device 7, the imaging device 9, and other vehicles 10. The vehicle communication unit 140 is, for example, a wireless communication device. The vehicle communication unit 140 communicates with the external devices connected to a network Nt, for example, via an access point in the factory. The vehicle control device 150 has a CPU, a storage unit, and an input/output interface. In the vehicle control device 150, the CPU, the storage unit, and the input/output interface are connected to one another, for example, via an internal bus or an interface circuit. The input/output interface communicates with internal devices, such as the driving device 110, mounted on the own vehicle 10. The input/output interface is communicatively connected to the vehicle communication unit 140. The on-vehicle sensor group 160 includes, for example, an on-vehicle camera, an on-vehicle radar, and an on-vehicle LiDAR as sensors that acquire surrounding information indicating the state of the surrounding area of the vehicle 10. The on-vehicle camera captures the state of the surrounding area of the vehicle 10. The on-vehicle radar and the on-vehicle LiDAR detect objects present in the surrounding area of the vehicle 10. The structure of the vehicle 10 is not limited to that described above.
The communication unit 51 of the detection device 5 communicatively connects other devices, such as the vehicle control device 150, the position calculation device 6, the remote control device 7, the imaging device 9, and the like, with the detection device 5. The communication unit 51 of the detection device 5 is, for example, a wireless communication device.
The storage unit 53 of the detection device 5 stores various types of information including various programs that control the operation of the detection device 5, a first detection model Md1 as a detection model, an identification model Md3, and a distortion correction parameter Pa1. The storage unit 53 of the detection device 5 includes, for example, RAM, ROM, and a hard disk drive (HDD).
The detection model is a trained machine learning model used to detect the vehicle 10 included in a captured image. In the present embodiment, the storage unit 53 of the detection device 5 stores, as detection models, a plurality of first detection models Md1 prepared for the respective body types. In the present embodiment, the body type is one of a plurality of types into which vehicles 10 are classified according to the exterior shape of the vehicle 10. In this case, the vehicle 10 is classified into one body type depending on its vehicle class (also called “vehicle body” or “body size”), which is determined by, for example, the overall length, the vehicle width, and the vehicle height of the vehicle 10, as well as the shape of the vehicle 10. In the present embodiment, the body types include, for example, “SUV,” “sedan,” “station wagon,” “minivan,” “one-box,” “compact car,” and “light vehicle.” The body types are not limited to those listed above.
The first detection model Md1 is a machine learning model trained by inputting a first training data set. The first training data set is prepared for each body type. The first training data set has a plurality of training images that include the vehicle 10, and region correct answer labels that are associated with each of the plurality of regions constituting each training image. Each region correct answer label is a correct answer label indicating whether each region in the training image is a target region indicating the vehicle 10 or an out-of-target region indicating other than the vehicle 10. In the present embodiment, each region that constitutes a training image is a single pixel of the training image. Each region that constitutes the training image may instead be a group of multiple pixels of the training image. Each region that constitutes a training image is classified into either a target region or an out-of-target region, for example, according to the result of comparing the RGB values of adjacent regions in the training image. Each region that constitutes a training image may also be classified into either a target region or an out-of-target region according to a calculated probability of being a target region. When a region correct answer label is added to a region of a training image, it is associated with information indicating the position, in the training image, of the region to which the region correct answer label is added. If each region that constitutes the training image is a single pixel, the information indicating its position in the training image is, for example, pixel coordinates indicating the pixel position in the training image.
The first training data set includes, as the plurality of training images, M (M is an integer of 2 or more) first training images and N (N is an integer of 0 or more and less than M) second training images. The first training image is an image containing a vehicle 10 that is classified into the one body type. The second training image is an image containing a vehicle 10 that is classified into another body type different from the one body type. The number M of the first training images in the first training data set may be, for example, at least 1.5 times or at least 2 times the number N of the second training images. When there are a plurality of other body types different from the one body type, the number M of the first training images in the first training data set is, for example, greater than the total number of the second training images containing vehicles 10 classified into those other body types. In other words, the first detection model Md1 is a machine learning model that is trained with more first training images than second training images, so that the first detection model Md1 preferentially learns the features of the vehicle 10 classified into the one body type.
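The following is a minimal, non-limiting sketch of assembling one first training data set so that the number M of first training images exceeds the number N of second training images; the dictionary of image paths grouped by body type and the ratio used are assumptions for illustration.

```python
import random

def build_first_training_set(images_by_type, one_body_type, ratio=2.0, seed=0):
    """Return training image paths dominated by the given body type (M >= ratio * N)."""
    rng = random.Random(seed)
    first = list(images_by_type[one_body_type])                  # M first training images
    others = [p for t, ps in images_by_type.items() if t != one_body_type for p in ps]
    n_max = int(len(first) / ratio)                              # keep N at most M / ratio
    second = rng.sample(others, min(n_max, len(others)))         # N second training images (N < M)
    return first + second

# Example usage with illustrative path lists:
# training_paths = build_first_training_set({"SUV": suv_paths, "sedan": sedan_paths}, "SUV")
```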
In the present embodiment, when a captured image is input, the first detection model Md1 identifies a target region(s) from among the regions that constitute the input captured image. The first detection model Md1 then masks the target region to generate a first mask image in which a mask region is added to the target region.
The algorithm of the first detection model Md1 is, for example, a deep neural network (hereinafter referred to as “DNN”) with the structure of a convolutional neural network (hereinafter referred to as “CNN”) for implementing semantic segmentation or instance segmentation. The structure of the first detection model Md1 is not limited to those described above. The first detection model Md1 may be, for example, a trained machine learning model with an algorithm other than a neural network.
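The following is a minimal, non-limiting sketch of applying a segmentation-type first detection model Md1 to a processed image to obtain the first mask image; the callable standing in for the trained network and the 0.5 threshold are assumptions for illustration.

```python
import numpy as np

def generate_first_mask_image(md1, processed_image: np.ndarray) -> np.ndarray:
    """Classify each pixel as target / out-of-target and mask the target region."""
    probs = md1(processed_image)                 # per-pixel probability of being the target region
    mask = (probs >= 0.5).astype(np.uint8)       # 1 = target region (vehicle), 0 = out-of-target region
    first_mask_image = np.zeros_like(mask)
    first_mask_image[mask == 1] = 255            # mask region added to the target region
    return first_mask_image
```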
The identification model Md3 is a trained machine learning model used to identify the one body type of the vehicle 10 included in the captured image and to acquire type information indicating the identified body type. Specifically, the identification model Md3 is a machine learning model that is trained to output the type information regarding the vehicle 10 included in the captured image when the captured image is input. In order to identify the one body type of the vehicle 10, the identification model Md3 is trained to learn features that depend on the body type. The features that depend on the body type include, for example, the shape and the vehicle class of the vehicle 10. For example, a CNN is used as the algorithm of the identification model Md3. The structure of the identification model Md3 is not limited to that described above. The identification model Md3 may be, for example, a trained machine learning model with an algorithm other than a neural network.
The distortion correction parameter Pa1 is a parameter used to correct the distortion of captured images. The details of the distortion correction parameter Pa1 are described later.
The image acquisition unit 521 acquires an original image from the imaging device 9, which is an external device.
The distortion correction unit 522 corrects distortions of captured images. In the present embodiment, the distortion correction unit 522 generates a corrected image that is obtained by correcting distortions of the original image.
The rotation processing unit 523 rotates a captured image so that the vector indicating the movement direction of the vehicle 10 (hereinafter referred to as “a movement vector”) faces a predetermined direction. In the present embodiment, the rotation processing unit 523 generates a rotated image by rotating a corrected image.
The trimming unit 524 deletes, from a captured image, the regions (hereinafter referred to as “unnecessary regions”) other than the region constituted by the vehicle 10 and the surrounding area of the vehicle 10 (hereinafter referred to as the “necessary region”) among the respective regions that constitute the captured image. In this way, the trimming unit 524 cuts out the necessary region from the captured image. In the present embodiment, when the vehicle 10 has moved a distance exceeding a predetermined threshold, the trimming unit 524 deletes, from the rotated image, a post-movement region (unnecessary region) corresponding to the distance that the vehicle 10 has moved. In this way, the trimming unit 524 generates a processed image that is obtained by cutting out a pre-movement region (necessary region) including the vehicle 10 from the rotated image.
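The following is a minimal, non-limiting sketch of the preprocessing performed by the distortion correction unit 522, the rotation processing unit 523, and the trimming unit 524, assuming OpenCV and a pinhole camera model; the parameter names and the choice of the predetermined direction are illustrative.

```python
import cv2
import numpy as np

def preprocess(original, camera_matrix, dist_coeffs, movement_angle_deg, crop_box):
    """Distortion correction, rotation to a predetermined direction, and trimming (sketch)."""
    # Distortion correction using camera intrinsics as the distortion correction parameter.
    corrected = cv2.undistort(original, camera_matrix, dist_coeffs)
    # Rotate so that the movement vector faces a predetermined direction (here assumed to be 90 degrees).
    h, w = corrected.shape[:2]
    rot = cv2.getRotationMatrix2D((w / 2, h / 2), 90.0 - movement_angle_deg, 1.0)
    rotated = cv2.warpAffine(corrected, rot, (w, h))
    # Trim the necessary region (the vehicle and its surroundings) out of the rotated image.
    x, y, bw, bh = crop_box
    return rotated[y:y + bh, x:x + bw]
```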
The type acquisition unit 525 acquires type information indicating one body type of the vehicle 10 included in the captured image. In the present embodiment, the type acquisition unit 525 inputs the processed image to the identification model Md3 to acquire the type information indicating one body type of the vehicle 10 included in the captured image. The method of obtaining the type information is not limited to that described above.
The model acquisition unit 526 acquires the first detection model Md1 selected according to one body type identified by the type information acquired by the type acquisition unit 525. In the present embodiment, the model acquisition unit 526 selects and acquires a first detection model Md1 according to one body type identified by the type information acquired by the type acquisition unit 525, from among a plurality of first detection models Md1 prepared for the respective one body types stored in the storage unit 53 of the detection device 5. In other words, the model acquisition unit 526 acquires a first detection model Md1 that has been trained with more features of the vehicle 10 classified into one body type identified by the type information acquired by the type acquisition unit 525 than features of the vehicles 10 classified into body types other than the one body type. For example, if the body type of the vehicle 10 included in the captured image is identified as “SUV” according to the type information acquired by the type acquisition unit 525, the model acquisition unit 526 acquires a first detection model Md1a that has been trained with more vehicles 10 classified into “SUV” than the vehicles 10 classified into other body types.
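The following is a minimal, non-limiting sketch of selecting the first detection model Md1 prepared for the identified body type; the mapping from body types to model files and the loader function are assumptions for illustration.

```python
# Illustrative mapping from body type to a stored first detection model Md1.
FIRST_DETECTION_MODELS = {
    "SUV": "md1_suv.onnx",
    "sedan": "md1_sedan.onnx",
    # one entry per body type ...
}

def acquire_first_detection_model(body_type: str, load_model=lambda path: path):
    """Return the first detection model trained mainly on the given body type."""
    try:
        return load_model(FIRST_DETECTION_MODELS[body_type])
    except KeyError:
        raise ValueError(f"No first detection model prepared for body type: {body_type}")
```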
The detection unit 527 detects the vehicle 10 included in the captured image by inputting the captured image to the first detection model Md1 acquired by the model acquisition unit 526 and identifying a target region(s) from among the regions that constitute the captured image. In the present embodiment, the detection unit 527 inputs the processed image to the first detection model Md1 acquired by the model acquisition unit 526. This identifies the target region from among the regions constituting the processed image, thereby generating the first mask image from the processed image.
The first transmission unit 528 transmits various types of information to other devices than the detection device 5. The first transmission unit 528 transmits, for example, the detection result by the detection unit 527 to the position calculation device 6. In the present embodiment, the detection result by the detection unit 527 is the first mask image. At least some of the functions of the detection device 5 may be implemented as a function of one of the vehicle control device 150, the position calculation device 6, the remote control device 7, and the imaging device 9.
The communication unit 61 of the position calculation device 6 communicatively connects the vehicle control device 150, the detection device 5, the remote control device 7, and the imaging device 9 with the position calculation device 6. The communication unit 61 of the position calculation device 6 is, for example, a wireless communication device.
The storage unit 63 of the position calculation device 6 stores various types of information including various programs that control the operation of the position calculation device 6, a perspective transformation parameter Pa2, and a camera database Db. The storage unit 63 of the position calculation device 6 includes, for example, RAM, ROM, and a hard disk drive (HDD).
The perspective transformation parameter Pa2 is a parameter used for perspective transformation of the first mask image. The details of the perspective transformation parameter Pa2 are described later.
The camera database Db is a database indicating imaging parameters calculated based on the installation position of the imaging device 9 in the global coordinate system for each imaging device 9. Each imaging parameter is a parameter regarding the distance of the imaging device 9 from a predefined reference point. In the present embodiment, the imaging parameter is a height H of the imaging device 9 from a road surface 20.
The CPU 62 of the position calculation device 6 functions as a data acquisition unit 621, a perspective transformation unit 622, a coordinate point calculation unit 623, a position transformation unit 624, and a second transmission unit 625 by loading and executing various programs stored in the storage unit 63 of the position calculation device 6.
The data acquisition unit 621 acquires various types of information. The data acquisition unit 621, for example, acquires the detection result by the detection device 5 from the detection device 5. In the present embodiment, the data acquisition unit 621 acquires the first mask image from the detection device 5. The data acquisition unit 621 acquires the imaging parameter for the imaging device 9 from which the original image to be analyzed has been acquired by referring to the camera database Db stored in the storage unit 63 of the position calculation device 6.
The perspective transformation unit 622 generates a second mask image obtained by performing perspective transformation on the first mask image. The coordinate point calculation unit 623 corrects a first coordinate point using a second coordinate point, thereby calculating the local coordinate point. The first coordinate point is a coordinate point in the local coordinate system of the specified vertex of a first bounding rectangle set in the mask region in the first mask image. The second coordinate point is a coordinate point in the local coordinate system of a vertex indicating the same position as the first coordinate point from among the vertices of a second bounding rectangle set in the mask region in the second mask image. The local coordinate point is a coordinate point indicating the position of the vehicle 10 in the local coordinate system. The position transformation unit 624 transforms the local coordinate point into a vehicle coordinate point using the imaging parameter acquired by the data acquisition unit 621 and the local coordinate point calculated by the coordinate point calculation unit 623. The vehicle coordinate point is a coordinate point indicating the position of the vehicle 10 in the global coordinate system.
The second transmission unit 625 transmits various types of information to other devices than the position calculation device 6. For example, the second transmission unit 625 transmits the vehicle coordinate point to the remote control device 7 as information indicating the position of the vehicle 10. At least some of the functions of the position calculation device 6 may be implemented as a function of one of the vehicle control device 150, the detection device 5, the remote control device 7, and the imaging device 9.
The CPU 72 of the remote control device 7 functions as an information acquisition unit 721, a control value generation unit 722, and a third transmission unit 723 by loading and executing the various programs stored in the storage unit 73 of the remote control device 7.
The information acquisition unit 721 acquires various types of information. The information acquisition unit 721 acquires, for example, information (hereinafter referred to as “running information”) regarding running conditions of the vehicle 10. The running information includes, for example, the vehicle coordinate point transmitted from the position calculation device 6 as information indicating the position of the vehicle 10, running speed and actual steering angle of the vehicle 10 transmitted from the vehicle control device 150, and running route information stored in advance in the storage unit 73 of the remote control device 7. The running route information is information indicating the target running route of the vehicle 10 that runs in the remote automatic driving mode. The type of information included in the running information is not limited to those described above.
The control value generation unit 722 generates a control value that specifies the running operation of the vehicle 10 using the running information acquired by the information acquisition unit 721. Specifically, the control value generation unit 722 generates, for example, a reference control value and a modified control value. The reference control value is a control value for causing the vehicle 10 to run along the target running route. The modified control value is a control value for modifying the position of the vehicle 10 relative to the target running route. The reference control value and the modified control value include, for example, an acceleration control value that defines the acceleration of the vehicle 10 in the forward direction and a steering angle control value that defines the steering angle of the vehicle 10, respectively. The reference control value and the modified control value each may be a control value that includes either a trajectory control value or a destination control value, instead of the acceleration control value and the steering angle control value. The trajectory control value is a control value that defines the running trajectory of the vehicle 10 by arranging the target running positions of the vehicle 10 at predetermined times in chronological order. The destination control value is a control value indicating the target arrival time of the vehicle 10 at the target location.
The third transmission unit 723 transmits various types of information to other devices than the remote control device 7. The third transmission unit 723 transmits, for example, the control value generated by the control value generation unit 722 to the vehicle 10 subjected to the control. At least some of the functions of the remote control device 7 may be implemented as a function of one of the vehicle control device 150, the detection device 5, the position calculation device 6, and the imaging device 9.
After the type acquisition step, a model acquisition step (step S6) is performed. The model acquisition step is a step of acquiring the first detection model Md1 selected according to one body type identified by the type information acquired in the type acquisition step, from among a plurality of first detection models Md1 prepared for the respective one body types. In the present embodiment, in the model acquisition step, the model acquisition unit 526 selects and acquires the first detection model Md1 according to one body type identified by the type information acquired in the type acquisition step, from among a plurality of first detection models Md1 stored in advance in the storage unit 53 of the detection device 5. It is sufficient that the model acquisition step is performed at an arbitrary time point after the type acquisition step is completed and before the detection step is started.
After the model acquisition step, a detection step is performed.
In the coordinate point calculation step, the coordinate point calculation unit 623 calculates a base coordinate point P0 from a first bounding rectangle R1 set in the mask region Ms in the first mask image Im5, which is the image before the perspective transformation.
Further, the coordinate point calculation unit 623 sets a second bounding rectangle R2 with respect to the mask region Ms in the second mask image Im6 obtained by perspective transformation of the first mask image Im5. Then, the coordinate point calculation unit 623 sets, as a second coordinate point P2, a vertex indicating the same position as the first coordinate point P1 from among the vertices of the second bounding rectangle R2. In other words, the first coordinate point P1 and the second coordinate point P2 are correlated with each other, as they are coordinate points indicating the same position.
Furthermore, the coordinate point calculation unit 623 performs a correction to replace the coordinates (Xi1, Yi1) of the first coordinate point P1 with the coordinates (Xi2, Yi2) of the second coordinate point P2 according to the relative magnitude between the coordinate values of the first coordinate point P1 and the second coordinate point P2. When the coordinate value Xi1 in the Xi direction of the first coordinate point P1 is greater than the coordinate value Xi2 in the Xi direction of the second coordinate point P2 (Xi1>Xi2), the coordinate point calculation unit 623 replaces the coordinate value Xi1 in the Xi direction of the first coordinate point P1 with the coordinate value Xi2 in the Xi direction of the second coordinate point P2. When the coordinate value Yi1 in the Yi direction of the first coordinate point P1 is greater than the coordinate value Yi2 in the Yi direction of the second coordinate point P2 (Yi1>Yi2), the coordinate point calculation unit 623 replaces the coordinate value Yi1 in the Yi direction of the first coordinate point P1 with the coordinate value Yi2 in the Yi direction of the second coordinate point P2.
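The following is a minimal, non-limiting sketch of the coordinate point calculation described above, in which bounding rectangles are set in the mask regions of the first and second mask images and the first coordinate point is corrected with the second coordinate point; treating the top-left vertex as the specified vertex is an assumption for illustration.

```python
import cv2
import numpy as np

def bounding_rect_vertex(mask_image: np.ndarray):
    """Return the top-left vertex (x, y) of the bounding rectangle of the mask region."""
    ys, xs = np.nonzero(mask_image)
    if len(xs) == 0:
        raise ValueError("mask region is empty")
    x, y, w, h = cv2.boundingRect(np.column_stack([xs, ys]).astype(np.int32))
    return x, y

def corrected_local_coordinate_point(first_mask_image, second_mask_image):
    xi1, yi1 = bounding_rect_vertex(first_mask_image)    # first coordinate point P1
    xi2, yi2 = bounding_rect_vertex(second_mask_image)   # second coordinate point P2
    # Replace a coordinate value of P1 with that of P2 when the value of P1 is larger.
    xi = xi2 if xi1 > xi2 else xi1
    yi = yi2 if yi1 > yi2 else yi1
    return xi, yi                                        # local coordinate point P3
```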
The position transformation unit 624 transforms the local coordinate point P3 into the vehicle coordinate point using relational expressions (formulae (1) to (3)) that include the vehicle coordinate point as the objective variable and the local coordinate point P3, the imaging parameter, and the vehicle parameter as the explanatory variables. In this case, the position transformation unit 624 substitutes the coordinate value of the local coordinate point P3 calculated by the coordinate point calculation unit 623 into the relational expressions. Further, the position transformation unit 624 substitutes the imaging parameter acquired by the data acquisition unit 621, i.e., the value of the imaging parameter corresponding to the imaging device 9 that acquired the original image Im1, into the relational expressions.
The observation error ΔD becomes larger as the observation distance Do becomes larger.
Next, when D represents the actual distance (hereinafter referred to as a first distance) between the position of the imaging device 9 and the position of the positioning point 10e of the vehicle 10, the first distance D can be expressed by formula (2).
In other words, the first distance D is determined by the observation distance Do, the height H of the imaging device 9 as the imaging parameter, and the height h of the positioning point 10e of the vehicle 10 as the vehicle parameter.
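The formulae (1) to (3) are not reproduced in this text. The following is therefore a minimal, non-limiting sketch of the distance correction under an assumed similar-triangles relation, in which a camera at height H observing the positioning point at height h overestimates the road-surface distance, so that the actual distance follows from the observed distance as D = Do × (H − h) / H; this relation is an assumption for illustration rather than the disclosed formula.

```python
def corrected_distance(observed_distance: float, camera_height: float, point_height: float) -> float:
    """Correct an observed road-surface distance Do into the actual distance D (assumed relation)."""
    if camera_height <= point_height:
        raise ValueError("camera must be higher than the positioning point")
    return observed_distance * (camera_height - point_height) / camera_height

# Example: camera 6 m above the road, positioning point 0.8 m above the road,
# observed distance 15 m -> corrected distance of 13 m.
print(corrected_distance(15.0, 6.0, 0.8))  # 13.0
```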
The estimated distance Dp can be calculated using the actual distance (hereinafter referred to as a third distance Dc) obtained from the fixed coordinate point Pf and the imaging coordinate point Pc, the local coordinate point P3, and the fixed coordinate point Pf. Therefore, the position transformation unit 624 can calculate a vehicle coordinate point Pv using the second distance Dt, which is obtained by correcting the estimated distance Dp using formula (3), and the fixed coordinate point Pf. The vehicle coordinate point Pv thus calculated is a coordinate point indicating the position of the vehicle 10 in the global coordinate system, and therefore corresponds to the position of the vehicle 10 in real space. The second transmission unit 625 transmits the vehicle coordinate point Pv to the remote control device 7.
In step S11, the server acquires vehicle location information using detection result output from an external sensor. The external sensor is located outside the vehicle 10. The vehicle location information is locational information as a basis for generating a running control signal. In the present embodiment, the vehicle location information includes the location and orientation of the vehicle 10 in a reference coordinate system of the factory. In the present embodiment, the reference coordinate system of the factory is a global coordinate system and a location in the factory can be expressed by X, Y, and Z coordinates in the global coordinate system. In the present embodiment, the external sensor is a camera that is disposed in the factory and outputs a captured image as detection result. In step S11, the server acquires the vehicle location information using the captured image acquired from the camera as the external sensor.
More specifically, in step S11, the server, for example, determines the outer shape of the vehicle 10 from the captured image, calculates the coordinates of a positioning point of the vehicle 10 in a coordinate system of the captured image, namely, in a local coordinate system, and converts the calculated coordinates to coordinates in the global coordinate system, thereby acquiring the location of the vehicle 10. The outer shape of the vehicle 10 in the captured image may be detected by inputting the captured image to a detection model using artificial intelligence, for example.
Specifically, in step S11 in the present embodiment, the detection method and the position calculation method described above are performed.
In step S12, the server determines a target location to which the vehicle 10 is to move next. In the present embodiment, the target location is expressed by X, Y, and Z coordinates in the global coordinate system. The memory of the server contains a reference route stored in advance as a route along which the vehicle 10 is to run. The route is expressed by a node indicating a departure place, a node indicating a way point, a node indicating a destination, and a link connecting nodes to each other. The server determines the target location to which the vehicle 10 is to move next using the vehicle location information and the reference route. The server determines the target location on the reference route ahead of a current location of the vehicle 10.
In step S13, the server generates a running control signal for causing the vehicle 10 to run toward the determined target location. In the present embodiment, the running control signal includes an acceleration and a steering angle of the vehicle 10 as parameters. The server calculates a running speed of the vehicle 10 from the transition of the location of the vehicle 10 and compares the calculated running speed with a target speed of the vehicle 10 determined in advance. If the running speed is lower than the target speed, the server generally determines an acceleration in such a manner as to accelerate the vehicle 10. If the running speed is higher than the target speed, the server generally determines an acceleration in such a manner as to decelerate the vehicle 10. If the vehicle 10 is on the reference route, the server determines a steering angle and an acceleration in such a manner as to prevent the vehicle 10 from deviating from the reference route. If the vehicle 10 is not on the reference route, in other words, if the vehicle 10 deviates from the reference route, the server determines a steering angle and an acceleration in such a manner as to return the vehicle 10 to the reference route. In other embodiments, the running control signal may include the speed of the vehicle 10 as a parameter instead of or in addition to the acceleration of the vehicle 10.
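The following is a minimal, non-limiting sketch of the control law described above: accelerate when the running speed is below the target speed, decelerate when above, and steer so as to return the vehicle 10 to the reference route; the proportional gains and the signed cross-track error are assumptions for illustration.

```python
def generate_running_control_signal(running_speed, target_speed, cross_track_error,
                                    speed_gain=0.5, steering_gain=2.0,
                                    max_accel=2.0, max_steer_deg=30.0):
    """Return (acceleration [m/s^2], steering angle [deg]) toward the target location."""
    # Acceleration: proportional to the speed error, clipped to an assumed actuator limit.
    accel = speed_gain * (target_speed - running_speed)
    accel = max(-max_accel, min(max_accel, accel))
    # Steering: proportional to the deviation from the reference route (0 when on the route).
    steer = -steering_gain * cross_track_error
    steer = max(-max_steer_deg, min(max_steer_deg, steer))
    return accel, steer

# Example: 1 m/s below the target speed and 0.5 m to the left of the route
# -> accelerate and steer back toward the route.
print(generate_running_control_signal(running_speed=9.0, target_speed=10.0, cross_track_error=-0.5))
```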
In step S14, the server transmits the generated running control signal to the vehicle 10. The server repeats the acquisition of vehicle location information, the determination of a target location, the generation of a running control signal, the transmission of the running control signal, and others in a predetermined cycle.
In step S15, the vehicle 10 receives the running control signal transmitted from the server. In step S16, the vehicle 10 controls an actuator of the vehicle 10 using the received running control signal, thereby causing the vehicle 10 to run at the acceleration and the steering angle indicated by the running control signal. The vehicle 10 repeats the reception of a running control signal and the control over the actuator in a predetermined cycle. According to the system 50 in the present embodiment, it becomes possible to move the vehicle 10 without using a transport unit such as a crane or a conveyor.
The information acquisition unit 721 of the remote control device 7 acquires running information including the vehicle coordinate point Pv (step S101). Then, the control value generation unit 722 generates a control value that specifies the running operation of the vehicle 10 using the running information (step S102). Then, the third transmission unit 723 transmits the control value to the vehicle 10 (step S103). The vehicle control device 150 mounted on the vehicle 10 drives the driving device 110 and the like according to the received control value (step S104). The steps S101 and S102 correspond to the process performed between the steps S11 and S13 described above.
According to the first embodiment described above, when the vehicle 10 is made to run automatically by remote control, the position calculation system 1 is capable of calculating the position of the vehicle 10 using the captured image including the vehicle 10. Specifically, the detection device 5 inputs the captured image to a detection model that is a trained machine learning model capable of detecting the vehicle 10 included in the captured image. This allows the detection device 5 to detect the vehicle 10 included in the captured image. It then allows the position calculation device 6 to calculate the vehicle coordinate point Pv indicating the position of the vehicle 10 included in the captured image, using the first mask image Im5 that is the detection result by the detection device 5.
Here, since the exterior shape of the vehicle 10 varies depending on the body type, the features extracted when the detection model is trained vary depending on the body type. Therefore, if a single detection model trained with the same number of training images for every body type is applied to all body types of the vehicle 10, the accuracy in the detection of the vehicle 10 may decrease for some body types. When the detection accuracy for some body types of the vehicle 10 decreases, the detection model can be retrained by adding training images that include the vehicles 10 of the body types whose detection accuracy has decreased, so as to improve the detection accuracy for those body types. However, by training the detection model with more training images containing some body types of the vehicle 10 than training images containing other body types, the features in the additional training images may be learned more intensively than other features. This can cause another difference in detection accuracy between body types. If the accuracy in the detection of the vehicle 10 in the captured image decreases, the accuracy in the calculation of the vehicle coordinate point Pv indicating the position of the vehicle 10 decreases. If the accuracy in the calculation of the vehicle coordinate point Pv decreases, the desired control value may not be generated.
Therefore, in the first embodiment described above, the first detection model Md1 is trained with more first training images containing vehicles 10 classified into one body type than the second training images containing vehicles 10 classified into other body types. As a result, the first detection model Md1 suitable for the detection of the vehicles 10 classified into one body type is prepared for each one body type. Further, according to the first embodiment described above, the detection device 5 is capable of acquiring the type information indicating one body type of the vehicle 10 included in the captured image, and acquiring the first detection model Md1 selected according to the one body type identified based on the acquired type information from a plurality of first detection models Md1 prepared for the respective one body types. The detection device 5 then inputs the captured image to the first detection model Md1 according to the body type of the vehicle 10 included in the captured image. This allows the detection device 5 to accurately detect the vehicle 10 included in the captured image. That is, by acquiring the first detection model Md1 suitable for the detection of the vehicle 10 classified into one body type, the respective regions that constitute the captured image can be accurately classified into target regions and out-of-target regions. This can suppress a decrease in accuracy in the detection of the vehicle 10 included in the captured image depending on the body type of the vehicle 10. As a result, it is possible to suppress the decrease in accuracy in the calculation of the vehicle coordinate point Pv. This reduces the difference between the position of the vehicle 10 calculated by the position calculation device 6 and the actual position of the vehicle 10. Thus, more appropriate control values can be generated.
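For illustration, one way such per-body-type model selection could be organized is sketched below. The dictionary registry, the helper name, and the callable model interface are assumptions and do not represent the actual implementation of the storage unit 53 or the model acquisition unit 526.

```python
from typing import Callable, Dict
import numpy as np

# A first detection model Md1 is represented here as any callable that maps a captured
# image to a per-pixel target-region mask; the registry keyed by body type is an assumption.
Md1Model = Callable[[np.ndarray], np.ndarray]

def acquire_and_detect(models_by_body_type: Dict[str, Md1Model],
                       captured_image: np.ndarray,
                       body_type: str) -> np.ndarray:
    """Model acquisition step followed by the detection step."""
    model = models_by_body_type[body_type]   # select Md1 according to the identified body type
    return model(captured_image)             # identify the target region indicating the vehicle 10
```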
Further, according to the first embodiment described above, the first training data set includes M (M is an integer of 2 or more) first training images and N (N is an integer of 0 or more and less than M) second training images. The first training data set may include only the first training images, with no second training images. In this case, for example, by increasing the number M of the first training images containing vehicles 10 classified into one body type, the detection accuracy of the first detection model Md1 can be improved. Alternatively, the first training data set may include the M first training images and N second training images, where N is less than M. In this case, since the first training data set includes the second training images, it is possible to improve the detection accuracy of, for example, the parts of the vehicle 10 that have a common exterior shape regardless of the body type.
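A minimal sketch of how a first training data set satisfying 2 ≤ M and 0 ≤ N < M might be assembled is shown below. The function name and the pairing of each image with its region correct answer label are illustrative assumptions.

```python
import random

def build_first_training_data_set(first_images, second_images, M: int, N: int):
    """Assemble a first training data set from M first training images (the one body type)
    and N second training images (other body types), with 2 <= M and 0 <= N < M.
    Each element is assumed to be an (image, region_correct_answer_label) pair."""
    if M < 2 or not (0 <= N < M):
        raise ValueError("M must be an integer of 2 or more and N must satisfy 0 <= N < M")
    data_set = random.sample(first_images, M) + random.sample(second_images, N)
    random.shuffle(data_set)
    return data_set
```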
Further, according to the first embodiment described above, when the vehicle 10 included in the captured image is detected, the distortion of the original image Im1 can be corrected to generate the corrected image Im2. In this way, the accuracy in the detection of the vehicle 10 included in the captured image can be further improved. This improves the accuracy in the calculation of the position of the vehicle 10.
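As an illustration of the distortion correction, the following sketch uses OpenCV's undistort function. The camera matrix and distortion coefficients are placeholder values and would in practice come from calibration of the imaging device 9.

```python
import cv2
import numpy as np

# Illustrative intrinsics; not the values used in the embodiment.
camera_matrix = np.array([[1000.0, 0.0, 960.0],
                          [0.0, 1000.0, 540.0],
                          [0.0, 0.0, 1.0]])
dist_coeffs = np.array([-0.30, 0.10, 0.0, 0.0, 0.0])  # k1, k2, p1, p2, k3

def correct_distortion(original_im1: np.ndarray) -> np.ndarray:
    """Generate the corrected image Im2 by removing lens distortion from the original image Im1."""
    return cv2.undistort(original_im1, camera_matrix, dist_coeffs)
```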
Further, according to the first embodiment described above, when the vehicle 10 included in the captured image is detected, the corrected image Im2 can be rotated so that the direction of the movement vector V of the vehicle 10 faces a predetermined direction to generate the rotated image Im3. In this way, the vehicle 10 included in the captured image can be detected with the direction of the movement vector V unified. In this way, the accuracy in the detection of the vehicle 10 included in the captured image can be further improved.
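A sketch of one possible rotation step is shown below. The choice of the image +x axis as the predetermined direction and the rotation sign convention are assumptions.

```python
import cv2
import numpy as np

def rotate_to_predetermined_direction(corrected_im2: np.ndarray,
                                      movement_vector: np.ndarray) -> np.ndarray:
    """Generate the rotated image Im3 so that the movement vector V of the vehicle 10
    faces a predetermined direction (here assumed to be the image +x axis)."""
    # Angle of V in image coordinates; depending on the y-axis convention,
    # the sign passed to getRotationMatrix2D may need to be flipped.
    angle_deg = float(np.degrees(np.arctan2(movement_vector[1], movement_vector[0])))
    h, w = corrected_im2.shape[:2]
    rotation = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), angle_deg, 1.0)
    return cv2.warpAffine(corrected_im2, rotation, (w, h))
```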
Further, according to the first embodiment described above, when the vehicle 10 included in the captured image is detected, the trimming to cut out the necessary regions including the vehicle 10 from the rotated image Im3 can be performed to generate the processed image Im4 in which unnecessary regions are eliminated. In this way, the vehicle 10 can be detected while eliminating other elements than the vehicle 10 subjected to the detection from the captured image. In this way, the accuracy in the detection of the vehicle 10 included in the captured image can be further improved.
Further, according to the first embodiment described above, when the vehicle 10 included in the captured image is detected, the trimming step to cut out the pre-movement region A1 including the vehicle 10 from the rotated image Im3 can be performed to generate the processed image Im4. In this way, the occupied region of the vehicle 10 in the captured image can be made larger than that in the case where the trimming step is not performed. This makes it easier to detect vehicles 10 that are more distant from the imaging device 9. Therefore, it is possible to improve the accuracy in the detection of vehicles 10 that are more distant from the imaging device 9.
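For illustration, the trimming step can be sketched as a simple array crop. The (x, y, width, height) representation of the pre-movement region A1 is an assumption.

```python
import numpy as np

def trim_region(rotated_im3: np.ndarray, region) -> np.ndarray:
    """Generate the processed image Im4 by cutting out the region including the vehicle 10
    (e.g., the pre-movement region A1) from the rotated image Im3.
    `region` is assumed to be (x, y, width, height) in pixels."""
    x, y, w, h = region
    return rotated_im3[y:y + h, x:x + w].copy()
```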
Further, according to the first embodiment described above, by inputting the processed image Im4 to the first detection model Md1, the target region can be masked from among the regions that constitute the processed image Im4, thereby generating the first mask image Im5 in which the mask region Ms is added to the target region. Here, according to the first embodiment described above, DNN with the CNN structure capable of semantic segmentation and instance segmentation can be used as the algorithm for the first detection model Md1. This suppresses a decrease in accuracy in the detection of the vehicle 10 due to the diversity of the out-of-target regions in the captured image.
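One illustrative way of generating the first mask image Im5 from the per-pixel output of a segmentation model is sketched below. The probability-map interface, the threshold, and the mask color are assumptions.

```python
import numpy as np

def make_first_mask_image(processed_im4: np.ndarray,
                          target_probability: np.ndarray,
                          threshold: float = 0.5) -> np.ndarray:
    """Generate the first mask image Im5 by adding the mask region Ms to the pixels
    classified as the target region. `target_probability` is assumed to be the per-pixel
    output of the first detection model Md1, and `processed_im4` a BGR color image."""
    target_region = target_probability >= threshold
    im5 = processed_im4.copy()
    im5[target_region] = (0, 0, 255)   # the mask color is an arbitrary choice for illustration
    return im5
```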
Further, according to the first embodiment described above, it is possible to generate the second mask image Im6 obtained by perspective transformation of the first mask image Im5. This allows for the transformation of the camera coordinate system to the local coordinate system.
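As an illustration of the perspective transformation, the sketch below maps the first mask image Im5 onto a road-surface (local-coordinate) view. The four point correspondences and the output size are placeholder values.

```python
import cv2
import numpy as np

# Illustrative correspondence between four points in the camera image and their
# positions on the road surface in the local coordinate system (placeholder values).
src_pts = np.float32([[400, 600], [1500, 600], [1800, 1000], [100, 1000]])
dst_pts = np.float32([[0, 0], [1000, 0], [1000, 800], [0, 800]])

def to_second_mask_image(first_mask_im5: np.ndarray) -> np.ndarray:
    """Generate the second mask image Im6 by perspective transformation of Im5,
    i.e., mapping the camera coordinate system onto the local coordinate system."""
    perspective_matrix = cv2.getPerspectiveTransform(src_pts, dst_pts)
    return cv2.warpPerspective(first_mask_im5, perspective_matrix, (1000, 800))
```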
Further, according to the first embodiment described above, by setting the first bounding rectangle R1 on the mask region Ms before the perspective transformation of the first mask image Im5, it is possible to calculate the base coordinate point P0, which is the vertex with the closest coordinates to the positioning point 10e of the vehicle 10 in the first bounding rectangle R1. Then, by performing perspective transformation of the first mask image Im5 after the calculation of the base coordinate point P0, the first coordinate point P1, which is the coordinate point corresponding to the base coordinate point P0, can be calculated. Furthermore, by setting the second bounding rectangle R2 on the mask region Ms of the second mask image Im6, it is possible to calculate the second coordinate point P2, which is the vertex with the closest coordinates to the positioning point 10e of the vehicle 10 in the second bounding rectangle R2. Then, by correcting the first coordinate point P1 using the second coordinate point P2, the local coordinate point P3 can be calculated. By thus comparing and correcting the coordinate points before and after the perspective transformation, the local coordinate point P3 can be calculated more accurately. This further improves the accuracy in the calculation of the position of the vehicle 10.
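The following sketch illustrates the bounding-rectangle processing described above. The weighted average used to correct the first coordinate point P1 with the second coordinate point P2 is only one plausible correction and is not the specific correction of the embodiment.

```python
import cv2
import numpy as np

def closest_rect_vertex(mask: np.ndarray, reference_pt: np.ndarray) -> np.ndarray:
    """Return the vertex of the bounding rectangle of `mask` closest to `reference_pt`."""
    ys, xs = np.nonzero(mask)
    x, y, w, h = cv2.boundingRect(np.column_stack([xs, ys]).astype(np.int32))
    corners = np.float32([[x, y], [x + w, y], [x, y + h], [x + w, y + h]])
    return corners[np.argmin(np.linalg.norm(corners - reference_pt, axis=1))]

def local_coordinate_point(mask_im5: np.ndarray, mask_im6: np.ndarray,
                           perspective_matrix: np.ndarray,
                           ref_pt_im5: np.ndarray, ref_pt_im6: np.ndarray,
                           alpha: float = 0.5) -> np.ndarray:
    """Sketch of calculating the local coordinate point P3 from P1 and P2."""
    p0 = closest_rect_vertex(mask_im5, ref_pt_im5)                          # base coordinate point P0 (from R1)
    p1 = cv2.perspectiveTransform(p0.reshape(1, 1, 2), perspective_matrix)[0, 0]  # first coordinate point P1
    p2 = closest_rect_vertex(mask_im6, ref_pt_im6)                          # second coordinate point P2 (from R2)
    # The embodiment corrects P1 using P2; a weighted average is one illustrative correction.
    return alpha * p1 + (1.0 - alpha) * p2                                  # local coordinate point P3
```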
Further, according to the first embodiment described above, by inputting the captured image to the identification model Md3, which is a trained machine learning model, the type information indicating one body type of the vehicle 10 included in the captured image can be acquired. That is, the type information regarding the vehicle 10 included in the captured image can be acquired by machine learning.
Further, according to the first embodiment described above, the imaging parameter regarding the imaging device 9 that acquired the original image Im1 can be acquired. Then, by substituting the calculated local coordinate point P3 and the acquired value of the imaging parameter into the relational expression that includes the vehicle coordinate point Pv as the objective variable and the local coordinate point P3, the imaging parameter, and the vehicle parameter as explanatory variables, the local coordinate point P3 can be transformed to the vehicle coordinate point Pv. This allows the local coordinate system to be transformed to the global coordinate system, thereby calculating the position of the positioning point 10e of the vehicle 10 in the global coordinate system as the vehicle coordinate point Pv.
Further, according to the first embodiment described above, the imaging parameter is the height H of the imaging device 9 from the road surface 20, which is calculated based on the position of the imaging device 9 in the global coordinate system. Further, the vehicle parameter is the height h of the positioning point 10e of the vehicle 10 from the road surface 20. In this way, the observation error ΔD can be calculated from the similarity between the imaging parameter and the vehicle parameter. Then, the local coordinate point P3 can be transformed into the vehicle coordinate point Pv using the calculated observation error ΔD.
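The exact relational expression is not reproduced here; the sketch below shows one plausible similar-triangles form in which the observation error is ΔD = D * h / H for an observed horizontal distance D, with the ground projection of the imaging device 9 in the local coordinate system assumed to be known. The final transformation into the global coordinate system is omitted.

```python
import numpy as np

def correct_with_observation_error(p3: np.ndarray, camera_ground_xy: np.ndarray,
                                   H: float, h: float) -> np.ndarray:
    """One plausible similar-triangles correction (an assumption, not the exact relational
    expression of the embodiment): a point at height h observed by a camera at height H
    appears dD = D * h / H too far along the line from the camera's ground projection,
    where D is the observed horizontal distance."""
    offset = p3 - camera_ground_xy
    D = float(np.linalg.norm(offset))
    if D == 0.0:
        return p3
    # Corrected ground position; the shift into the global coordinate frame is not shown.
    return camera_ground_xy + offset * (1.0 - h / H)
```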
Further, according to the first embodiment described above, the position of the vehicle 10 can be calculated without installing, in the vehicle 10, sensors, markers, transmitters/receivers, and the like, which are used for position calculation of the vehicle 10. Further, the position of the vehicle 10 can be calculated without mounting the position calculation device 6 on the vehicle 10. For this reason, the versatility of the position calculation system 1 can be increased.
In the present embodiment, the storage unit 53a of the detection device 5a stores one second detection model Md2 as a detection model. Similarly to the first detection model Md1, the second detection model Md2 is a trained machine learning model used to detect the vehicle 10 included in the captured image. The second detection model Md2 is a machine learning model trained by inputting a second training data set. The second training data set has a plurality of training images each containing vehicles 10 classified into different body types, region correct answer labels respectively associated with the plurality of regions constituting the training images, and type correct answer labels respectively associated with the plurality of training images. As in the first embodiment, each region correct answer label is a correct answer label indicating whether each region in the training image is a target region indicating the vehicle 10 or an out-of-target region indicating other than the vehicle 10. The type correct answer label is a correct answer label indicating one body type of the vehicle 10 in the training image. In the present embodiment, the second detection model Md2 performs the following processes when a captured image is input together with one body type identified by the type information acquired by the type acquisition unit 525. In this case, the second detection model Md2 identifies a target region(s) in the input captured image. The second detection model Md2 then masks the target region to generate the first mask image Im5 in which the mask region Ms is added to the target region. For example, CNN is used as the algorithm of the second detection model Md2. The structure of the second detection model Md2 is not limited to those described above. The second detection model Md2 may be, for example, a trained machine learning model with an algorithm other than a neural network.
In the present embodiment, in the model acquisition step (step S6a), the model acquisition unit 526a acquires the second detection model Md2. Then, in the detection step (step S7a), the detection unit 527a inputs, to the second detection model Md2, the processed image Im4 and one body type identified by the type information acquired by the type acquisition unit 525. In this way, the detection unit 527a detects the outer shape of the vehicle 10 included in the processed image Im4. The detection unit 527a then masks the target region in the processed image Im4 to generate the first mask image Im5 in which the mask region Ms is added to the target region.
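As an illustration of a detection model that accepts both the captured image and one body type, the following PyTorch sketch conditions a small segmentation network on a body-type embedding. The architecture and layer sizes are assumptions; the embodiment only requires that the second detection model Md2 output the target-region mask for the given inputs.

```python
import torch
import torch.nn as nn

class ConditionedDetector(nn.Module):
    """Illustrative stand-in for the second detection model Md2: a segmentation network
    that takes the captured image together with one body type as input."""
    def __init__(self, num_body_types: int, embed_dim: int = 8):
        super().__init__()
        self.type_embedding = nn.Embedding(num_body_types, embed_dim)
        self.net = nn.Sequential(
            nn.Conv2d(3 + embed_dim, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=1),   # per-pixel target-region logit
        )

    def forward(self, image: torch.Tensor, body_type_id: torch.Tensor) -> torch.Tensor:
        # Broadcast the body-type embedding over the spatial dimensions and
        # concatenate it to the image channels before segmentation.
        b, _, h, w = image.shape
        emb = self.type_embedding(body_type_id).view(b, -1, 1, 1).expand(b, -1, h, w)
        return torch.sigmoid(self.net(torch.cat([image, emb], dim=1)))
```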
According to the second embodiment described above, the second detection model Md2 is prepared as a detection model that is trained by associating a plurality of training images that include the vehicles 10 and type correct answer labels that indicate the body types of the vehicles 10 in the respective training images. This allows for accurate detection of the vehicle 10 included in the input captured image by inputting the captured image associated with the body type to the second detection model Md2. That is, the respective regions that constitute the captured image input to the second detection model Md2 can be accurately classified into target regions and out-of-target regions. This can suppress a decrease in accuracy in the detection of the vehicle 10 included in the captured image depending on the body type of the vehicle 10. As a result, it is possible to suppress the decrease in accuracy in the calculation of the vehicle coordinate point Pv. This reduces the difference between the position of the vehicle 10 calculated by the position calculation device 6 and the actual position of the vehicle 10. Thus, more appropriate control values can be generated.
As shown in
The third embodiment described above is also capable of suppressing the decrease in accuracy in the detection of the vehicle 10 included in the captured image depending on the body type of the vehicle 10. This allows the vehicle 10 to generate more appropriate control values for causing the vehicle 10 to run by autonomous control. In the step S901 in the present embodiment, the method shown in
The one body type may indicate a single type of the vehicle 10. The type of the vehicle 10 refers to, for example, the type of the vehicle 10 classified by the name given to the vehicles 10 with the same or common exterior shape. In this case, the storage unit 53 of the detection device 5 stores at least one of a plurality of first detection models Md1 prepared for each one type of the vehicle 10 and a single second detection model Md2 that has been trained by associating one type of the vehicle 10 with a training image. The first training image here is an image that includes the vehicle 10 classified into one type. The second training image is an image that includes the vehicle 10 classified into another vehicle type different from the one type. The type acquisition unit 525 of the detection device 5, 5a acquires type information indicating the type of the vehicle 10 included in the captured image. The model acquisition unit 526 acquires the first detection model Md1 selected according to one type identified by the type information acquired by the type acquisition unit 525, from among the plurality of first detection models Md1 prepared for the respective one types. According to such an embodiment, it is possible to more accurately grasp the differences in exterior shape of the vehicle 10 that vary depending on the type of the vehicle 10. This can improve the accuracy in the detection of the vehicle 10 included in the captured image.
The type acquisition unit 525 may acquire type information indicating one body type of the vehicle 10 included in the captured image using schedule information stored in advance in the storage unit 53 of the detection device 5. The schedule information is information indicating the scheduled time when the vehicle 10 classified into one body type runs in the imaging range RG, and is associated with each imaging range RG of the plurality of imaging devices 9 that acquire the original image Im1. In such an embodiment, it is possible to acquire the type information indicating one body type of the vehicle 10 included in the captured image using the schedule information. This allows the user to acquire the type information regarding the vehicle 10 included in the captured image without performing image analysis. Therefore, the processing load on the type acquisition unit 525 in the type acquisition step can be reduced.
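A minimal sketch of such a schedule-based type acquisition is shown below. The schedule format keyed by imaging range RG and the time-window representation are assumptions.

```python
from datetime import datetime, time
from typing import Optional

# Illustrative schedule information: for each imaging range RG, the time windows in which
# a vehicle of a given body type is scheduled to run (the actual format is not limited to this).
schedule_info = {
    "RG1": [(time(9, 0), time(9, 30), "sedan"),
            (time(9, 30), time(10, 0), "suv")],
}

def body_type_from_schedule(imaging_range: str, captured_at: datetime) -> Optional[str]:
    """Type acquisition without image analysis: look up which body type is scheduled
    to run in the imaging range at the capture time."""
    for start, end, body_type in schedule_info.get(imaging_range, []):
        if start <= captured_at.time() < end:
            return body_type
    return None
```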
The training images included in the first and second training data sets may be unprocessed images, such as the original image Im1, which were not processed through the distortion correction process, the rotation process, the trimming, and the like. In such an embodiment, image processing is not required in the preparation of the first and second training data sets. This reduces the processing load during the learning of the first detection model Md1 and the second detection model Md2.
The training image may be either the corrected image Im2, the rotated image Im3, or the processed image Im4. Such an embodiment makes it possible to improve the accuracy in the detection of the vehicle 10 when the captured image obtained by processing the original image Im1 is input to one of the first detection model Md1 and the second detection model Md2, as in the first embodiment and the second embodiment described above.
In the vehicle 10 detection methods shown in
In the vehicle 10 detection methods shown in
In the vehicle 10 detection methods shown in
The captured image may include a plurality of vehicles 10. In this case, the CPU 52, 52a of the detection device 5, 5a may include a deletion unit that deletes, for example, the mask region Ms of the vehicle 10 that is not subjected to the position calculation, from the first mask image Im5. The deletion unit, for example, finds a mask region Ms that exists outside the target recognition region as the mask region Ms of the vehicle 10 not subjected to the position calculation, from among the mask regions Ms generated in the detection step, and deletes it from the first mask image Im5. The target recognition region is, for example, a predetermined region where the vehicle 10 moves in the first mask image Im5. The predetermined region where the vehicle 10 moves is, for example, the region corresponding to the region of the grid line 21. The target recognition region is stored in advance in the storage unit 53, 53a of the detection device 5, 5a. In such an embodiment, it is possible to eliminate the influence of the vehicle 10 that is not subjected to the position calculation when a plurality of vehicles 10 are captured in the captured image. This improves the accuracy in the calculation of the position of the vehicle 10.
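The deletion unit could be sketched as follows. Treating any mask region Ms that does not overlap the target recognition region as being outside it, and the use of connected components, are assumptions made for illustration; both inputs are assumed to be binary single-channel images of the same size.

```python
import cv2
import numpy as np

def delete_out_of_range_masks(mask_im5: np.ndarray,
                              target_recognition_region: np.ndarray) -> np.ndarray:
    """Keep only the mask regions Ms that overlap the target recognition region
    (e.g., the region corresponding to the grid line 21) and delete the others."""
    num, labels = cv2.connectedComponents(mask_im5.astype(np.uint8))
    kept = np.zeros_like(mask_im5, dtype=np.uint8)
    for label in range(1, num):                       # label 0 is the background
        component = (labels == label)
        if np.any(component & (target_recognition_region > 0)):
            kept[component] = 1                       # this vehicle is subject to position calculation
    return kept
```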
The captured image may include a plurality of vehicles 10. In this case, a DNN that performs instance segmentation may be used as an algorithm for the first detection model Md1 and the second detection model Md2. In such an embodiment, the plurality of vehicles 10 included in the captured image can be classified, and the first mask image Im5 with masking for each vehicle 10 can be generated. This allows for selection of the vehicle 10 subjected to the position calculation when a plurality of vehicles 10 are included in the captured image, thereby allowing position calculation of the selected vehicle 10.
The position calculation device 6 may calculate the position of a stationary vehicle 10. When the position of a stationary vehicle 10 is calculated, the position calculation device 6 calculates the position of the vehicle 10 using, for example, instead of the direction of the movement vector V of the running vehicle 10, the initial vector direction of the vehicle 10 estimated from the captured image that is acquired first after the booting of the position calculation device 6. In such an embodiment, the position of the vehicle 10 can be calculated using the captured image even when the vehicle 10 is stopped.
At least some of the detection device 5, 5a, the position calculation device 6, and the remote control device 7 may be integrated into one unit. Furthermore, in an alternative embodiment, each unit of the detection device 5, the position calculation device 6, and the remote control device 7 may be implemented by, for example, cloud computing constituted of one or more computers. In such an embodiment, the structures of the detection device 5, 5a, the position calculation device 6, and the remote control device 7 can be changed as needed.
The image acquisition unit 521 of the detection device 5, 5a may acquire the original image Im1 via another external device (e.g., the remote control device 7) other than the imaging device 9, without directly acquiring the original image Im1 from the imaging device 9. Such an embodiment also allows the image acquisition unit 521 of the detection device 5, 5a to acquire the original image Im1 acquired by the imaging device 9.
The detection device 5, 5a may detect the vehicle 10 using the first detection model Md1 that is stored in a device other than the detection device 5, 5a and selected by a device other than the detection device 5, 5a. Such an embodiment also enables acquisition of the first detection model Md1 according to the body type of the vehicle 10.
In the above-described first embodiment, the server performs the processing from acquisition of vehicle location information to generation of a running control signal. By contrast, the vehicle 10 may perform at least part of the processing from acquisition of vehicle location information to generation of a running control signal. For example, embodiments (1) to (3) described below are applicable.
(1) The server may acquire vehicle location information, determine a target location to which the vehicle 10 is to move next, and generate a route from a current location of the vehicle 10 indicated by the acquired vehicle location information to the target location. The server may generate a route to the target location between the current location and a destination or generate a route to the destination. The server may transmit the generated route to the vehicle 10. The vehicle 10 may generate a running control signal in such a manner as to cause the vehicle 10 to run along the route received from the server and control an actuator using the generated running control signal.
(2) The server may acquire vehicle location information and transmit the acquired vehicle location information to the vehicle 10. The vehicle 10 may determine a target location to which the vehicle 10 is to move next, generate a route from a current location of the vehicle 10 indicated by the received vehicle location information to the target location, generate a running control signal in such a manner as to cause the vehicle 10 to run along the generated route, and control an actuator using the generated running control signal.
(3) In the foregoing embodiments (1) and (2), an internal sensor may be mounted on the vehicle 10, and detection result output from the internal sensor may be used in at least one of the generation of the route and the generation of the running control signal. The internal sensor is a sensor mounted on the vehicle 10. More specifically, the internal sensor might include a camera, LiDAR, a millimeter wave radar, an ultrasonic wave sensor, a GPS sensor, an acceleration sensor, and a gyroscopic sensor, for example. For example, in the foregoing embodiment (1), the server may acquire detection result from the internal sensor, and in generating the route, may reflect the detection result from the internal sensor in the route. In the foregoing embodiment (1), the vehicle 10 may acquire detection result from the internal sensor, and in generating the running control signal, may reflect the detection result from the internal sensor in the running control signal. In the foregoing embodiment (2), the vehicle 10 may acquire detection result from the internal sensor, and in generating the route, may reflect the detection result from the internal sensor in the route. In the foregoing embodiment (2), the vehicle 10 may acquire detection result from the internal sensor, and in generating the running control signal, may reflect the detection result from the internal sensor in the running control signal.
In the above-described embodiment in which the vehicle 10 runs by autonomous control, the vehicle 10 may be equipped with an internal sensor, and detection result output from the internal sensor may be used in at least one of generation of a route and generation of a running control signal. For example, the vehicle 10 may acquire detection result from the internal sensor, and in generating the route, may reflect the detection result from the internal sensor in the route. The vehicle 10 may acquire detection result from the internal sensor, and in generating the running control signal, may reflect the detection result from the internal sensor in the running control signal.
In the above-described embodiment in which the vehicle 10 runs by autonomous control, the vehicle 10 acquires vehicle location information using detection result from the external sensor. By contrast, the vehicle 10 may be equipped with an internal sensor, and the vehicle 10 may acquire vehicle location information using detection result from the internal sensor, determine a target location to which the vehicle 10 is to move next, generate a route from a current location of the vehicle 10 indicated by the acquired vehicle location information to the target location, generate a running control signal for running along the generated route, and control an actuator of the vehicle 10 using the generated running control signal. In this case, the vehicle 10 is capable of running without using any detection result from an external sensor. The vehicle 10 may acquire target arrival time or traffic congestion information from outside the vehicle 10 and reflect the target arrival time or traffic congestion information in at least one of the route and the running control signal.
In the above-described first embodiment, the server automatically generates a running control signal to be transmitted to the vehicle 10. By contrast, the server may generate a running control signal to be transmitted to the vehicle 10 in response to operation by an external operator existing outside the vehicle 10. For example, the external operator may operate an operating device that includes a display on which a captured image output from the external sensor is displayed, a steering wheel, an accelerator pedal, and a brake pedal for operating the vehicle 10 remotely, and a communication device for communicating with the server through wired communication or wireless communication, and the server may generate a running control signal responsive to the operation on the operating device.
In each of the above-described embodiments, the vehicle 10 is simply required to have a configuration to become movable by unmanned driving. The vehicle 10 may be embodied as a platform having the following configuration, for example. The vehicle 10 is simply required to include at least actuators and a controller. More specifically, in order to fulfill the three functions of "run," "turn," and "stop" by unmanned driving, the actuators may include a driving device, a steering device, and a braking device. The actuators are controlled by the controller that controls running of the vehicle 10. In order for the vehicle 10 to acquire information from outside for unmanned driving, the vehicle 10 is simply required to further include a communication device. Specifically, the vehicle 10 to become movable by unmanned driving is not required to be equipped with at least some of interior components such as a driver's seat and a dashboard, is not required to be equipped with at least some of exterior components such as a bumper and a fender, and is not required to be equipped with a bodyshell. In such cases, a remaining component such as a bodyshell may be mounted on the vehicle 10 before the vehicle 10 is shipped from a factory, or the vehicle 10 may be shipped from the factory without the remaining component and the remaining component such as a bodyshell may be mounted on the vehicle 10 afterward. Each component may be mounted on the vehicle 10 from any direction, such as from above, from below, from the front, from the back, from the right, or from the left. Alternatively, these components may be mounted from the same direction or from respective different directions. The location determination for the platform may be performed in the same way as for the vehicle 10 in the first embodiment.
The vehicle 10 may be manufactured by combining a plurality of modules. A module means a unit composed of one or more components grouped according to a configuration or function of the vehicle 10. For example, a platform of the vehicle 10 may be manufactured by combining a front module, a center module, and a rear module. The front module constitutes a front part of the platform, the center module constitutes a center part of the platform, and the rear module constitutes a rear part of the platform. The number of the modules constituting the platform is not limited to three and may be two or less, or four or more. In addition to or instead of the platform, any parts of the vehicle 10 different from the platform may be modularized. Various modules may include an arbitrary exterior component such as a bumper or a grill, or an arbitrary interior component such as a seat or a console. Not only the vehicle 10 but also any type of moving object may be manufactured by combining a plurality of modules. Such a module may be manufactured by joining a plurality of components by welding or using a fixture, for example, or may be manufactured by forming at least part of the module integrally as a single component by casting. A process of forming at least part of a module as a single component is also called Giga-casting or Mega-casting. Giga-casting makes it possible to form, as a single component, each part of a moving object that has conventionally been formed by joining multiple parts. The front module, the center module, or the rear module described above may be manufactured using Giga-casting, for example.
A configuration for realizing running of a vehicle by unmanned driving is also called a "Remote Control Auto Driving system." Conveying a vehicle using the Remote Control Auto Driving system is also called "self-running conveyance." Producing the vehicle using self-running conveyance is also called "self-running production." In self-running production, for example, at least part of the conveyance of vehicles is realized by self-running conveyance in a factory where the vehicle is manufactured.
In each of the above-described embodiments, some or all of functions and processes realized by software may be realized by hardware. Furthermore, some or all of functions and processes realized by hardware may be realized by software. For example, any type of circuit such as an integrated circuit or a discrete circuit may be used as hardware for realizing the functions described in each of the foregoing embodiments.
The present disclosure is not limited to the embodiments described above and is able to be implemented with various configurations without departing from the spirit thereof. For example, the technical features of any of the embodiment, the examples and the modifications corresponding to the technical features of each of the aspects described in SUMMARY may be replaced or combined appropriately, in order to solve part or all of the problems described above or in order to achieve part or all of the advantageous effects described above. When the technical features are not described as essential features in the present specification, they are able to be deleted as necessary.