This application is a U.S. National Phase of International Patent Application No. PCT/JP2020/039509 filed on Oct. 21, 2020, which claims priority benefit of Japanese Patent Application No. JP 2019-214975 filed in the Japan Patent Office on Nov. 28, 2019. Each of the above-referenced applications is hereby incorporated herein by reference in its entirety.
The present disclosure relates to an image processing device, an image processing method, and a program. More specifically, the present disclosure relates to an image processing device, an image processing method, and a program capable of, for example, accurately displaying a movement trajectory of a moving device such as a drone on an actually captured image.
In recent years, there has been a rapid increase in the use of drones that are small flight vehicles. For example, drones are provided with a camera and are used for capturing images of a landscape on the ground from above and for other kinds of processing. Further, delivery of packages using drones is also planned, and various experiments have been carried out therefor.
At present, in many countries, it is required to perform flight control of drones by operating a controller under human monitoring, that is, within human sight. However, in the future, many autonomous flying drones that do not require human monitoring, that is, many drones that autonomously fly from a departure point to a destination will be used.
Such autonomous flying drones fly from a departure point to a destination by using, for example, communication information with a control center or GPS position information.
A specific use form of autonomous flying drones is delivery of packages by drones. In a case where a package is delivered by a drone, it is expected that, when the estimated time of arrival of the drone carrying the package approaches, the user who has requested the delivery will want to look up at the sky to check the drone carrying the package addressed to the user and also to check its flight path or scheduled flight path.
Further, it is also expected that, even in a case where a drone is not delivering a package, there is a demand to check the flight path of the drone flying in the sky, for example, for safety confirmation.
Processing for satisfying such a user request is, for example, processing of capturing an image of the drone in the sky by using a camera of a camera-equipped user terminal such as a smartphone and displaying the captured image on a display unit while superimposing and displaying a flight path and scheduled flight path of the drone on the actually captured image of the drone.
That is, an augmented reality (AR) image in which a line indicating the flight path is superimposed on a real image of the drone is generated and displayed.
Information regarding the flight path and scheduled flight path of the drone can be transmitted from the drone or a control center that manages flight of the drone to the user terminal such as a smartphone by communication via a communication network.
The drone or the control center holds the information regarding the flight path and the scheduled flight path of the drone, that is, flight path information and can provide the flight path information for the user terminal such as a smartphone.
However, many drones perform position control using communication information of a GPS satellite. Position information obtained from the GPS satellite includes latitude information, longitude information, and altitude information. Many drones fly by using the above information and therefore perform position confirmation and flight control according to the NED coordinate system.
The NED coordinate system is a coordinate system in which north, east, and down are set as three axes.
The drone or the control center holds the flight path information that is the information regarding the flight path and scheduled flight path of the drone as path information (N, E, D) to which the NED coordinates are applied, and the path information according to the NED coordinate system is provided for the user terminal such as a smartphone.
Meanwhile, a camera-capturing image displayed on the user terminal such as a smartphone is image data according to the camera coordinate system set in accordance with an imaging direction of the camera.
An image position of the real image of the drone captured by the camera of the user terminal such as a smartphone can be specified as an image position on the camera coordinates. However, it is difficult to calculate which position on the NED coordinates the real image position of the drone corresponds to.
As described above, the position of the drone serving as the real image displayed on the user terminal such as a smartphone can be specified in the camera coordinate system, but the flight path information of the drone received from the drone or the control center is path position information specified in the NED coordinate system. Thus, there is a problem in that it is difficult to confirm which position on the camera coordinates this path position corresponds to.
As a result, in a case where the user terminal such as a smartphone attempts to receive the flight path information of the drone from the drone or the control center and display the flight path on the display unit on the basis of the received information, there arises a problem that an accurate path cannot be displayed.
Note that, for example, Patent Document 1 (Japanese Patent No. 5192598) is a related art that discloses a configuration in which a position and trajectory of an autonomous robot are AR displayed on an image captured by a fixed point camera such as a surveillance camera.
The disclosed configuration is such that an AR tag is attached to the autonomous robot or a work area of the autonomous robot, the AR tag is recognized from a camera-capturing image to generate one piece of reference coordinate information, and the generated reference coordinate information is used to identify a position and a path of the autonomous robot.
However, this configuration requires attaching the AR tag in the work area and, in addition, is only applicable within a limited work area. For a device that flies in the sky like a drone, an AR tag cannot be attached in the sky.
The present disclosure has been made in view of the above problems, for example, and an object thereof is to provide an image processing device, an image processing method, and a program capable of displaying a real image of a drone serving as a camera-capturing image on a user terminal such as a smartphone and accurately superimposing and displaying a flight path and scheduled flight path of the drone on the real image.
A first aspect of the present disclosure is
Further, a second aspect of the present disclosure is
Furthermore, a third aspect of the present disclosure is
Note that the program of the present disclosure is, for example, a program that can be provided in a computer-readable format by a storage medium or a communication medium for an information processing device or computer system that can execute various program codes. By providing such a program in a computer-readable format, processing according to the program is realized in the information processing device or computer system.
Other objects, features, and advantages of the present disclosure will be apparent from more detailed description based on embodiments of the present disclosure described later and the accompanying drawings. Note that, in this specification, a system is a logical set configuration of a plurality of devices, and is not limited to a system in which devices having respective configurations are in the same housing.
An embodiment of the present disclosure realizes a configuration capable of accurately displaying a flight path of a drone on an actually captured image of the drone.
Specifically, for example, the configuration includes a data processing unit that displays a moving path of a moving device such as a drone on a display unit that displays a camera-capturing image of the moving device. The data processing unit generates a coordinate conversion matrix for performing coordinate conversion processing of converting position information according to a first coordinate system, for example, the NED coordinate system indicating the moving path of the moving device into a second coordinate system, for example, the camera coordinate system capable of specifying a pixel position of a display image on the display unit and outputs, to the display unit, the moving path having position information according to the camera coordinate system generated by coordinate conversion processing to which the generated coordinate conversion matrix is applied.
This configuration can accurately display a flight path of a drone on an actually captured image of the drone.
Note that the effects described in this specification are merely examples and are not limiting; other additional effects may also be provided.
Hereinafter, details of an image processing device, an image processing method, and a program of the present disclosure will be described with reference to the drawings. Note that description will be made according to the following items.
First, a problem in processing of displaying path information of a drone will be described with reference to
As described above, at present, in many countries, it is required to perform flight control of drones by operating a controller under human monitoring, that is, within human sight. However, in the future, autonomous flying drones that do not require human monitoring, that is, drones that autonomously fly from a departure point to a destination will be used. Such autonomous flying drones fly from a departure point to a destination by using, for example, communication information with a control center or GPS position information.
A specific use form of autonomous flying drones is delivery of packages by drones. In a case where a package is delivered by a drone, it is expected that, when the estimated time of arrival of the drone carrying the package approaches, the user who has requested the delivery will want to look up at the sky to check the drone carrying the package addressed to the user and also to check its flight path or scheduled flight path.
Processing for satisfying such a user request is, for example, processing in which the user captures an image of the drone in the sky by using a camera of a camera-equipped user terminal such as a smartphone, and the captured image is displayed on a display unit while a flight path and scheduled flight path of the drone are superimposed and displayed on the image.
A specific example of this processing will be described with reference to
The user 1 directs the camera of the user terminal 10 toward the drone 20 in the sky and captures an image of the drone 20.
The image illustrated in
Lines indicating a flight path and scheduled flight path of the drone 20 are displayed on this real image as virtual images generated by a data processing unit of the user terminal 10.
That is, an augmented reality (AR) image in which a virtual line indicating the flight path is superimposed on the real image of the drone is generated and displayed.
The “flight path” in
It can be seen that the drone 20 plans to fly along the “scheduled flight path” from the current position, land in front of a house, and deliver a package addressed to the user.
The user terminal can receive information regarding the flight path and scheduled flight path of the drone from the drone or a control center.
As illustrated in
However, as described above, many drones perform position control using communication information of a GPS satellite. Position information obtained from the GPS satellite includes latitude information, longitude information, and altitude information. Many drones fly by using the above information and therefore perform position confirmation and control of a flight route by using the NED coordinate system in many cases.
The NED coordinates are coordinates in which north, east, and down are set as three axes.
The drone 20 or the drone management server 30 that is the control center illustrated in
Meanwhile, a camera-capturing image displayed on the user terminal 10 such as a smartphone is image data according to the camera coordinate system set in accordance with an imaging direction of the camera.
The data processing unit of the user terminal 10 such as a smartphone can specify an image position of the real image of the drone 20 captured by the camera as an image position on the camera coordinates. However, it is impossible to calculate which position on the NED coordinates the real image position of the drone 20 corresponds to.
As described above, the position of the drone 20 serving as the real image displayed on the user terminal 10 such as a smartphone can be specified in the camera coordinate system, but the flight path information of the drone 20 received from the drone 20 or the drone management server 30 is path position information specified in the NED coordinate system. Thus, it is impossible to accurately analyze which position on the camera coordinates this path position corresponds to.
As a result, in a case where the user terminal 10 such as a smartphone attempts to receive the flight path information of the drone 20 from the drone 20 or the drone management server 30 and display the flight path on the display unit on the basis of this received information, an accurate path cannot be displayed.
For example, as illustrated in
This is because, as illustrated in
As a result, as illustrated in
[2. Processing Executed by Image Processing Device of Present Disclosure]
Next, processing executed by the image processing device of the present disclosure will be described.
The image processing device of the present disclosure is, for example, the user terminal 10 such as a smartphone owned by the user and executes processing of accurately displaying path information of a drone according to the NED coordinate system (N, E, D) on a camera-capturing image according to the camera coordinate system (Xc, Yc, Zc), for example, a captured image of the drone.
The user terminal 10 serving as the image processing device of the present disclosure converts the flight path information of the drone 20 received from the drone 20 or the drone management server 30, that is, a flight path position in the NED coordinate system into position information in the camera coordinate system that is a coordinate system of the camera-capturing image. Thereafter, the user terminal displays, on the camera-capturing image, a line indicating the flight path converted into the position information in the camera coordinate system.
By performing the above processing, it is possible to accurately display the path information of the drone on the captured image of the drone.
First, a plurality of coordinate systems used in the processing executed by the image processing device of the present disclosure will be described with reference to
Many drones 20 perform position control using communication information of a GPS satellite. Position information obtained from the GPS satellite is latitude information, longitude information, and altitude information, and many drones fly by using those pieces of information and thus use the NED coordinate system.
The NED coordinate system is a coordinate system in which north, east, and down are set as three axes.
The drone 20 or the drone management server 30 such as the control center holds the flight path information that is the information regarding the flight path and scheduled flight path of the drone 20 as the path information (N, E, D) in the NED coordinate system, and the path information according to the NED coordinates is provided for the user terminal 10 such as a smartphone.
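As a reference, the conversion from a GPS fix (latitude, longitude, altitude) into a local NED position can be sketched as follows. This is a minimal Python illustration assuming a WGS-84 ellipsoid and a freely chosen reference origin; the function names are illustrative and are not part of the above description.

```python
import numpy as np

# WGS-84 constants
_A = 6378137.0          # semi-major axis [m]
_E2 = 6.69437999014e-3  # first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg, alt_m):
    """Convert latitude/longitude/altitude (WGS-84) to ECEF coordinates."""
    lat, lon = np.radians(lat_deg), np.radians(lon_deg)
    n = _A / np.sqrt(1.0 - _E2 * np.sin(lat) ** 2)  # prime vertical radius
    x = (n + alt_m) * np.cos(lat) * np.cos(lon)
    y = (n + alt_m) * np.cos(lat) * np.sin(lon)
    z = (n * (1.0 - _E2) + alt_m) * np.sin(lat)
    return np.array([x, y, z])

def geodetic_to_ned(lat_deg, lon_deg, alt_m, ref_lat_deg, ref_lon_deg, ref_alt_m):
    """Express a GPS fix as north/east/down [m] relative to a chosen reference origin."""
    lat0, lon0 = np.radians(ref_lat_deg), np.radians(ref_lon_deg)
    d_ecef = geodetic_to_ecef(lat_deg, lon_deg, alt_m) - geodetic_to_ecef(
        ref_lat_deg, ref_lon_deg, ref_alt_m)
    # Rotation from ECEF to the local NED frame at the reference origin
    r_ned_ecef = np.array([
        [-np.sin(lat0) * np.cos(lon0), -np.sin(lat0) * np.sin(lon0),  np.cos(lat0)],
        [-np.sin(lon0),                 np.cos(lon0),                 0.0],
        [-np.cos(lat0) * np.cos(lon0), -np.cos(lat0) * np.sin(lon0), -np.sin(lat0)],
    ])
    return r_ned_ecef @ d_ecef  # [north, east, down] in meters
```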
In the processing of the present disclosure, the world coordinate system is a coordinate system (SLAM coordinate system) applied to SLAM processing executed by the user terminal 10, that is, simultaneous localization and mapping (SLAM) processing in which localization of a camera position and creation of an environment map (mapping) are executed in parallel.
As described in a lower center part of
Therefore, if a position on the flight path indicated in the NED coordinate system can be converted into a position indicated in the camera coordinate system, the path can be accurately output to the display image shown in the camera coordinate system.
The example of
that is necessary for converting certain position information (Xw, Yw, Zw) in the world coordinate system into position information (Xc, Yc, Zc) in the camera coordinate system.
For one point (X) in a three-dimensional space in an upper center of
The position in the world coordinate system (SLAM coordinate system): WsPX
The position in the camera coordinate system: CPX
Here, a coordinate conversion matrix for converting the position (WsPX) in the world coordinate system (SLAM coordinate system) for the one point (x) in the three-dimensional space into the position (CPX) in the camera coordinate system is defined as CTWs.
As shown in a lower part of
${}^{C}P_{X} = {}^{C}T_{Ws} \cdot {}^{Ws}P_{X}$   (Expression 1)
Here, the coordinate conversion matrix CTWs for converting the position (WsPX) in the world coordinate system (SLAM coordinate system) into the position (CPX) in the camera coordinate system can be expressed by the following matrix (Expression 2).
Note that, in (Expression 2) above,
Note that the camera position corresponds to the position of the camera of the user terminal 10 in this embodiment.
Here, the coordinate conversion matrix CTWs in (Expression 2) above is a coordinate conversion matrix for converting the position (WsPX) in the world coordinate system (SLAM coordinate system) into the position (CPX) in the camera coordinate system.
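For illustration only, the following sketch shows how a 4×4 homogeneous matrix of the kind in (Expression 2) can be assembled from a camera rotation and a camera position and applied to a point as in (Expression 1). The element layout assumed here (position in the camera frame = R·(position in the world frame − camera position)) is a common convention and is an assumption, not a quotation of the expression itself.

```python
import numpy as np

def make_c_T_ws(r_c_ws, cam_pos_ws):
    """Build the 4x4 homogeneous matrix C_T_Ws from the world->camera rotation
    and the camera position expressed in world (SLAM) coordinates.
    Assumed layout: P_c = R * (P_ws - cam_pos), i.e. T = [[R, -R*cam_pos], [0, 1]]."""
    t = np.eye(4)
    t[:3, :3] = r_c_ws
    t[:3, 3] = -r_c_ws @ cam_pos_ws
    return t

def transform_point(t, p):
    """Apply a 4x4 homogeneous transform to a 3D point (Expression 1)."""
    ph = np.append(p, 1.0)   # homogeneous coordinates
    return (t @ ph)[:3]

# Usage: a point known in the world (SLAM) frame expressed in the camera frame
# c_p_x = transform_point(c_T_ws, ws_p_x)
```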
As described above with reference to
Note that each of the three coordinate conversion matrices can be calculated from the other two coordinate conversion matrices. For example,
Note that CTWs−1 denotes an inverse matrix of CTWs and can be calculated from CTWs.
Similarly to the above case,
Further, the coordinate conversion matrix CTNED for converting the position (NEDPX) in the NED coordinate system into the position (CPX) in the camera coordinate system can be calculated according to the following expression by using the other two coordinate conversion matrices (WsTNED, CTWs).
${}^{C}T_{NED} = {}^{C}T_{Ws} \cdot {}^{Ws}T_{NED}$
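A minimal sketch of this chaining, assuming the transforms are held as 4×4 homogeneous matrices with illustrative names:

```python
import numpy as np

def compose_c_T_ned(c_T_ws: np.ndarray, ws_T_ned: np.ndarray) -> np.ndarray:
    """C_T_NED = C_T_Ws x Ws_T_NED: a NED position is mapped to the world (SLAM)
    frame and then to the camera frame in a single matrix product."""
    return c_T_ws @ ws_T_ned

def invert_transform(c_T_ned: np.ndarray) -> np.ndarray:
    """NED_T_C = (C_T_NED)^-1, for the reverse mapping when it is needed."""
    return np.linalg.inv(c_T_ned)
```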
As described above with reference to
That is, if the coordinate conversion matrix CTNED for converting the position (NEDPX) in the NED coordinate system into the position (CPX) in the camera coordinate system can be calculated, the flight path position indicated in the NED coordinate system can be converted into the position in the camera coordinate system. This makes it possible to accurately output the flight path of the drone to the display image shown in the camera coordinate system.
The user terminal 10 serving as the image processing device of the present disclosure calculates the coordinate conversion matrix: CTNED.
A specific example of processing of calculating the coordinate conversion matrix: CTNED executed by the user terminal 10 serving as the image processing device of the present disclosure will be described with reference to
As illustrated in
In the example of
The data processing unit of the user terminal 10 records captured image positions of the drone at at least three different positions on a memory.
As illustrated in
Note that the drone imaging positions correspond to display image positions displayed on the display unit, and, here, a processing example using coordinate positions (u1, v1) to (u3, v3) on the display image is shown.
By using those three different drone positions and the three corresponding drone imaging positions, it is possible to calculate the coordinate conversion matrix CTNED.
Prior to specific description of the processing of calculating the coordinate conversion matrix (CTNED), a pinhole camera model will be described with reference to
In the pinhole camera model, the relational expression between the three-dimensional position M of the object serving as an imaging subject and the imaging position (imaging pixel position) m of the object by the camera is shown by (Expression 3) below.
[Math. 2]
$\lambda \tilde{m} = A \cdot R_{w} \cdot (M - C_{w})$   (Expression 3)
The meaning of (Expression 3) above will be described with reference to
As illustrated in
(Expression 3) above shows a correspondence between a pixel position in a camera-capturing image plane for the point (m) of the object image 62 included in the image captured by the camera, that is, a position expressed in the camera coordinate system and the three-dimensional position (M) of the object 61 in the world coordinate system.
The position (pixel position) of the point (m) of the object image 62 included in the camera-capturing image is expressed in the camera coordinate system. The camera coordinate system is a coordinate system in which a focal point of the camera serves as an origin C, an image plane is a two-dimensional plane of Xc and Yc, and an optical axis direction (depth) is Zc. The origin C moves as the camera moves.
Meanwhile, the three-dimensional position (M) of the object 61 serving as the imaging subject is indicated in the world coordinate system having three axes Xw, Yw, and Zw and having an origin O that does not move even if the camera moves. An expression showing a correspondence between the positions of the object in the different coordinate systems is defined as the pinhole camera model in (Expression 3) above.
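A short Python sketch of (Expression 3) may help: it projects a world-coordinate point to a pixel and returns the normalization parameter λ. The intrinsic matrix values shown are placeholders for illustration, not values from this description.

```python
import numpy as np

def project_pinhole(A, R_w, C_w, M_w):
    """Project a 3D point M (world coordinates) to a pixel according to
    lambda * m~ = A * R_w * (M - C_w)  (Expression 3).
    Returns the pixel (u, v) and the scale factor lambda."""
    v = A @ (R_w @ (M_w - C_w))   # un-normalized homogeneous image point
    lam = v[2]                    # lambda makes the third element equal 1
    m_tilde = v / lam             # (u, v, 1)
    return m_tilde[:2], lam

# Example intrinsic matrix (focal lengths and principal point are placeholders):
A = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
```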
As illustrated in
Further,
The parameter λ is a normalization parameter, that is, a scale factor set so that the third element of the homogeneous coordinate vector $\tilde{m}$ becomes 1.
Note that the camera intrinsic parameter A is the following matrix as shown in
The camera intrinsic parameter A includes values such as the focal length and the position of the optical center (principal point) of the camera.
The following parameters:
The SLAM processing is processing of capturing images (a moving image) with a camera and analyzing trajectories of feature points included in the plurality of captured images, thereby estimating three-dimensional positions of the feature points and also estimating (localizing) the position and posture of the camera (self), and a surrounding map (environment map) can be created (mapping) by using the three-dimensional position information of the feature points. Processing that executes localization of the camera (self) position and creation of the surrounding map (environment map) (mapping) in parallel in this manner is referred to as SLAM.
Note that one of the SLAM methods is EKF-based SLAM using an extended Kalman filter (EKF).
The EKF-based SLAM is a method of, for example, continuously capturing images while moving a camera, obtaining trajectories (tracking information) of feature points included in each image, and simultaneously estimating an amount of movement of the camera and three-dimensional positions of the feature points by a moving stereo method.
The EKF-based SLAM processing uses, for example, “state data” including multidimensional normal distribution data as a probability distribution model including the following pieces of information:
The “state data” includes multidimensional normal distribution data including an average vector and a variance-covariance matrix indicating the position, posture, velocity, and angular velocity of the camera and the position information of each feature point. The variance-covariance matrix contains, as variances, the variances of these state values, that is, the position, posture, velocity, and angular velocity of the camera and the position information of each feature point, and, as covariances, correlation information regarding combinations of the different state values.
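For illustration, the “state data” could be held in a structure such as the following minimal sketch; the exact parameterization (for example, how the posture is represented) is not specified above, so the field layout is an assumption.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class EkfSlamState:
    """Minimal container for the EKF-SLAM 'state data' described above (a sketch)."""
    camera_position: np.ndarray          # 3 elements
    camera_posture: np.ndarray           # 3 elements (e.g. roll/pitch/yaw)
    camera_velocity: np.ndarray          # 3 elements
    camera_angular_velocity: np.ndarray  # 3 elements
    landmark_positions: np.ndarray       # (N, 3) feature-point positions
    covariance: np.ndarray = None        # (12 + 3N) x (12 + 3N) variance-covariance matrix

    def mean_vector(self) -> np.ndarray:
        """Average vector of the multidimensional normal distribution."""
        return np.concatenate([self.camera_position, self.camera_posture,
                               self.camera_velocity, self.camera_angular_velocity,
                               self.landmark_positions.ravel()])
```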
Among the following parameters included in (Expression 3) above, that is,
By using those parameters, it is possible to generate the relational expression between the three-dimensional position M of the object serving as the imaging subject and the imaging position (imaging pixel position) m of the object by the camera, that is, (Expression 3) above. Therefore, it is possible to analyze the correspondence between the three-dimensional position M of the object serving as the imaging subject indicated in the world coordinate system and the object imaging position indicated in the camera coordinate system.
(Expression 3) above shows a positional relationship between
Specifically, for example, the relational expression can also be developed as an expression showing a positional relationship between
The relational expression in this case, that is, the relational expression showing the positional relationship between
(Expression 4) above corresponds to an expression in which the following parameters for the world coordinate system in (Expression 3) above, that is,
That is, the expression is obtained by changing the above parameters to the following parameters for the NED coordinate system:
The relational expression in (Expression 4) is an expression defining a correspondence between the object position in the NED coordinate system and the object imaging position in the camera coordinate system that is the object imaging position in the imaging element when the object is imaged by the camera.
By using this relational expression, it is possible to calculate the coordinate conversion matrix CTNED for converting the position (NEDPX) in the NED coordinate system into the position (CPX) in the camera coordinate system.
As described above with reference to
A specific example of the processing of calculating the coordinate conversion matrix CTNED for converting the position (NEDPX) in the NED coordinate system into the position (CPX) in the camera coordinate system will be described.
In the example of
The data processing unit of the user terminal 10 records captured image positions of the drone at at least three different positions on the memory.
As illustrated in
The drone positions in the NED coordinate system at the times (t1), (t2), and (t3) are indicated as follows:
Further, the imaging positions in the camera coordinate system at the times (t1), (t2), and (t3) are indicated as follows:
Note that the tilde (~) above m is omitted in the above description. Those drone imaging positions are position information in the camera coordinate system indicated in a three-dimensional homogeneous coordinate system.
When (Expression 4) above, that is, (Expression 4) defining the correspondence between the object position in the NED coordinate system and the object imaging position in the camera coordinate system that is the object imaging position in the imaging element when the object is imaged by the camera is shown by using the following parameters:
(Expression 4) can be expressed by (Expression 5) below.
[Math. 7]
$\lambda \tilde{m}_{Drone} = A \cdot {}^{C}R_{NED} \cdot ({}^{NED}P_{Drone} - {}^{NED}P_{C})$   (Expression 5)
Further, (Expression 6) below is derived on the basis of (Expression 5) above.
[Math. 8]
${}^{NED}P_{Drone} - {}^{NED}P_{C} = \lambda \cdot {}^{C}R_{NED}^{T} \cdot A^{-1} \cdot \tilde{m}_{Drone}$   (Expression 6)
Note that
When the three different drone positions in the NED coordinate system at the times (t1) to (t3) in
[Math. 9]
${}^{NED}P_{Drone\,t3} - {}^{NED}P_{C} = \lambda_{t3} \cdot {}^{C}R_{NED}^{T} \cdot A^{-1} \cdot \tilde{m}_{Drone\,t3}$
${}^{NED}P_{Drone\,t2} - {}^{NED}P_{C} = \lambda_{t2} \cdot {}^{C}R_{NED}^{T} \cdot A^{-1} \cdot \tilde{m}_{Drone\,t2}$
${}^{NED}P_{Drone\,t1} - {}^{NED}P_{C} = \lambda_{t1} \cdot {}^{C}R_{NED}^{T} \cdot A^{-1} \cdot \tilde{m}_{Drone\,t1}$   (Expression 7)
In the simultaneous equations in (Expression 7) above, each parameter below is known.
The drone position in the NED coordinate system: NEDPDrone can be acquired from the drone or the drone management server.
The inverse matrix A−1 of the camera intrinsic parameter A is known.
The drone imaging positions at the times (t1) to (t3)=mDronet1 to mDronet3 are coordinate position information of a camera imaging system and can be acquired by analyzing images captured by the camera.
Therefore, unknown parameters in the simultaneous equations in (Expression 7) above are the following parameters:
Here, the unknown parameters in the simultaneous equations in (Expression 7) above are the following nine parameters (three position elements, three posture elements, and three normalization coefficients):
It is possible to calculate values of those parameters by solving the simultaneous equations including the three expressions (amount of information is nine).
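The description states that the nine unknowns are obtained by solving the three vector equations, but not how. One possible sketch, assuming a generic nonlinear least-squares solver and a rotation-vector parameterization of CRNED (both assumptions for illustration), is the following:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def solve_camera_pose_in_ned(p_ned, m_pix, A):
    """Solve (Expression 7) for the camera position in NED, the rotation C_R_NED
    and the three normalization coefficients.
    p_ned : (3, 3) drone positions in NED at t1..t3 (one row per observation)
    m_pix : (3, 3) homogeneous drone imaging positions (u, v, 1) at t1..t3
    A     : (3, 3) camera intrinsic matrix"""
    A_inv = np.linalg.inv(A)

    def residuals(x):
        cam_pos = x[0:3]                    # NED_P_C
        rot = Rotation.from_rotvec(x[3:6])  # C_R_NED as a rotation vector
        lams = x[6:9]                       # lambda_t1..t3
        res = []
        for i in range(3):
            rhs = lams[i] * (rot.as_matrix().T @ (A_inv @ m_pix[i]))
            res.append(p_ned[i] - cam_pos - rhs)
        return np.concatenate(res)          # 9 residuals for 9 unknowns

    x0 = np.zeros(9)
    x0[6:9] = 10.0                          # rough initial guess for the depths
    sol = least_squares(residuals, x0)
    cam_pos = sol.x[0:3]
    c_R_ned = Rotation.from_rotvec(sol.x[3:6]).as_matrix()
    return cam_pos, c_R_ned, sol.x[6:9]
```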
By using the values of the calculated parameters, as illustrated in
The coordinate conversion matrix (CTNED) in
That is, the coordinate conversion matrix (CTNED) for converting the position (NEDPX) in the NED coordinate system into the position (CPX) in the camera coordinate system can be shown by (Expression 8) below.
Matrix elements of the coordinate conversion matrix (CTNED) in (Expression 8) above are formed by the parameters obtained by solving the simultaneous equations in (Expression 7) above.
Therefore, by solving the simultaneous equations in (Expression 7) above, it is possible to calculate the coordinate conversion matrix (CTNED) for converting the position (NEDPX) in the NED coordinate system into the position (CPX) in the camera coordinate system.
As described above, the user terminal 10 serving as the image processing device of the present disclosure first acquires the three different drone positions in the NED coordinate system at the times (t1) to (t3) in
Next, the following unknown parameters are acquired by solving the simultaneous equations in (Expression 7) above:
Next, the calculated parameters are used to generate the coordinate conversion matrix (CTNED), that is, the coordinate conversion matrix (CTNED) for converting the position (NEDPX) in the NED coordinate system into the position (CPX) in the camera coordinate system.
By using the coordinate conversion matrix (CTNED), it is possible to convert a position on the flight path indicated in the NED coordinate system into a position indicated in the camera coordinate system.
As described above with reference to
The user terminal 10 applies the coordinate conversion matrix CTNED in (Expression 8) above to the acquired flight path information indicated in the NED coordinate system, thereby obtaining the flight path position indicated in the camera coordinate system, and outputs the obtained flight path position (flight path or scheduled flight path) onto the actually captured image, that is, an image including the real image of the drone.
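A minimal sketch of this output step, assuming the flight path is given as a list of NED waypoints and that the actual drawing over the live camera image is handled elsewhere:

```python
import numpy as np

def path_ned_to_pixels(path_ned, c_T_ned, A):
    """Map flight-path waypoints given in NED into the camera frame with C_T_NED
    and project them to pixel coordinates for drawing the AR path line.
    (Visibility checks beyond a simple depth test and the drawing call are omitted.)"""
    pixels = []
    for p in path_ned:
        p_cam = (c_T_ned @ np.append(p, 1.0))[:3]  # NED -> camera coordinates
        if p_cam[2] <= 0:                          # behind the camera: skip
            continue
        uv = (A @ p_cam) / p_cam[2]                # perspective division
        pixels.append(uv[:2])
    return np.array(pixels)

# The returned pixel list can then be drawn as a polyline over the camera image.
```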
As a result, as illustrated in
An image display example of
Note that the above-described processing is based on the assumption that the position and posture of the camera are unchanged during a period of imaging the drone at the three different positions, that is, during a drone imaging time from the times (t1) to (t3) in
In a case where the position and posture of the camera are changed during the period of imaging the drone at the three different positions, the processing needs to be performed in consideration of the change in the position and posture of the camera.
Hereinafter, this processing example will be described with reference to
In a case where the position and posture of the camera change during the period of imaging the drone, as illustrated in
In the example of
Here, a coordinate conversion matrix for converting the world coordinate system (SLAM coordinate system) into the camera coordinate system at the time (t1) is defined as (Ct1TWs).
Further, a coordinate conversion matrix for converting the world coordinate system (SLAM coordinate system) into the camera coordinate system at the time (t2) is defined as (Ct2TWs).
Note that a coordinate conversion matrix (CtnTWs) for converting the world coordinate system (SLAM coordinate system) into the camera coordinate system at a time (tn) is a matrix corresponding in time (n) to the coordinate conversion matrix (CTWs) for converting the position (WsPx) in the world coordinate system (SLAM coordinate system) for the one point (x) in the three-dimensional space into the position (CPX) in the camera coordinate system described above with reference to
Matrix elements included in the coordinate conversion matrix (CtnTWs) for converting the world coordinate system into the camera coordinate system at the time (tn) can be acquired in the SLAM processing executed by the user terminal 10, that is, the simultaneous localization and mapping (SLAM) processing in which localization of a camera position and creation of an environment map (mapping) are executed in parallel.
Therefore, the coordinate conversion matrix (CtnTWs) at the time (tn) such as the coordinate conversion matrix (Ct1TWs) for converting the world coordinate system (SLAM coordinate system) into the camera coordinate system at the time (t1), and the coordinate conversion matrix (Ct2TWs) for converting the world coordinate system (SLAM coordinate system) into the camera coordinate system at the time (t2) shown in
Further, a coordinate conversion matrix for converting the camera coordinate system at the time (t1) into the camera coordinate system at the time (t2) is (Ct2TCt1) and can be calculated from the following expression.
${}^{Ct2}T_{Ct1} = {}^{Ct2}T_{Ws} \cdot {}^{Ct1}T_{Ws}{}^{-1}$
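In code form, this is a single composition of the two SLAM camera poses (a sketch with illustrative names):

```python
import numpy as np

def relative_camera_transform(ct1_T_ws: np.ndarray, ct2_T_ws: np.ndarray) -> np.ndarray:
    """Ct2_T_Ct1 = Ct2_T_Ws x (Ct1_T_Ws)^-1: maps coordinates expressed in the camera
    frame at time t1 into the camera frame at time t2, using the two SLAM poses."""
    return ct2_T_ws @ np.linalg.inv(ct1_T_ws)
```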
The user terminal 10 serving as the image processing device of the present disclosure performs coordinate conversion in which the above coordinate conversion matrix: Ct2TCt1 is applied to the drone imaging position on the imaging surface of the camera at the time (t1) at which the drone at the time (t1) is imaged. By this coordinate conversion, the drone imaging position in the camera coordinate system at the time (t1) is converted into a drone imaging position in the camera coordinate system at the time (t2).
Further, the position of the drone to be converted into the imaging surface of the camera at the time (t1),
As a result, the drone imaging positions on the two different camera coordinate systems can be converted into drone imaging positions according to one common camera coordinate system.
By performing the above processing, it is possible to set the drone imaging positions corresponding to the three different drone positions on one common camera coordinate system.
As illustrated in
The flight path and scheduled flight path of the drone 20 are output at the latest time (t3). In this case, the data processing unit of the user terminal 10 executes the following processing:
By the above coordinate conversion processing, the drone imaging positions in the camera coordinate systems at the times (t1) and (t2) are converted into drone imaging positions in the camera coordinate system at the time (t3).
By establishing those equations, it is possible to establish simultaneous equations for setting drone imaging positions corresponding to three different drone positions on one common camera coordinate system (the camera coordinate system at the time (t3)).
That is, it is possible to establish simultaneous equations for setting the following three drone imaging positions on one common camera coordinate system (the camera coordinate system at the time (t3)):
Note that the tilde (~) above m is omitted in the above description. Those drone imaging positions are position information in the camera coordinate system indicated in a three-dimensional homogeneous coordinate system.
Thereafter, processing similar to the processing described above with reference to
First, as shown in
Next, the parameters obtained by solving the simultaneous equations are used to calculate the coordinate conversion matrix (CTNED) in
By using the coordinate conversion matrix (CTNED), the position on the flight path indicated in the NED coordinate system is converted into the position indicated in the camera coordinate system, and the acquired flight path position (flight path or scheduled flight path) indicated in the camera coordinate system is output to an actually captured image, that is, an image including the real image of the drone.
Note that, in the example of
Note that, thereafter, in a case where the flight path position (flight path or scheduled flight path) is continuously output at times (t4), (t5), . . . , it is only necessary to continuously execute the above-described processing.
In a case where the flight path position (flight path or scheduled flight path) is output to a camera-capturing image at the time (t4), simultaneous equations including correspondence equations between the drone imaging positions=mDronet2 to mDronet4 at the times (t2), (t3), and (t4) and the drone positions in the NED coordinate system, that is, the simultaneous equations in (Expression 7) above are generated.
In a case where the flight path position (flight path or scheduled flight path) is output to a camera-capturing image at the time (t5), simultaneous equations including correspondence equations between the drone imaging positions=mDronet3 to mDronet5 at the times (t3), (t4), and (t5) and the drone positions in the NED coordinate system, that is, the simultaneous equations in (Expression 7) above are generated.
Hereinafter, if similar processing is continuously executed, the flight path position (flight path or scheduled flight path) on the camera-capturing image also moves as the camera is moved. Therefore, an accurate flight path is continuously output.
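This continuous update can be sketched as a sliding window that always keeps the three most recent drone observations; the record layout below is illustrative and mirrors the memory entries described in the next section.

```python
from collections import deque

# Keep only the three most recent drone observations; every new frame reuses the
# latest three to rebuild (Expression 7), so the displayed path follows camera motion.
observations = deque(maxlen=3)

def on_new_frame(t, m_drone_pix, p_drone_ned, ct_T_ws):
    observations.append({
        "time": t,
        "imaging_position": m_drone_pix,  # camera-coordinate (homogeneous) imaging position
        "ned_position": p_drone_ned,      # drone position received in NED
        "ct_T_ws": ct_T_ws,               # SLAM camera pose at this frame
    })
    if len(observations) == 3:
        # re-solve the simultaneous equations and redraw the path here
        pass
```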
[3. Sequence of Processing Executed by Image Processing Device of Present Disclosure]
Next, a sequence of processing executed by the image processing device of the present disclosure will be described.
Flowcharts in
Processing according to the flow shown in
Hereinafter, processes in the respective steps in the flow shown in
Note that processes in steps S111 to S114 and processes in steps S121 to S123 in
First, the processes in steps S111 to S114 will be described.
(Step S111)
The process in step S111 is a process of capturing an image of a drone in the sky by using the user terminal 10.
The image of the flying drone 20 is captured at the time t(n) by using, for example, the camera of the user terminal 10 such as a smartphone.
That is, the image of the flying drone 20 is captured as described above with reference to
As illustrated in
(Step S112)
Next, in step S112, the user terminal 10 acquires drone imaging position information (imaging position information (mDronet(n)) in the camera coordinate system) in the captured image at the time t(n). Note that the tilde (~) above m is omitted in the description.
The drone imaging position is an imaging position indicated in the camera coordinate system (homogeneous coordinate system) at the time t(n).
(Step S113)
Next, in step S113, the user terminal 10 acquires position information of the drone (position information (NEDPDronet(n)) in the NED coordinate system) at the time t(n).
As described above with reference to
(Step S114)
Next, in step S114, the user terminal 10 records the imaging position information (imaging position information (mDronet(n)) in the camera coordinate system) and the position information (position information (NEDPDronet(n)) in the NED coordinate system) of the drone at the time t(n) on the memory in association with the time t(n).
Next, the processes in steps S121 to S123 executed in parallel with the processes in steps S111 to S114 will be described.
(Step S121)
The user terminal 10 executes the following process in step S121.
The user terminal executes the SLAM processing at the time t(n), that is, at the timing of capturing the image of the drone in step S111.
As described above, the SLAM processing is processing in which localization of a camera position and creation of an environment map (mapping) are executed in parallel.
(Step S122)
Next, in step S122, the user terminal 10 calculates a coordinate conversion matrix (Ct(n)TWs) for converting the world coordinate system (SLAM coordinate system) into the camera coordinate system at the imaging time t(n) on the basis of the SLAM processing result in step S121.
The process in step S122 corresponds to the processing described above with reference to
(Step S123)
Next, in step S123, the user terminal 10 records, on the memory, the coordinate conversion matrix (Ct(n)TWs) calculated in step S122, that is, the coordinate conversion matrix (Ct(n)TWs) for converting the world coordinate system (SLAM coordinate system) into the camera coordinate system at the imaging time t(n).
When the processes in steps S111 to S114 and the processes in steps S121 to S123 are completed, a process in step S124 is executed.
(Step S124)
In step S124, the user terminal 10 determines whether or not there are three or more entries recorded on the memory.
That is, it is determined whether or not data based on captured images at three different drone positions is recorded on the memory.
An example of specific recorded data recorded on the memory will be described with reference to
As shown in
In each entry corresponding to the imaging time in the memory, the above data corresponding to the drone imaging time is recorded.
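As an illustration of such an entry, a minimal structure could look like the following sketch; the field names are assumptions, since the actual items are defined by the figure referenced above.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class MemoryEntry:
    """One per-imaging-time entry of the recorded data (illustrative field names)."""
    imaging_time: float                 # t(n)
    m_drone: np.ndarray                 # drone imaging position in the camera coordinate system
    p_drone_ned: np.ndarray             # drone position in the NED coordinate system
    ct_T_ws: np.ndarray                 # (4, 4) world(SLAM)->camera conversion matrix at t(n)
    ctout_T_ctn: np.ndarray = None      # (4, 4) conversion to the output-time camera frame (step S131)
    m_drone_at_tout: np.ndarray = None  # imaging position re-expressed at t(out) (step S132)
```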
In step S124 in the flowchart of
That is, it is determined whether or not data based on captured images at three different drone positions is recorded on the memory.
In a case where the data based on the captured images at the three different drone positions is recorded on the memory as shown in
Meanwhile, in a case where the data based on the captured images at the three different drone positions is not recorded on the memory, the determination in step S124 is No, and the processing proceeds to step S125.
(Step S125)
In step S125, a time setting parameter n is set to the next time n+1, and the processes in steps S111 to S114 and the processes in steps S121 to S123 are executed at the next time (n+1).
That is, the drone located at a position different from the position at the time (n) is imaged at the time (n+1), and the processing is executed.
(Step S131)
In step S124, in a case where it is determined that the data based on the captured images at the three different drone positions is recorded on the memory as shown in
As shown in
A coordinate conversion matrix (e.g., Ct(out)TCt(n)) for converting the camera coordinate system at the drone imaging time into the camera coordinate system at a time t(out) at which the drone flight path is output is calculated and is recorded on the memory.
This process corresponds to the processing described above with reference to
An example of the coordinate conversion matrix (e.g., Ct(out)TCt(n)) recorded on the memory will be described with reference to
Data recorded on the memory in step S131 is data of (5) shown in
Note that, in the example of
That is,
t(n+2)=t(out)
In this case, as shown in
Regarding the data at the imaging time=t(n+2), no coordinate conversion matrix needs to be additionally recorded because the camera coordinate system at the time of imaging the drone matches with the camera coordinate system at the time of outputting the flight path.
For the entry at the imaging time=t(n),
Further, for the entry at the imaging time=t(n+1),
As shown in the flow of
The user terminal performs coordinate conversion processing to which a coordinate conversion matrix (CtcTCtn) is applied on the drone imaging position in the camera coordinate system at the drone imaging time, calculates a drone imaging position corresponding to the camera coordinate system at the drone flight path output time t(out), and records the drone imaging position on the memory.
This process corresponds to the processing described above with reference to
An example of the drone imaging position corresponding to the camera coordinate system at the drone flight path output time t(out) recorded on the memory will be described with reference to
Data recorded on the memory in step S132 is data of (6) shown in
Note that, in the example of
That is,
t(n+2)=t(out)
In this case, in step S132, the user terminal 10 calculates the following data and records the data on the memory.
Processing to be performed on the data at the drone imaging time=t(n) is as follows.
Coordinate conversion processing to which the coordinate conversion matrix (Ct(out)TCt(n)) is applied is performed on the drone imaging position (mDronet(n)) in the camera coordinate system (Ct(n)) at the drone imaging time=t(n). That is, the following coordinate conversion processing is performed.
$\lambda \cdot \tilde{m}_{Drone\,t(n)} = A \cdot {}^{Ct(out)}T_{NED} \cdot {}^{NED}P_{Drone\,t(n)}$
Coordinates acquired from the above equation indicate the drone imaging position corresponding to the camera coordinate system at the drone flight path output time t(out). This coordinate position is recorded on the memory.
Further, processing to be performed on the data at the drone imaging time=t(n+1) is as follows.
Coordinate conversion processing to which the coordinate conversion matrix (Ct(out)TCt(n+1)) is applied is performed on the drone imaging position (mDronet(n+1)) in the camera coordinate system (Ct(n+1)) at the drone imaging time=t(n+1). That is, the following coordinate conversion processing is performed.
$\lambda \cdot \tilde{m}_{Drone\,t(n+1)} = A \cdot {}^{Ct(out)}T_{NED} \cdot {}^{NED}P_{Drone\,t(n+1)}$
Coordinates acquired from the above equation indicate the drone imaging position corresponding to the camera coordinate system at the drone flight path output time t(out). This coordinate position is recorded on the memory.
Further, processing to be performed on the data at the drone imaging time=t(n+2) is as follows.
The camera coordinate system (Ct(n+2)) at the drone imaging time=t(n+2) matches with the camera coordinate system (Ct(out)) at the drone flight path output time t(out).
Therefore, the coordinate conversion is unnecessary, and the drone imaging position (mDronet(n+2)) in the camera coordinate system (Ct(n+2)) at the drone imaging time=t(n+2) is recorded as it is on the memory.
The above recorded data is data recorded in the item (6) of
Next, processes in step S133 and subsequent steps in a flow of
(Step S133)
The user terminal 10 executes the following process in step S133.
The user terminal generates the simultaneous equations (Expression 7) including the correspondence equations between the drone positions at the three different positions in the NED coordinate system recorded on the memory and the drone imaging positions corresponding to the respective drone positions (imaging positions on the camera coordinate system at the time t(out)).
The generated simultaneous equations are the simultaneous equations described above with reference to
Note that, as the drone imaging positions (mDronetn) included in the three equations forming the simultaneous equations in (Expression 7), the positions calculated in step S132, that is, the drone imaging positions corresponding to the camera coordinate system at the drone flight path output time t(out) are used.
That is, the converted coordinate positions recorded in the item (6) of the memory recorded data described with reference to
(Step S134)
Next, in step S134, the user terminal 10 calculates a coordinate conversion matrix (Ct(out)TNED) in which parameters acquired by solving the simultaneous equations (Expression 7) generated in step S133 serve as matrix elements, that is, the coordinate conversion matrix (Ct(out)TNED) (Expression 8) for converting the position (NEDPX) in the NED coordinate system into a position (Ct(out)PX) in the camera coordinate system.
This coordinate conversion matrix (Ct(out)TNED) corresponds to the coordinate conversion matrix (CTNED) described above with reference to
(Step S135)
Next, in step S135, the user terminal 10 applies the coordinate conversion matrix (Ct(out)TNED) generated in step S134 and converts a drone flight path position in the NED coordinate system into a position in the camera coordinate system.
Note that the drone flight path (flown path or scheduled flight path) in the NED coordinate system is acquired from the drone 20 or the drone management server 30.
(Step S136)
Next, in step S136, the user terminal 10 outputs the flight path converted into position information in the camera coordinate system obtained by the coordinate conversion in step S135 to the display unit of the user terminal.
By performing the above processing, it is possible to accurately display the path information of the drone on the captured image of the drone.
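Putting steps S133 to S136 together, one possible sketch (reusing the hypothetical helper functions sketched in the earlier sections, none of which are part of this description) is:

```python
import numpy as np

def update_path_overlay(entries, A, flight_path_ned, draw_polyline):
    """entries: the three memory entries prepared in steps S131-S132, with the
    imaging positions already re-expressed in the camera frame at the output time t(out)."""
    p_ned = np.stack([e.p_drone_ned for e in entries])
    m_pix = np.stack([e.m_drone_at_tout for e in entries])

    # S133-S134: solve (Expression 7) and assemble C_T_NED at t(out)
    cam_pos_ned, c_R_ned, _ = solve_camera_pose_in_ned(p_ned, m_pix, A)
    c_T_ned = np.eye(4)
    c_T_ned[:3, :3] = c_R_ned
    c_T_ned[:3, 3] = -c_R_ned @ cam_pos_ned

    # S135-S136: convert the NED flight path and draw it over the camera image
    draw_polyline(path_ned_to_pixels(flight_path_ned, c_T_ned, A))
```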
That is, as illustrated in
[4. Configuration Example of Image Processing Device of Present Disclosure and Drone]
Next, a configuration example of the image processing device of the present disclosure and a drone will be described.
The user terminal 100 serving as the image processing device of the present disclosure is, for example, a camera-equipped communication terminal such as a smartphone. The user terminal is not limited to the smartphone and may be a device such as a PC or a camera device.
The user terminal 100 has a configuration capable of communicating with the drone 200 and a drone management server 300.
The drone 200 flies according to a predefined flight path by using, for example, communication information with the drone management server 300 or communication information with a GPS satellite 400.
As illustrated in
The camera 101 is used for, for example, processing of imaging the drone or capturing an image at the time of the SLAM processing.
The data processing unit 102 performs output control of a flight path of the drone described above. That is, the data processing unit performs, for example, processing of generating an AR image in which the flight path is superimposed on a real image of the drone or the like and displaying the AR image on the display unit 105.
Further, the data processing unit controls processing executed in the user terminal 100, such as the SLAM processing and image capturing control.
The data processing unit 102 includes, for example, a processor such as a CPU having a program execution function and executes processing in accordance with a program stored in the storage unit 103.
The storage unit (memory) 103 is used as a storage area and a work area of the program executed by the data processing unit 102. The storage unit (memory) is also used as a storage area for various parameters applied to the processing. The storage unit (memory) 103 includes a RAM, a ROM, and the like.
The communication unit 104 communicates with the drone 200 and the drone management server 300. For example, the communication unit performs processing of receiving flight path information of the drone 200 from the drone 200 or the drone management server 300.
The display unit 105 displays a camera-capturing image and further outputs the flight path information of the drone generated by the data processing unit 102. That is, the display unit displays the AR image in which the flight path is superimposed on the real image of the drone or the like.
The input unit 106 is an operation unit operated by the user and is used for various kinds of input processing requested by the user, such as image capturing and the start and end of path display.
The output unit 107 includes a sound output unit, an image output unit, and the like.
Next, a configuration of the drone 200 will be described.
The drone 200 includes a path planning unit 201, a path control unit 202, a positioning sensor (GPS information reception analysis unit) 203, and a communication unit 204.
The path planning unit 201 plans and determines a flight path of the drone 200. For example, the path planning unit plans and determines a specific flight path on the basis of information received from the drone management server 300.
The path control unit 202 executes flight control for causing the drone 200 to fly according to the flight path determined by the path planning unit 201.
The positioning sensor (GPS information reception analysis unit) 203 communicates with the GPS satellite 400, analyzes a current position (latitude, longitude, and altitude) of the drone 200 on the basis of the communication information with the GPS satellite 400, and outputs the analysis information to the path control unit 202.
The path control unit 202 refers to the input information from the positioning sensor (GPS information reception analysis unit) 203 and executes flight control for causing the drone 200 to fly according to the flight path determined by the path planning unit 201.
The communication unit 204 communicates with the drone management server 300 and the user terminal 100.
Note that the processing example of displaying the flight path of the drone has been described in the above-described embodiment. However, the processing of the present disclosure is not limited to display of the flight path of the drone and is also applicable to processing of displaying path information of other moving objects such as a robot and an autonomous vehicle, for example.
Similar processing can be performed by replacing the drone in the above-described embodiment with the robot or the autonomous vehicle.
[5. Summary of Configurations of Present Disclosure]
Hereinabove, the embodiments of the present disclosure have been described in detail by referring to specific embodiments. However, it is obvious that those skilled in the art can make modifications and substitutions of the embodiments, without departing from the scope of the present disclosure. That is, the present invention has been described in the form of illustration, and should not be interpreted in a limited manner. The claims should be taken into consideration in order to determine the gist of the present disclosure.
Note that the technology disclosed in this specification can be configured as follows.
(1) An image processing device including
(2) An image processing method executed in an image processing device, in which:
(3) The image processing method according to (2), in which:
(4) The image processing method according to (2) or (3), in which
(5) The image processing method according to any one of (2) to (4), in which
(6) The image processing method according to (5), in which
(7) The image processing method according to (5) or (6), in which
(8) The image processing method according to any one of (5) to (7), in which
(9) The image processing method according to any one of (6) to (8), in which
(10) The image processing method according to (9), in which
(11) The image processing method according to (10), in which
(12) The image processing method according to any one of (2) to (11), in which
(13) The image processing method according to any one of (2) to (12), in which
(14) A program for causing an image processing device to execute image processing, in which:
Further, the series of processes described in the specification can be executed by hardware, software, or a combined configuration of both. In a case where the processes are executed by software, the processes can be executed by installing a program in which the processing sequence is recorded in a memory inside a computer incorporated into dedicated hardware and executing the program, or by installing a program in a general purpose computer that can execute various processes and executing the program. For example, the program can be recorded on a recording medium in advance. The program can be installed in the computer from the recording medium, or can also be received via a network such as a local area network (LAN) or the Internet and be installed in a recording medium such as a built-in hard disk.
Note that the various processes described in the specification not only are executed in time series in accordance with the description, but also are executed in parallel or individually depending on a processing capacity of a device that executes the processes or as necessary. Further, in this specification, a system is a logical set configuration of a plurality of devices, and is not limited to a system in which devices having respective configurations are in the same housing.
As described above, an embodiment of the present disclosure realizes a configuration capable of accurately displaying a flight path of a drone on an actually captured image of the drone.
Specifically, for example, the configuration includes a data processing unit that displays a moving path of a moving device such as a drone on a display unit that displays a camera-capturing image of the moving device. The data processing unit generates a coordinate conversion matrix for performing coordinate conversion processing of converting position information according to a first coordinate system, for example, the NED coordinate system indicating the moving path of the moving device into a second coordinate system, for example, the camera coordinate system capable of specifying a pixel position of a display image on the display unit and outputs, to the display unit, the moving path having position information according to the camera coordinate system generated by coordinate conversion processing to which the generated coordinate conversion matrix is applied.
This configuration can accurately display a flight path of a drone on an actually captured image of the drone.
Foreign Application Priority Data: JP 2019-214975, filed Nov. 28, 2019 (JP).
PCT Filing: PCT/JP2020/039509, filed Oct. 21, 2020 (WO).
PCT Publication: WO 2021/106436 A, published Jun. 3, 2021 (WO).
U.S. Publication: US 2022/0415193 A1, Dec. 2022 (US).

References Cited

U.S. Patent Documents:
US 9,892,516 B2, Moteki, Feb. 2018.
US 2013/0338915 A1, Mizuochi, Dec. 2013.
US 2017/0201617 A1, So, Jul. 2017.
US 2019/0011912 A1, Lockwood, Jan. 2019.
US 2020/0167603 A1, Ung, May 2020.

Foreign Patent Documents:
CA 2994508, Feb. 2017.
CN 110174903, Aug. 2019.
JP 2005-233712, Sep. 2005.
JP 2008-501955, Jan. 2008.
JP 2012-194175, Oct. 2012.
JP 5192598, May 2013.
JP 2013-218459, Oct. 2013.
JP 2017-116339, Jun. 2017.
JP 2019-032234, Feb. 2019.
JP 2019-106000, Jun. 2019.
JP 2019-133450, Aug. 2019.
WO 2012/118232, Sep. 2012.
WO 2018/043299, Mar. 2018.

Other Publications:
Translation of JP 2019-032234 (specification only), accessed via J-PlatPat (https://www.j-platpat.inpit.go.jp/) on Oct. 24, 2023.
International Search Report and Written Opinion of PCT Application No. PCT/JP2020/039509, issued on Jan. 26, 2021, 9 pages of ISRWO.