The present application claims priority to Chinese Patent Application No. CN 202311391909X, filed on Oct. 25, 2023, the content of which is incorporated herein by reference in its entirety.
The present application relates to the field of positioning technology, and in particular to a method, device, movable platform and related products for determining a pose of a camera system.
With the development of image processing technology, visual positioning technology has been widely used. Taking positioning of a camera system as an example, in the related art, visual positioning processing is mainly performed on images collected by each camera in the camera system to determine the pose information of the camera system. However, in the related art, there is a problem that the determined pose information of the camera system is inaccurate.
Based on this, it is necessary to provide a method, device, movable platform and related products for determining the pose of a camera system in order to address the above-mentioned technical problems, so as to improve accuracy of the determined pose information of the camera system.
In one embodiment, a method for determining a pose of a camera system may include obtaining a current frame fisheye image captured by a fisheye camera in the camera system and a current frame ordinary image captured by a preset type camera in the camera system; tracking feature points of the current frame fisheye image to obtain current frame fisheye feature points of the current frame fisheye image, and tracking feature points of the current frame ordinary image to obtain current frame ordinary feature points of the current frame ordinary image; and determining current pose information of the camera system according to the current frame fisheye feature points and the current frame ordinary feature points.
In another embodiment, a device for determining a pose of a camera system may include a camera system comprising a fisheye camera and a preset type of camera; and circuitry configured to acquire a current frame fisheye image captured by the fisheye camera in the camera system and a current frame ordinary image captured by the preset type of camera in the camera system; track feature points of the current frame fisheye image to obtain current frame fisheye feature points of the current frame fisheye image, and track feature points of the current frame ordinary image to obtain current frame ordinary feature points of the current frame ordinary image; and determine current pose information of the camera system according to the current frame fisheye feature points and the current frame ordinary feature points.
In one embodiment, a movable platform may comprise the camera system; at least one processor; and
at least one memory storing a computer program; wherein the computer program, when executed by the at least one processor, causes the at least one processor to perform the method according to one embodiment of the present application.
In one embodiment, a computer device may comprise at least one memory and at least one processor, wherein the at least one memory stores a computer program which, when executed by the at least one processor, causes the at least one processor to perform the method according to one embodiment of the present application.
In one embodiment, a non-transitory computer-readable storage medium has a computer program stored thereon which, when executed by a processor, causes the processor to perform the method according to one embodiment of the present application.
The camera system pose determination method, device, movable platform and related products are provided by some embodiments of the present application. In one embodiment, the camera system pose determination method includes: obtaining a current frame fisheye image captured by a fisheye camera in the camera system and a current frame ordinary image captured by a preset type of camera, tracking feature points of the current frame fisheye image to obtain current frame fisheye feature points of the current frame fisheye image, and tracking feature points of the current frame ordinary image to obtain current frame ordinary feature points of the current frame ordinary image, and determining current pose information of the camera system based on the current frame fisheye feature points and the current frame ordinary feature points. The above method can compensate for the small field of view of the preset type of camera by using a fisheye camera with a large field of view in the camera system, and the fisheye image collected by the fisheye camera makes up for the weak texture of the ordinary images collected by the preset type of camera, so that more feature points from the surrounding environment can be obtained in the process of determining the pose of the camera system. Determining the pose of the camera system from more feature points can improve the accuracy of the determined pose information of the camera system; at the same time, since the fisheye camera has a unique imaging model, the movement of the same feature point between adjacent frames is small and stable, and the feature point is not easily lost, thereby improving the accuracy of feature point tracking. On this basis, the accuracy of the determined pose of the camera system can be further improved.
It should be understood that the above general description and the detailed description that follows are exemplary and explanatory only and do not limit the present application.
In order to explain the technical features of embodiments of the present disclosure more clearly, the drawings used in the present disclosure are briefly introduced as follows. Obviously, the drawings in the following description are some exemplary embodiments of the present disclosure. A person of ordinary skill in the art may obtain other drawings and features based on these disclosed drawings without inventive effort.
In order to make the purpose, technical solution and advantages of the present application more clearly understood, the present application is further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present application and are not used to limit the present application.
In the field of photography, camera systems (including multiple cameras) are required in some photography scenes. In practical applications, the pose information of the camera system needs to be estimated. In the related art, the images collected by each ordinary camera in the camera system are mainly processed by visual positioning to determine the pose information of the camera system. However, in the related art, due to the small field of view of ordinary cameras, there will be problems such as missing feature points in adjacent frame images and feature points being easily lost. Accordingly, there are problems such as inaccurately determined pose information of the camera system. Based on this, an embodiment of the present application provides a method for determining the pose of a camera system, which can improve accuracy of the determined pose information of the camera system.
One embodiment of the present application provides a method for determining a pose of a camera system which can be applied to scenes where each camera in the camera system moves rapidly, and can be, for example, applied to a movable platform as shown in
S100: obtaining a current frame fisheye image captured by a fisheye camera in a camera system and a current frame ordinary image captured by a preset type camera.
The camera system may include a fisheye camera and a preset type of camera. The fisheye camera and the preset type of camera are deployed at different locations. It should be noted here that the image captured by the fisheye camera can be called a fisheye image, and relatively, the image captured by the preset type of camera can be called an ordinary image.
Optionally, the current frame fisheye image captured by the fisheye camera can be understood as the fisheye image captured by the fisheye camera at the current moment; the current frame ordinary image captured by the preset type of camera can be understood as the ordinary image captured by the preset type of camera at the current moment.
In practical applications, due to factors such as different spacings between the fisheye camera, the preset type of camera and the computer device, and different transmission speeds, a computer device can receive the current frame fisheye image sent by the fisheye camera and the current frame ordinary image sent by the preset type of camera either synchronously or asynchronously.
S200: tracking feature points of the current frame fisheye image to obtain current frame fisheye feature points of the current frame fisheye image, and tracking feature points of the current frame ordinary image to obtain current frame ordinary feature points of the current frame ordinary image.
Optionally, the computer device can use a feature point tracking algorithm to track the feature points of the current frame fisheye image to obtain the current frame fisheye feature points of the current frame fisheye image. At the same time, it can use a feature point tracking algorithm to track the feature points of the current frame ordinary image to obtain the current frame ordinary feature points of the current frame ordinary image.
Optionally, the feature point tracking algorithm may be an optical flow tracking method, a target tracking method, a feature point matching method, etc., which is not limited in the embodiments of the present application.
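As an illustration only (not necessarily the tracker used in the embodiments), a sparse optical flow tracker of the kind mentioned above can be sketched in Python with OpenCV as follows; the variable names and parameter values are assumptions:

import cv2

def track_feature_points(prev_img, curr_img, prev_pts):
    # Track feature points from the previous frame into the current frame using
    # pyramidal Lucas-Kanade optical flow (one possible feature point tracking method).
    # prev_pts is an (N, 1, 2) float32 array of feature point coordinates.
    curr_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_img, curr_img, prev_pts, None, winSize=(21, 21), maxLevel=3)
    ok = status.ravel() == 1
    # Keep only the feature points that were successfully tracked.
    return prev_pts[ok], curr_pts[ok]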
In addition, the computer device can also pre-train a feature point tracking model, and then input the current frame fisheye image into the feature point tracking model, which tracks the feature points of the current frame fisheye image and outputs the current frame fisheye feature points of the current frame fisheye image; at the same time, the current frame ordinary image can also be input into the feature point tracking model, which tracks the feature points of the current frame ordinary image and outputs the current frame ordinary feature points of the current frame ordinary image.
The above-mentioned feature point tracking model can include at least one of a convolutional neural network model, a fully connected neural network model, a recurrent neural network model, a deep belief network model, a deep autoencoder or a generative adversarial network model, which is not limited in the embodiments of the present application.
S300: determining current pose information of the camera system according to the current frame fisheye feature points and the current frame ordinary feature points.
In practical applications, the computer device can perform fusion processing, rotation processing, and translation transformation processing on the current frame fisheye feature points and the current frame ordinary feature points to obtain the current pose information of the camera system. Optionally, the above current pose information may include a rotation matrix and a translation vector of the camera system, where the rotation matrix at the i-th moment can be expressed as $R_{B_i}^{W}$, and the translation vector at the i-th moment can be expressed as $t_{B_i}^{W}$.
It should be noted here that at time zero, the rotation matrix of the camera system can be the identity matrix, and the translation vector of the camera system can be the zero vector.
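For illustration, this initialization can be written in Python (the variable names are assumptions, following the $R_{B_i}^{W}$ / $t_{B_i}^{W}$ notation above):

import numpy as np

# At time zero the pose of the camera system can be initialized to the identity
# rotation and the zero translation, as described above.
R_B0_W = np.eye(3)      # rotation matrix of the camera system at time 0
t_B0_W = np.zeros(3)    # translation vector of the camera system at time 0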
The technical solution in one embodiment of the present application is to obtain a current frame fisheye image captured by a fisheye camera in a camera system and a current frame ordinary image captured by a preset type of camera, track feature points of the current frame fisheye image to obtain current frame fisheye feature points of the current frame fisheye image, and track feature points of the current frame ordinary image to obtain current frame ordinary feature points of the current frame ordinary image, and determine the current pose information of the camera system according to the current frame fisheye feature points and the current frame ordinary feature points; the above method can compensate for the small field of view of the preset type of camera by a fisheye camera with a large field of view in the camera system, and compensate for the defect of weak texture in the ordinary image captured by the preset type of camera by a fisheye image captured by the fisheye camera, so that more feature points from the surrounding environment can be obtained in the process of determining the pose of the camera system. As such, the pose of the camera system can be determined from more feature points, which can further improve the accuracy of the determined pose of the camera system; at the same time, because the fisheye camera has a unique imaging model, the movement of the same feature point between adjacent frames is small and stable, and the feature point is not easily lost, thereby improving the accuracy of feature point tracking; on this basis, the accuracy of the determined pose of the camera system can be further improved.
In one embodiment, the process of tracking the feature points of the current frame fisheye image to obtain the current frame fisheye feature points of the current frame fisheye image is described below. In one embodiment, as shown in
S210: obtaining historical frame fisheye feature points of historical frame fisheye images captured by the fisheye camera.
The above-mentioned historical frame fisheye images can include any multiple frames of fisheye images before the current frame. For example, the historical frame fisheye images can include any two frames of fisheye images, any three frames of fisheye images, any four frames of fisheye images, etc. before the current frame.
In practical applications, the historical frame fisheye images collected by the fisheye camera can be stored in a local location, a cloud location, a disk or a hard disk, etc. Correspondingly, the computer device can obtain the historical frame fisheye images collected by the fisheye camera from a local location, a cloud location, a disk or a hard disk, etc., and then use a feature point extraction algorithm to extract feature points of the historical frame fisheye images to obtain feature points of the historical frame fisheye images, that is, the historical frame fisheye feature points of the historical frame fisheye images.
Optionally, the feature point extraction algorithm may be a grayscale-based corner detection algorithm, a fast corner detection algorithm, or a deep learning-based corner detection algorithm.
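As one hedged example of such a feature point extraction step (the detector choice and parameters are assumptions, not mandated by the embodiments), a corner extractor can be sketched in Python with OpenCV:

import cv2

def extract_feature_points(gray_image, max_points):
    # Extract up to max_points corner feature points from a grayscale image.
    # Shi-Tomasi corners are used here purely as an example of a corner detector.
    corners = cv2.goodFeaturesToTrack(
        gray_image, maxCorners=max_points, qualityLevel=0.01, minDistance=10)
    return corners  # (N, 1, 2) array of pixel coordinates, or None if nothing is found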
S220: tracking inter-frame feature points of the current frame fisheye image according to the historical frame fisheye feature points to obtain the current frame fisheye feature points of the current frame fisheye image.
In practical applications, the computer device may use a feature point tracking algorithm to track the inter-frame feature points of the current frame fisheye image based on the historical frame fisheye feature points to obtain the current frame fisheye feature points of the current frame fisheye image.
In one embodiment, the camera system includes at least one fisheye camera; if the historical frame fisheye image includes a previous frame fisheye image of the current frame fisheye image, and the previous frame fisheye image is the initial frame fisheye image, then as shown in
S211: obtaining the previous frame fisheye image captured by each fisheye camera.
In one embodiment of the present application, the historical frame fisheye image includes a previous frame fisheye image of the current frame fisheye image. Wherein, when the previous frame fisheye image is an initial frame fisheye image, the initial frame fisheye image can be understood as a fisheye image captured by the fisheye camera at the initial moment during the shooting process of the camera system.
In an embodiment of the present application, the camera system may include at least one fisheye camera. Specifically, the computer device may obtain the previous frame fisheye image captured by each fisheye camera from a local location, a cloud, a disk, or a hard disk.
In practical applications, the sizes of fisheye images captured by different fisheye cameras may be equal or unequal, which is not limited in the embodiments of the present application.
S212: extracting a preset number of fisheye feature points from a target previous frame fisheye image among the previous frame fisheye images, wherein the target previous frame fisheye image refers to any one of the previous frame fisheye images.
The computer device may select any previous frame fisheye image from all previous frame fisheye images as the target previous frame fisheye image, and then use a feature point extraction algorithm to extract a preset number of fisheye feature points from the target previous frame fisheye image.
Optionally, the preset number of fisheye feature points may be user-defined or determined by historical experience values. In an embodiment of the present application, the preset number of fisheye feature points may be equal to the maximum number of feature points of the fisheye image.
S213: extracting, in turn, the feature points on each of the other previous frame fisheye images according to the fisheye feature points extracted from the target previous frame fisheye image, so as to obtain the previous frame fisheye feature points of the previous frame fisheye images captured by each fisheye camera.
Optionally, the computer device can pre-train an algorithm model, and then input the feature points extracted from the target previous frame fisheye image and each other previous frame fisheye image into the algorithm model. The algorithm model extracts the feature points on the other previous frame fisheye images among the previous frame fisheye images in turn, and then outputs the previous frame fisheye feature points of each other previous frame fisheye image.
In addition, the computer device can also use a feature point extraction algorithm to extract feature points from each other previous frame fisheye image based on the feature points extracted from the target previous frame fisheye image, and obtain the previous frame fisheye feature points of the other previous frame fisheye images.
It should be noted here that the previous frame fisheye feature points of the previous frame fisheye image captured by each fisheye camera may include feature points extracted from the target previous frame fisheye image and feature points of each other previous frame fisheye image.
In an embodiment of the present application, if the previous frame fisheye image is not the initial frame fisheye image, the method for obtaining the previous frame fisheye feature points of the previous frame fisheye image can include obtaining the fisheye feature points of the fisheye image at the moment preceding the previous frame, and then tracking the inter-frame feature points of the previous frame fisheye image based on those fisheye feature points to obtain the previous frame fisheye feature points of the previous frame fisheye image.
It can be understood that, except for the initial frame fisheye image, the method for obtaining the fisheye feature points of each other previous frame fisheye image is to track the inter-frame feature points of the other previous frame fisheye image according to the fisheye feature points of the previous frame fisheye image corresponding to the previous moment so as to obtain the fisheye feature points of the other previous frame fisheye image.
In one embodiment, as shown in
S2131: generating an image sequence from the previous frame fisheye images; the target previous frame fisheye image in the image sequence is the starting fisheye image.
Optionally, the computer device can use the target previous frame fisheye image as the starting fisheye image, and sort the previous frame fisheye images according to information such as the deployment position of each fisheye camera, the resolution of the captured image or the production time so as to generate an image sequence.
S2132: For any previous frame fisheye image in the image sequence, tracking feature points from the previous frame fisheye image based on feature points extracted from all previous frame fisheye images before the previous frame fisheye image.
For example, the image sequence includes a previous fisheye image 1, a previous fisheye image 2, a previous fisheye image 3 and a previous fisheye image 4, wherein the previous fisheye image 1 is the target previous frame fisheye image. For the previous fisheye image 2, the feature points of the previous fisheye image 2 can be tracked according to the feature points extracted from the previous fisheye image 1, so as to track the feature points from the previous fisheye image 2; for the previous fisheye image 4, the feature points of the previous fisheye image 4 can be tracked according to the feature points extracted from the previous fisheye image 1, the previous fisheye image 2 and the previous fisheye image 3, so as to track the feature points from the previous fisheye image 4.
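A minimal sketch of this sequential tracking over the image sequence, assuming a helper track_fn(src_img, src_pts, dst_img) that returns the points from src_pts found in dst_img, might look like the following in Python (duplicate detections from different source images would still need to be merged in practice):

def track_across_sequence(image_sequence, target_points, track_fn):
    # image_sequence is ordered so that index 0 is the target previous frame fisheye image.
    tracked = {0: list(target_points)}
    for i in range(1, len(image_sequence)):
        found = []
        # Track feature points into image i from all earlier images in the sequence.
        for j in range(i):
            found.extend(track_fn(image_sequence[j], tracked[j], image_sequence[i]))
        tracked[i] = found
    return tracked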
S2133: If the number of tracked feature points on the previous frame fisheye image is equal to the preset number of fisheye feature points, the tracked feature points are used as the previous frame fisheye feature points of the previous frame fisheye image.
In one embodiment, after the step in S2132 is performed, as shown in
S2134: If the number of feature points tracked on the previous frame fisheye image is less than the preset number of fisheye feature points, determine the number of compensating feature points of the previous frame fisheye image according to the number of feature points tracked on the previous frame fisheye image and the preset number of fisheye feature points.
In an embodiment of the present application, for any previous frame fisheye image in an image sequence, it can be determined whether the number of feature points tracked on the previous frame fisheye image is less than a preset number of fisheye feature points. If so, feature point compensation extraction can be performed on the previous frame fisheye image so that the number of previous frame fisheye feature points of the previous frame fisheye image is equal to the preset number of fisheye feature points.
In one embodiment, the computer device can obtain the number of compensated feature points of the previous frame fisheye image by subtracting the number of feature points tracked on the previous frame fisheye image from the preset number of fisheye feature points.
For example, if the number of tracked feature points on the previous frame fisheye image is a, and the preset number of fisheye feature points is $Th_f$, then the number of compensated feature points of the previous frame fisheye image may be equal to $Th_f - a$.
S2135: Performing feature point compensation extraction on the previous frame fisheye image according to the number of compensated feature points of the previous frame fisheye image to obtain the previous frame fisheye feature points of the previous frame fisheye image.
Furthermore, a feature point extraction algorithm may be used to extract, from the previous frame fisheye image, a number of feature points equal to the number of compensation feature points, so as to complete the feature point compensation extraction of the previous frame fisheye image and obtain the previous frame fisheye feature points of the previous frame fisheye image.
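The compensation step in S2134-S2135 can be sketched as follows in Python; extract_fn(image, n) is an assumed helper returning n newly extracted feature points:

def compensate_feature_points(prev_fisheye_image, tracked_points, preset_count, extract_fn):
    # Number of compensation feature points = preset count minus the tracked count (Th_f - a).
    num_compensation = preset_count - len(tracked_points)
    if num_compensation <= 0:
        return list(tracked_points)
    extra_points = extract_fn(prev_fisheye_image, num_compensation)
    # The previous frame fisheye feature points are the tracked points plus the compensated points.
    return list(tracked_points) + list(extra_points)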
The technical solution in some embodiments of the present application can obtain the fisheye feature points of the previous frame fisheye images captured by the fisheye camera, and track the inter-frame feature points of the current frame fisheye image based on the previous frame fisheye feature points, so as to improve the accuracy of the current frame fisheye feature points of the current frame fisheye image finally obtained; at the same time, the method does not require complex algorithms to be implemented, and the process is relatively simple, thereby improving the speed of inter-frame feature point tracking.
The process of tracking the feature points of the current frame ordinary image to obtain the current frame ordinary feature points of the current frame ordinary image is described below. In one embodiment, as shown in
S230: obtaining historical frame ordinary feature points of historical frame ordinary images captured by a preset type of camera.
The above-mentioned historical frame ordinary images may include any multiple frames of ordinary images before the current frame. For example, the historical frame ordinary image may include any two frames of ordinary images, any three frames of ordinary images, any four frames of ordinary images, etc. before the current frame.
In practical applications, the historical frame ordinary images collected by the preset type of camera can be stored in a local location, a cloud location, a disk or a hard disk, etc. Correspondingly, the computer device can obtain the historical frame ordinary images collected by the preset type of camera from a local location, a cloud location, a disk or a hard disk, etc., and then use a feature point extraction algorithm to extract feature points of the historical frame ordinary images to obtain feature points of the historical frame ordinary images, that is, the historical frame ordinary feature points of the historical frame ordinary images.
S240: tracking inter-frame feature points of the current frame ordinary image according to the historical frame ordinary feature points of the historical frame ordinary images to obtain candidate ordinary feature points of the current frame ordinary image.
In practical applications, the computer device may use a feature point tracking algorithm to track the inter-frame feature points of the current frame ordinary image based on the historical frame ordinary feature points, and obtain the candidate ordinary feature points of the current frame ordinary image.
S250: If the number of candidate ordinary feature points of the current frame ordinary image is equal to the preset number of ordinary feature points, the candidate ordinary feature points of the current frame ordinary image are determined as the current frame ordinary feature points of the current frame ordinary image.
The computer device can determine whether the number of candidate ordinary feature points of the current frame ordinary image is equal to the preset number of ordinary feature points. If so, the candidate ordinary feature points of the current frame ordinary image are determined as the current frame ordinary feature points of the current frame ordinary image.
Optionally, the above-mentioned preset number of ordinary feature points can be user-defined or determined by historical experience values. In one embodiment of the present application, the preset number of ordinary feature points can be equal to the maximum number of feature points of an ordinary image.
At the same time, after the step in S240 is executed, the method may further include: if the number of candidate ordinary feature points of the current frame ordinary image is less than the preset number of ordinary feature points, tracking the feature points of the current frame ordinary image according to the historical frame ordinary feature points of the historical frame ordinary images captured by the preset type of camera, and determining the current frame ordinary feature points of the current frame ordinary image.
In an embodiment of the present application, for the current frame ordinary image, if it is determined that the number of candidate ordinary feature points of the current frame ordinary image is less than the preset number of ordinary feature points, the historical frame ordinary feature points of the historical frame ordinary images captured by a preset type of camera can be obtained, and then a feature point tracking algorithm is used to track the feature points of the current frame ordinary image based on the historical frame ordinary feature points of the historical frame ordinary images to obtain the current frame ordinary feature points of the current frame ordinary image.
The historical frame ordinary feature points of the historical frame ordinary image may be obtained by extracting feature points from a previous frame ordinary image using a feature point extraction algorithm.
In one embodiment, the camera system includes at least one fisheye camera; if the historical frame fisheye image includes a previous frame fisheye image of the current frame fisheye image, the historical frame ordinary image includes a previous frame ordinary image of the current frame ordinary image, and the previous frame fisheye image is an initial frame fisheye image, and the previous frame ordinary image is an initial frame ordinary image, then as shown in
S231: obtaining the previous frame fisheye feature points of the previous frame fisheye images captured by each fisheye camera.
In an embodiment of the present application, the camera system may include at least one fisheye camera. Specifically, the computer device may obtain the previous frame fisheye images captured by each fisheye camera from a local location, a cloud, a disk, or a hard disk.
In practical applications, the sizes of fisheye images captured by different fisheye cameras may be equal or unequal, which is not limited in the embodiments of the present application.
The computer device may use a feature point extraction algorithm to extract feature points from each previous frame fisheye image to obtain the previous frame fisheye feature points of each previous frame fisheye image.
S232: tracking feature points of the previous frame ordinary image according to the previous frame fisheye feature points to obtain candidate ordinary feature points of the previous frame ordinary image.
When the previous frame ordinary image is an initial frame ordinary image, the initial frame ordinary image can be understood as an ordinary image captured by a preset type of camera at an initial moment during the camera system shooting process.
Optionally, based on the previous frame fisheye feature points of each previous frame fisheye image obtained in the previous steps, a feature point tracking algorithm can be used to track the feature points of the previous frame ordinary image according to the previous frame fisheye feature points to obtain candidate ordinary feature points of the previous frame ordinary image.
S233: If the number of candidate ordinary feature points of the previous frame ordinary image is equal to the preset number of ordinary feature points, the candidate ordinary feature points of the previous frame ordinary image are determined as the previous frame ordinary feature points of the previous frame ordinary image.
In an embodiment of the present application, if the previous frame ordinary image is not the initial frame ordinary image, the method for obtaining the previous frame ordinary feature points of the previous frame ordinary image may include obtaining the ordinary feature points of the ordinary image at the previous moment corresponding to the previous frame ordinary image, and based on the ordinary feature points of the ordinary image at the previous moment corresponding to the previous frame ordinary image, performing inter-frame feature point tracking on the previous frame ordinary image to obtain candidate ordinary feature points of the previous frame ordinary image, and if the number of candidate ordinary feature points of the previous frame ordinary image is equal to the preset number of ordinary feature points, the candidate ordinary feature points of the previous frame ordinary image are determined as the previous frame ordinary feature points of the previous frame ordinary image.
It can be understood that, except for the initial frame ordinary image, the ordinary feature points of each other previous frame ordinary image are obtained by tracking the inter-frame feature points of that previous frame ordinary image according to the ordinary feature points of the ordinary image at the previous moment to obtain the candidate ordinary feature points of that previous frame ordinary image; if the number of candidate ordinary feature points of that previous frame ordinary image is equal to the preset number of ordinary feature points, the candidate ordinary feature points are determined as the ordinary feature points of that previous frame ordinary image.
In one embodiment, after the step in S232 is performed, as shown in
S234: If the number of candidate ordinary feature points of the previous frame ordinary image is less than the preset number of ordinary feature points, determining the number of compensating feature points of the previous frame ordinary image according to the total number of candidate ordinary feature points of the previous frame ordinary image and the preset number of ordinary feature points.
In an embodiment of the present application, when it is determined that the number of candidate ordinary feature points of the previous frame ordinary image is less than the preset number of ordinary feature points, the feature points of the previous frame ordinary image can be compensated so that the number of ordinary feature points of the previous frame ordinary image is equal to the preset number of ordinary feature points.
Optionally, the computer device may obtain the number of compensated feature points of the previous frame ordinary image by subtracting the total number of candidate ordinary feature points of the previous frame ordinary image from the preset number of ordinary feature points.
For example, if the number of candidate ordinary feature points of the previous frame ordinary image is b, and the preset number of ordinary feature points is $Th_p$, then the number of compensated feature points of the previous frame ordinary image may be equal to $Th_p - b$.
S235: performing feature point compensation extraction on the previous frame ordinary image according to the number of compensated feature points of the previous frame ordinary image.
Furthermore, a feature point extraction algorithm may be used to perform feature point compensation extraction on the previous frame ordinary image, and a number of feature points equal to the number of compensated feature points may be extracted from the previous frame ordinary image.
S236: determining the previous frame ordinary feature points of the previous frame ordinary image according to a result of the feature point compensation extraction of the previous frame ordinary image and the candidate ordinary feature points.
In practical applications, the compensated feature points extracted from the previous frame ordinary image (i.e., the result of the feature point compensation extraction) and the candidate ordinary feature points can together be determined as the previous frame ordinary feature points of the previous frame ordinary image.
The technical solution in one embodiment of the present application can obtain the previous frame ordinary feature points of the previous frame ordinary image captured by a preset type of camera, and track the inter-frame feature points of the current frame ordinary image based on the previous frame ordinary feature points of the previous frame ordinary image to obtain candidate ordinary feature points of the current frame ordinary image. When the number of candidate ordinary feature points of the current frame ordinary image is equal to the preset number of ordinary feature points, the candidate ordinary feature points of the current frame ordinary image are determined as the current frame ordinary feature points of the current frame ordinary image, so that the current frame ordinary feature points of the current frame ordinary image finally obtained are more accurate. At the same time, the method does not require complex algorithms to be implemented, and the process is relatively simple, thereby improving the speed of inter-frame feature point tracking.
The following is an explanation of the process of determining the current pose information of the camera system based on the current frame fisheye feature points and the current frame ordinary feature points. In one embodiment, as shown in
S310: obtaining previous frame fisheye feature points of a previous frame fisheye image captured by a fisheye camera, and previous frame ordinary feature points of a previous frame ordinary image captured by a preset type of camera.
The computer device may obtain the predetermined previous frame fisheye feature points of the previous frame fisheye image captured by the fisheye camera, and the predetermined previous frame ordinary feature points of the previous frame ordinary image captured by the preset type of camera.
In addition, the computer device can also obtain the previous frame fisheye image captured by the fisheye camera and the previous frame ordinary image captured by a preset type camera from a local location, a cloud location, a disk or a hard disk, and then use a feature point extraction algorithm to extract feature points from the previous frame fisheye image to obtain the previous frame fisheye feature points of the previous frame fisheye image; at the same time, a feature point extraction algorithm can also be used to extract feature points from the previous frame ordinary image to obtain the previous frame ordinary feature points of the previous frame ordinary image.
S320: constructing a current pose observation model of the camera system according to the current frame fisheye feature points, the previous frame fisheye feature points, the current frame ordinary feature points and the previous frame ordinary feature points.
Optionally, the computer device can use a pose estimation algorithm to perform pose estimation processing on the current frame fisheye feature points, the previous frame fisheye feature points, the current frame ordinary feature points and the previous frame ordinary feature points to obtain the current pose observation model of the camera system.
Optionally, the above-mentioned pose estimation algorithm can be a model-based pose estimation algorithm, a feature-based pose estimation algorithm, or a deep learning-based camera pose estimation algorithm, etc., which is not limited to this embodiment of the present application.
In one embodiment, the current pose observation model includes a depth information observation model and a non-depth information observation model; the step of constructing the current pose observation model of the camera system according to the current frame fisheye feature points, the previous frame fisheye feature points, the current frame ordinary feature points and the previous frame ordinary feature points in the above S320 can be implemented in the following way: determining the depth information observation model according to the current frame fisheye feature points and the current frame ordinary feature points; and constructing the non-depth information observation model according to the current frame fisheye feature points, the previous frame fisheye feature points, the current frame ordinary feature points and the previous frame ordinary feature points.
In one embodiment of the present application, the computer device may obtain some or all feature points from the current frame fisheye feature points and the current frame ordinary feature points, and then process the obtained feature points to obtain the depth information observation model. Alternatively, the computer device may directly process the current frame fisheye feature points and the current frame ordinary feature points according to the construction strategy of the depth information observation model to obtain the depth information observation model.
At the same time, the computer device can obtain some or all feature points from the current frame fisheye feature points, the previous frame fisheye feature points, the current frame ordinary feature points and the previous frame ordinary feature points, and then process the obtained feature points to obtain a non-depth information observation model. Alternatively, the computer device can also directly process the current frame fisheye feature points, the previous frame fisheye feature points, the current frame ordinary feature points and the previous frame ordinary feature points according to the construction strategy of the non-depth information observation model to obtain a non-depth information observation model.
S330: constructing a current pose optimization function of the camera system according to the current pose observation model.
In one embodiment of the present application, there may be multiple current pose observation models. Based on the current pose observation model obtained in the previous steps, an arithmetic operation can be performed on the current pose observation model to obtain a current pose optimization function of the camera system.
Optionally, the arithmetic operation may be at least one of addition, subtraction, multiplication, division, logarithmic operation and exponential operation.
S340: determining current pose information of the camera system based on the current pose optimization function.
In practical applications, the computer device can use an optimization algorithm to solve the current pose optimization function to obtain the current pose information of the camera system. Optionally, the above-mentioned optimization algorithm can be a gradient descent method, a Newton method, a conjugate gradient method and/or a genetic algorithm, etc. In the embodiment of the present application, the above-mentioned optimization algorithm can be a least squares method, and the least squares method can be a Levenberg-Marquardt method or a Gauss-Newton iteration method, etc.
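As a hedged illustration of solving such an optimization function with a least-squares method, assuming a user-provided residual_fn(pose) that stacks the residual terms of the observation models for a 6-parameter pose (3 rotation parameters and 3 translation parameters):

import numpy as np
from scipy.optimize import least_squares

def solve_current_pose(residual_fn, initial_pose):
    # Minimize the current pose optimization function with the Levenberg-Marquardt method.
    result = least_squares(residual_fn, initial_pose, method="lm")
    return result.x  # optimized pose parameters of the camera system

# Hypothetical usage, starting from the previous pose estimate:
# current_pose = solve_current_pose(residual_fn, previous_pose)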
The technical solution in one embodiment of the present application obtains the previous frame fisheye feature points of the previous frame fisheye image captured by the fisheye camera, and the previous frame ordinary feature points of the previous frame ordinary image captured by the preset type camera, and constructs a current pose observation model of the camera system according to the current frame fisheye feature points, the previous frame fisheye feature points, the current frame ordinary feature points and the previous frame ordinary feature points, and constructs a current pose optimization function of the camera system according to the current pose observation model, and determines the current pose information of the camera system based on the current pose optimization function; the method can construct the current pose observation model of the camera system through the feature points of adjacent frame images, so that the constructed current pose observation model of the camera system is more accurate, and on this basis, the current pose information of the camera system can also be accurately determined.
In one embodiment, as shown in
S321: obtaining depth feature points having depth information from among the current frame fisheye feature points and the current frame ordinary feature points; the depth feature points include three-dimensional position coordinates of the feature points.
In practical applications, the computer device may filter feature points having depth information, i.e. depth feature points, from the current frame fisheye feature points and the current frame ordinary feature points.
It should be noted here that the current frame fisheye feature points may be feature points with depth information, and the current frame ordinary feature points may also be feature points with depth information.
S322: determining unit sphere coordinates of each depth feature point according to the current position coordinates of each depth feature point in the current frame image.
Optionally, the current frame image to which the depth feature point belongs may be a current frame fisheye image or a current frame ordinary image. Optionally, the current position coordinates of the depth feature point may be understood as the position coordinates corresponding to the depth feature point in the current frame image.
For any depth feature point, the current position coordinates of the depth feature point in the current frame image can be obtained according to the result of feature point extraction on the current frame image, and then the current position coordinates of the depth feature point in the current frame image can be converted to obtain the unit sphere coordinates of the depth feature point.
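One way to sketch this conversion in Python, assuming an unproject_fn supplied by the camera model (fisheye or ordinary) that maps pixel coordinates to a 3D ray in the camera coordinate system:

import numpy as np

def to_unit_sphere(pixel_coords, unproject_fn):
    # Convert the current position coordinates of a depth feature point into
    # unit sphere coordinates by normalizing the back-projected ray.
    ray = unproject_fn(pixel_coords)      # 3D direction in the camera coordinate system
    return ray / np.linalg.norm(ray)      # point on the unit sphere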
S323: determining a depth residual function of each depth feature point according to the current position coordinates of each depth feature point in a world coordinate system and the unit sphere coordinates of each depth feature point, as well as an external parameter between the current camera system coordinate system and the camera coordinate system corresponding to the current frame image to which each depth feature point belongs.
Among them, the camera coordinate system corresponding to the current frame image to which the above-mentioned depth feature points belong can be understood as the coordinate system corresponding to the camera that collects the current frame image to which the depth feature points belong. It should be noted here that if the current frame image is a current frame fisheye image, the camera of the current frame image can be a fisheye camera that collects the current frame image; if the current frame image is a current frame ordinary image, the camera of the current frame image can be a preset type of camera that collects the current frame image.
Optionally, the above-mentioned camera system coordinate system refers to a coordinate system created based on all cameras in the camera system. In the embodiment of the present application, the camera system coordinate system may be the same or different at different times, and the embodiment of the present application is not limited to this.
Optionally, the computer device can perform arithmetic operations on the current position coordinates of each depth feature point in the world coordinate system and the unit sphere coordinates of each depth feature point, as well as the external parameter between the current camera system coordinate system and the camera coordinate system corresponding to the current frame image to which each depth feature point belongs, to obtain the depth residual function of each depth feature point.
For the convenience of calculation, the camera system coordinate system corresponding to the initial frame can be used as the world coordinate system. At the same time, the above external parameter can include a rotation matrix and a translation vector.
In an embodiment of the present application, if the camera system includes m (m is greater than or equal to 1) fisheye cameras, the m fisheye cameras can be respectively expressed as $C_{f_1}, C_{f_2}, \ldots, C_{f_k}, \ldots, C_{f_m}$, where k is greater than or equal to 1 and less than or equal to m, and the depth residual function of the depth feature point $P_{1a}$ can be constructed as formula (1).
In formula (1), $X_a^W$ represents the current position coordinates of the depth feature point $P_{1a}$ in the world coordinate system, the unit sphere term represents the unit sphere coordinates of the depth feature point $P_{1a}$, and $R_B^{C_m}$ and $t_B^{C_m}$ represent the respective external parameters (i.e., the rotation matrix and the translation vector) between the current camera system coordinate system and the camera coordinate system corresponding to the current frame image to which the depth feature point $P_{1a}$ belongs.
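Purely as an illustration, writing the unit sphere coordinates of $P_{1a}$ as $\hat{x}_{1a}$ (a symbol introduced here for illustration, not taken from the original), a unit-sphere reprojection residual consistent with the above description could be sketched in LaTeX as follows; the exact form of formula (1) may differ:

f_{1a}\left(R_{B_c}^{W}, t_{B_c}^{W}\right) = \hat{x}_{1a} - \frac{R_{B}^{C_m}\left(R_{B_c}^{W}\right)^{-1}\left(X_a^{W} - t_{B_c}^{W}\right) + t_{B}^{C_m}}{\left\lVert R_{B}^{C_m}\left(R_{B_c}^{W}\right)^{-1}\left(X_a^{W} - t_{B_c}^{W}\right) + t_{B}^{C_m} \right\rVert}

Here $R_{B_c}^{W}$ and $t_{B_c}^{W}$ denote the current pose of the camera system in the world coordinate system.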
S324: determining a depth information observation model according to the depth residual function of each depth feature point.
Furthermore, the depth residual function of each depth feature point can be subjected to arithmetic operations to obtain a depth information observation model. In an embodiment of the present application, the 2-norm of the depth residual function of each depth feature point can be calculated and squared, and then the squared 2-norms of the depth residual functions of all depth feature points can be summed to obtain the depth information observation model.
In one embodiment of the present application, if the depth information observation model is denoted as F1, then F1 can be expressed by the following formula (2).
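Based on the description above (a sum of the squared 2-norms of the depth residual functions over all depth feature points), formula (2) can be sketched as follows; the index set and exact notation are assumptions:

F_1 = \sum_{a} \left\lVert f_{1a}\left(R_{B_c}^{W}, t_{B_c}^{W}\right) \right\rVert_2^2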
At the same time, in one embodiment, as shown in
S325: obtaining a plurality of first non-depth feature points, a plurality of second non-depth feature points, and a plurality of third non-depth feature points from the current frame fisheye feature points and the current frame ordinary feature points.
In one embodiment of the present application, the non-depth feature point screening conditions for screening the first non-depth feature point, the second non-depth feature point, and the third non-depth feature point are different.
In practical applications, the computer device may use different non-depth feature point screening conditions to screen a plurality of first non-depth feature points, a plurality of second non-depth feature points and a plurality of third non-depth feature points from the current frame fisheye feature points and the current frame ordinary feature points.
S326: constructing a non-depth residual function of each first non-depth feature point, a non-depth residual function of each second non-depth feature point, and a non-depth residual function of each third non-depth feature point.
Optionally, for any first non-depth feature point, the computer device may perform arithmetic operation and/or coordinate conversion processing on the current position coordinates of the first non-depth feature point to obtain the non-depth residual function of the first non-depth feature point.
Optionally, for any second non-depth feature point, the computer device may perform arithmetic operation and/or coordinate conversion processing on the current position coordinates of the second non-depth feature point to obtain the non-depth residual function of the second non-depth feature point.
Optionally, for any third non-depth feature point, the computer device may perform arithmetic operation and/or coordinate conversion processing on the current position coordinates of the third non-depth feature point to obtain the non-depth residual function of the third non-depth feature point.
In one embodiment of the present application, the construction methods of the non-depth residual function of each first non-depth feature point, the non-depth residual function of each second non-depth feature point, and the non-depth residual function of each third non-depth feature point may be the same or different.
S327: determining a non-depth information observation model according to the non-depth residual function of each first non-depth feature point, the non-depth residual function of each second non-depth feature point, and the non-depth residual function of each third non-depth feature point.
The computer device can perform arithmetic operations on the non-depth residual function of each first non-depth feature point to obtain a corresponding non-depth information observation model. In an embodiment of the present application, the 2-norm of the non-depth residual function of each first non-depth feature point can be calculated and squared, and then the squared 2-norms can be summed over all first non-depth feature points to obtain the corresponding non-depth information observation model.
At the same time, the computer device can perform arithmetic operations on the non-depth residual function of each second non-depth feature point to obtain a corresponding non-depth information observation model. In an embodiment of the present application, the 2-norm of the non-depth residual function of each second non-depth feature point can be calculated and squared, and then the squared 2-norms can be summed over all second non-depth feature points to obtain the corresponding non-depth information observation model.
In addition, the computer device can perform arithmetic operations on the non-depth residual function of each third non-depth feature point to obtain a corresponding non-depth information observation model. In an embodiment of the present application, the 2-norm of the non-depth residual function of each third non-depth feature point can be calculated and squared, and then the squared 2-norms can be summed over all third non-depth feature points to obtain the corresponding non-depth information observation model.
In an embodiment of the present application, each first non-depth feature point represents a current frame feature point without depth information that is in at least two current frame images and in at least two previous frame images; each second non-depth feature point represents a current frame feature point without depth information that is only in one current frame image and in at least two previous frame images; each third non-depth feature point represents a current frame feature point without depth information that is in at least two current frame images and in only one previous frame image; wherein the current frame image is either a current frame fisheye image or a current frame ordinary image; and the previous frame image is either a previous frame fisheye image or a previous frame ordinary image.
It should be noted that the camera system may include multiple fisheye cameras and multiple preset type of cameras. Correspondingly, the computer device may obtain multiple current frame fisheye images and multiple current frame ordinary images, and then obtain the current frame fisheye feature points of each current frame fisheye image and the current frame ordinary feature points of each current frame ordinary image, and then first filter out all current frame feature points without depth information from all current frame fisheye feature points and all current frame ordinary feature points.
Furthermore, based on all the acquired current frame feature points without depth information, for each current frame feature point without depth information, it can be determined whether the current frame feature point without depth information is in at least two current frame images and in at least two previous frame images. If so, the current frame feature point without depth information is determined as the first non-depth feature point.
For each current frame feature point without depth information, it is also possible to determine whether the current frame feature point without depth information is only in one current frame image and in at least two previous frame images. If so, the current frame feature point without depth information is determined as the second non-depth feature point.
For each current frame feature point without depth information, it can also be determined whether the current frame feature point without depth information is in at least two current frame images and only in one previous frame image. If so, the current frame feature point without depth information is determined as the third non-depth feature point.
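This screening of the three kinds of non-depth feature points can be sketched in Python as follows; observations is an assumed mapping from a feature point identifier to the number of current frame images and previous frame images in which that point (without depth information) appears:

def classify_non_depth_feature_points(observations):
    first, second, third = [], [], []
    for point_id, (num_current, num_previous) in observations.items():
        if num_current >= 2 and num_previous >= 2:
            first.append(point_id)    # first non-depth feature point
        elif num_current == 1 and num_previous >= 2:
            second.append(point_id)   # second non-depth feature point
        elif num_current >= 2 and num_previous == 1:
            third.append(point_id)    # third non-depth feature point
    return first, second, third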
The embodiment of the present application can obtain feature points without depth information, and then determine the corresponding non-depth information observation model according to each non-depth feature point, and use the non-depth information observation model as basic information to provide a reference basis for accurately determining the current pose information of the camera system.
The technical solution in some embodiments of the present application obtains depth feature points with depth information from the current frame fisheye feature points and the current frame ordinary feature points, determines the unit sphere coordinates of each depth feature point according to the current position coordinates of each depth feature point in the current frame image to which it belongs, determines the depth residual function of each depth feature point according to the current position coordinates of each depth feature point in the world coordinate system and the unit sphere coordinates of each depth feature point, as well as the external parameter between the current camera system coordinate system and the camera coordinate system corresponding to the current frame image to which each depth feature point belongs, and determines the depth information observation model according to the depth residual function of each depth feature point; the method can obtain depth feature points with depth information, and then determine the corresponding depth information observation model according to each depth feature point, and use the depth information observation model as basic information to provide a reference basis for accurately determining the current pose information of the camera system.
The process of constructing the non-depth residual function of each first non-depth feature point is described below. In one embodiment, as shown in
S326a: obtaining previous position coordinates of each first non-depth feature point on the corresponding previous frame image.
Optionally, the previous position coordinates of each first non-depth feature point on the corresponding previous frame image may be obtained according to the result of feature point extraction of each first non-depth feature point on the corresponding previous frame image.
It should be noted here that the previous position coordinates of the first non-depth feature point can be understood as the position coordinates of the first non-depth feature point on the previous frame image.
S326b: performing triangulation processing on the previous position coordinates of each first non-depth feature point to obtain the three-dimensional previous position coordinates of each first non-depth feature point in the world coordinate system; and performing triangulation processing on the current position coordinates of each first non-depth feature point to obtain the three-dimensional current position coordinates of each first non-depth feature point in the current camera system coordinate system.
In practical applications, the depth information of feature points can be obtained by triangulating the position coordinates of feature points.
In an embodiment of the present application, for any first non-depth feature point, the computer device can perform triangulation processing on the previous position coordinates of the first non-depth feature point to obtain the three-dimensional previous position coordinates of the first non-depth feature point in the world coordinate system; at the same time, the current position coordinates of the first non-depth feature point can also be triangulated to obtain the three-dimensional current position coordinates of the first non-depth feature point in the current camera system coordinate system.
It should be noted here that the first non-depth feature point can be in at least one current frame image, and naturally, the number of current position coordinates of the first non-depth feature point can be at least one; the first non-depth feature point can also be in at least one previous frame image, and naturally, the number of previous position coordinates of the first non-depth feature point can be at least one.
S326c: constructing a non-depth residual function of each first non-depth feature point according to the three-dimensional previous position coordinates of each first non-depth feature point in the world coordinate system and the three-dimensional current position coordinates of each first non-depth feature point in the current camera system coordinate system.
Among them, for any first non-depth feature point, arithmetic operations can be performed on the three-dimensional previous position coordinates of the first non-depth feature point in the world coordinate system and the three-dimensional current position coordinates of the first non-depth feature point in the current camera system coordinate system to obtain the non-depth residual function of the first non-depth feature point.
In one embodiment of the present application, referring to the previous example, if the first non-depth feature points are respectively represented as P21, P22, . . . , P2b, then for the first non-depth feature point P2b, the non-depth residual function ƒ2b(RBcW, tBcW) of the first non-depth feature point P2b can be constructed by the following formula (3).
Wherein, in formula (3), the terms respectively represent the three-dimensional previous position coordinates of the first non-depth feature point P2b in the world coordinate system and the three-dimensional current position coordinates of the first non-depth feature point P2b in the current camera system coordinate system.
Furthermore, the non-depth information observation model determined according to the non-depth residual function ƒ2b(RBcW, tBcW) of the first non-depth feature point P2b can be expressed by the following formula (4).
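For illustration only, one plausible form of the non-depth residual of a first non-depth feature point, in the spirit of formula (3), can be sketched as follows; the exact formula of the application may differ, and the transform direction assumed for RBcW, tBcW is an assumption:

```python
import numpy as np

def first_non_depth_residual(P_prev_world, P_curr_system, R_BcW, t_BcW):
    """P_prev_world: three-dimensional previous position coordinates in the world coordinate system.
    P_curr_system: three-dimensional current position coordinates in the current camera system coordinate system.
    R_BcW, t_BcW: current pose of the camera system (assumed world -> current system frame)."""
    predicted = R_BcW @ np.asarray(P_prev_world) + t_BcW   # previous world point mapped into the current system frame
    return np.asarray(P_curr_system) - predicted           # should vanish at the true pose
```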
Meanwhile, in another embodiment, as shown in
S326d: obtaining the previous position coordinates of each second non-depth feature point on the corresponding previous frame image.
Optionally, the previous position coordinates of each second non-depth feature point on the corresponding previous frame image may be obtained according to the result of feature point extraction on each previous frame image.
It should be noted here that the previous position coordinates of the second non-depth feature point can be understood as the position coordinates of the second non-depth feature point on the previous frame image.
S326e: performing triangulation processing on the previous position coordinates of each second non-depth feature point to obtain the three-dimensional previous position coordinates of each second non-depth feature point in the world coordinate system.
In the embodiment of the present application, for any second non-depth feature point, the computer device can triangulate the previous position coordinates of the second non-depth feature point to obtain the three-dimensional previous position coordinates of the second non-depth feature point in the world coordinate system.
It should be noted here that the second non-depth feature point may be in at least one previous frame image, and naturally, the number of previous position coordinates of the second non-depth feature point may be at least one.
S326f: determining the unit sphere coordinates of each second non-depth feature point according to the current position coordinates of each second non-depth feature point in the current frame image to which it belongs.
Optionally, the current frame image to which the second non-depth feature point belongs may be a current frame fisheye image or a current frame ordinary image. Optionally, the current position coordinates of the second non-depth feature point may be understood as the position coordinates corresponding to the second non-depth feature point in the current frame image.
Among them, for any second non-depth feature point, the current position coordinates of the second non-depth feature point in the current frame image can be obtained based on the result of feature point extraction on the current frame image, and then the current position coordinates of the second non-depth feature point in the current frame image can be converted to obtain the unit sphere coordinates of the second non-depth feature point.
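For illustration only, the conversion of position coordinates into unit sphere coordinates can be sketched as follows for a pinhole-type camera with an intrinsic matrix K; a fisheye image would instead use the unprojection of the fisheye camera's own imaging model, and the matrix K here is an assumption:

```python
import numpy as np

def pixel_to_unit_sphere(uv, K):
    """uv: (u, v) position coordinates of the feature point in its image.
    K: 3x3 intrinsic matrix of the camera that captured the image (illustrative pinhole model)."""
    ray = np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])   # back-projected viewing ray
    return ray / np.linalg.norm(ray)                          # unit sphere coordinates
```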
S326g: constructing a non-depth residual function of each second non-depth feature point based on the three-dimensional previous position coordinates of each second non-depth feature point in the world coordinate system, the unit sphere coordinates of each second non-depth feature point, and the external parameter between the current camera system coordinate system and the camera coordinate system corresponding to the current frame image to which each second non-depth feature point belongs.
Optionally, for any second non-depth feature point, arithmetic operations can be performed on the three-dimensional previous position coordinates of the second non-depth feature point in the world coordinate system, the unit sphere coordinates of the second non-depth feature point, and the external parameter between the current camera system coordinate system and the camera coordinate system corresponding to the current frame image to which the second non-depth feature point belongs to obtain the non-depth residual function of the second non-depth feature point.
In one embodiment of the present application, referring to the previous example, if the second non-depth feature points are respectively represented as P31, P32, . . . , P3c, then for the second non-depth feature point P3c, the non-depth residual function ƒ3c(RBcW, tBcW) of the second non-depth feature point P3c can be constructed by the following formula (5).
Wherein, in formula (5), the terms respectively represent the three-dimensional previous position coordinates of the second non-depth feature point P3c in the world coordinate system and the unit sphere coordinates of the second non-depth feature point P3c, and RBCn and tBCn represent the external parameter between the current camera system coordinate system and the camera coordinate system corresponding to the current frame image to which the second non-depth feature point P3c belongs.
Furthermore, the non-depth information observation model determined according to the non-depth residual function ƒ3c(RBcW, tBcW) of the second non-depth feature point P3c can be expressed by the following formula (6).
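For illustration only, one plausible form of the non-depth residual of a second non-depth feature point, in the spirit of formula (5), can be sketched as follows; the exact formula may differ, and the names and transform directions are assumptions:

```python
import numpy as np

def second_non_depth_residual(P_prev_world, s_obs, R_BcW, t_BcW, R_BC, t_BC):
    """P_prev_world: three-dimensional previous position coordinates in the world coordinate system.
    s_obs: unit sphere coordinates of the point in its current frame image.
    R_BcW, t_BcW: current pose of the camera system (assumed world -> current system frame).
    R_BC, t_BC: external parameter between the current camera system frame and the observing camera frame."""
    P_b = R_BcW @ np.asarray(P_prev_world) + t_BcW   # into the current camera system frame
    P_c = R_BC @ P_b + t_BC                          # into the observing camera frame
    return s_obs - P_c / np.linalg.norm(P_c)         # bearing-style residual over R_BcW, t_BcW
```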
In addition, in one embodiment, as shown in
S326h: performing triangulation processing on the current position coordinates of each third non-depth feature point on the current frame image to obtain the three-dimensional current position coordinates of each third non-depth feature point in the current camera coordinate system.
Optionally, the current position coordinates of the third non-depth feature point may be understood as the position coordinates of the third non-depth feature point on the current frame image.
In actual applications, the computer device can obtain the current position coordinates of each third non-depth feature point on the current frame image to which it belongs, and then for any third non-depth feature point, the current position coordinates of the third non-depth feature point on the current frame image to which it belongs can be triangulated to obtain the three-dimensional current position coordinates of the third non-depth feature point in the current camera coordinate system.
S326i: determining the unit sphere coordinates of each third non-depth feature point according to the previous position coordinates of each third non-depth feature point in the corresponding previous frame image.
Optionally, the previous frame image to which the third non-depth feature point belongs can be a previous frame fisheye image or a previous frame ordinary image. Optionally, the previous position coordinates of the third non-depth feature point can be understood as the position coordinates corresponding to the third non-depth feature point in the previous frame image.
Among them, for any third non-depth feature point, the previous position coordinates of the third non-depth feature point in the previous frame image can be obtained based on the result of feature point extraction on the previous frame image, and then the previous position coordinates of the third non-depth feature point in the previous frame image can be converted to obtain the unit sphere coordinates of the third non-depth feature point.
S326j: constructing a non-depth residual function of each third non-depth feature point based on the three-dimensional current position coordinates of each third non-depth feature point in the current camera coordinate system, the unit sphere coordinates of each third non-depth feature point, transformation relationship between the world coordinate system and the previous camera system coordinate system, and an external parameter between the current camera system coordinate system and the camera coordinate system corresponding to the previous frame image to which each third non-depth feature point belongs.
Optionally, for any third non-depth feature point, arithmetic operations can be performed on the three-dimensional current position coordinates of the third non-depth feature point in the current camera coordinate system, the unit sphere coordinates of the third non-depth feature point, the transformation relationship between the world coordinate system and the previous camera system coordinate system, and the external parameter between the current camera system coordinate system and the camera coordinate system corresponding to the previous frame image to which the third non-depth feature point belongs, to obtain the non-depth residual function of the third non-depth feature point.
In one embodiment of the present application, referring to the previous example, if the third non-depth feature points are respectively represented as P41, P42, . . . , P4d, then for the third non-depth feature point P4d, the non-depth residual function ƒ4d(RBcW, tBcW) of the third non-depth feature point P4d can be constructed by the following formula (7).
Among them, in formula (7), the terms respectively represent the three-dimensional current position coordinates of the third non-depth feature point P4d in the current camera coordinate system and the unit sphere coordinates of the third non-depth feature point P4d; RWBl and tWBl represent the transformation relationship between the world coordinate system and the previous camera system coordinate system, and RBCz and tBCz respectively represent the external parameter between the current camera system coordinate system and the camera coordinate system corresponding to the previous frame image to which the third non-depth feature point P4d belongs.
Further, the non-depth information observation model determined according to the non-depth residual function ƒ4d(RBcW, tBcW) of the third non-depth feature point P4d can be expressed by the following formula (8).
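For illustration only, one plausible form of the non-depth residual of a third non-depth feature point, in the spirit of formula (7), can be sketched as follows; the exact formula may differ, and all names and transform directions are assumptions (for simplicity the triangulated current point is taken in the current camera system frame):

```python
import numpy as np

def third_non_depth_residual(P_curr, s_prev_obs, R_BcW, t_BcW, R_WBl, t_WBl, R_BCz, t_BCz):
    """P_curr: three-dimensional current position coordinates (taken here in the current camera system frame).
    s_prev_obs: unit sphere coordinates observed in the previous frame image.
    R_BcW, t_BcW: current system pose (assumed world -> current system frame).
    R_WBl, t_WBl: transformation between the world and the previous camera system frame (assumed world -> previous system).
    R_BCz, t_BCz: external parameter for the camera of the previous frame image."""
    P_w = R_BcW.T @ (np.asarray(P_curr) - t_BcW)   # current point mapped back into the world frame
    P_bl = R_WBl @ P_w + t_WBl                     # into the previous camera system frame
    P_cl = R_BCz @ P_bl + t_BCz                    # into the camera frame of the previous frame image
    return s_prev_obs - P_cl / np.linalg.norm(P_cl)
```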
It should be noted here that, in one embodiment of the present application, based on the constructed depth information observation model F1 corresponding to the depth feature points, the non-depth information observation model F2 corresponding to the first non-depth feature points, the non-depth information observation model F3 corresponding to the second non-depth feature points, and the non-depth information observation model F4 corresponding to the third non-depth feature points, the constructed current pose optimization function of the camera system can be expressed by formula (9).
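For illustration only, one way in which the observation models F1, F2, F3 and F4 could be combined into a single pose optimization in the spirit of formula (9) is sketched below: every feature point contributes a residual that depends on the current system pose, all residuals are stacked, and a nonlinear least squares solver refines the pose. The axis-angle parameterization and the SciPy solver are illustrative choices, not the solver prescribed by the present application:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def optimize_pose(residual_fns, R0, t0):
    """residual_fns: callables r(R_BcW, t_BcW) -> residual vector, one per depth,
    first, second or third feature point. R0, t0: initial guess for the current pose."""
    x0 = np.concatenate([Rotation.from_matrix(R0).as_rotvec(), np.asarray(t0, dtype=float)])

    def cost(x):
        R = Rotation.from_rotvec(x[:3]).as_matrix()
        t = x[3:6]
        return np.concatenate([np.atleast_1d(r(R, t)) for r in residual_fns])

    sol = least_squares(cost, x0)                       # minimizes the stacked squared residuals
    return Rotation.from_rotvec(sol.x[:3]).as_matrix(), sol.x[3:6]
```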
In addition, it should be noted that the computer device can obtain the previous pose information of the camera system, and perform triangulation processing on all the current frame feature points (the current frame fisheye feature points and the current frame ordinary feature points) according to the current pose information and the previous pose information of the camera system, so as to obtain the three-dimensional position coordinates of each current frame feature point in the world coordinate system, in preparation for determining the next pose information of the camera system at the next moment.
At the same time, feature point post-processing (i.e., feature point supplementation) can be performed on the current frame ordinary image and the current frame fisheye image. Optionally, the ordinary feature points of the current frame ordinary image are supplemented, and the fisheye feature points of the current frame fisheye image are supplemented to prepare for determining the next pose information of the camera system at the next moment.
Among them, for the current frame fisheye image Fƒkc, the number of fisheye feature points that need to be supplemented is Aƒk, and the number of currently tracked current frame fisheye feature points is Tƒk. If Tƒk is smaller than the preset number of fisheye feature points Thƒ, then Aƒk=Thƒ−Tƒk; if Tƒk=Thƒ, then Aƒk=0.
Optionally, each current frame fisheye image can be sorted. For the current frame fisheye image Fƒ1c of sorted number 1, a feature point extraction algorithm can be used to extract Aƒ1 feature points from the current frame fisheye image Fƒ1c as the fisheye feature points supplemented by the current frame fisheye image Fƒ1c. For the current frame fisheye image Fƒ2c of sorted number 2, feature point tracking can be performed on the current frame fisheye image Fƒ2c based on the fisheye feature points supplemented by the current frame fisheye image Fƒ1c. If the number of newly tracked fisheye feature points (i.e., supplemented fisheye feature points) of the current frame fisheye image Fƒ2c is equal to Aƒk, then the tracking of the current frame fisheye image Fƒ2c is terminated. If the number of newly tracked fisheye feature points of the current frame fisheye image Fƒ2c is less than Aƒk, then Aƒk−rƒ2 feature points are extracted from the current frame fisheye image Fƒ2c, where rƒ2 represents the number of newly tracked fisheye feature points of the current frame fisheye image Fƒ2c.
For the current fisheye image Fƒ4c of sorted number 4, the feature points of the current fisheye image Fƒ4c can be tracked according to the fisheye feature points supplemented by the current fisheye image Fƒ1c and the fisheye feature points supplemented by the current fisheye image Fƒ2c. If the number of newly tracked fisheye feature points (i.e., supplemented fisheye feature points) of the current fisheye image Fƒ4c is equal to Aƒk, then the tracking of the current fisheye image Fƒ4c is terminated. If the number of newly tracked fisheye feature points of the current fisheye image Fƒ4c is less than Aƒk, then Aƒk−rƒ4 feature points are extracted from the current fisheye image Fƒ4c, where rƒ4 represents the number of newly tracked fisheye feature points of the current fisheye image Fƒ4c. Similarly, for any current fisheye image of a later sorted number, when supplementing feature points, the feature points of that current fisheye image can be tracked according to the supplemented fisheye feature points of all current fisheye images before it, until the number of supplemented fisheye feature points is equal to Aƒk.
In addition, for the current frame ordinary image Fpoc, the number of ordinary feature points that need to be supplemented is Apo, and the number of currently tracked ordinary feature points is Tpo. If Tpo is smaller than the preset number of ordinary feature points Thp, then Apo=Thp−Tpo; if Tpo=Thp, then Apo=0.
For any current frame ordinary image, feature point tracking can be performed on the current frame ordinary image according to the supplemented fisheye feature points of each current frame fisheye image to obtain newly supplemented ordinary feature points of the current frame ordinary image. If the number of newly supplemented ordinary feature points of the current frame ordinary image is equal to the number Apo of ordinary feature points that need to be supplemented in the current frame ordinary image, the tracking of the current frame ordinary image is terminated. If the number of newly tracked ordinary feature points of the current frame ordinary image is less than Apo, then Apo−rp feature points are extracted from the current frame ordinary image, where rp represents the number of newly tracked ordinary feature points of the current frame ordinary image.
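For illustration only, the supplementation count rule described above can be written as the following small sketch; the names are hypothetical, and the preset number simply plays the role of the thresholds Thƒ and Thp above:

```python
def points_to_supplement(tracked, preset):
    """Number of feature points to supplement for an image: preset - tracked when
    fewer points are tracked than the preset number, otherwise zero."""
    return preset - tracked if tracked < preset else 0

# Example: with a preset number of 150 feature points and 120 currently tracked,
# 30 feature points would be supplemented; with 150 tracked, none would be.
assert points_to_supplement(120, 150) == 30
assert points_to_supplement(150, 150) == 0
```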
Exemplarily, a camera system includes a fisheye camera and a preset type of camera, the fisheye camera captures fisheye video (including the first frame fisheye image 1, the second frame fisheye image 2, the third frame fisheye image 3, the fourth frame fisheye image 4 and the fifth frame fisheye image 5), the preset type of camera captures ordinary video (including the first frame ordinary image 1, the second frame ordinary image 2, the third frame ordinary image 3, the fourth frame ordinary image 4 and the fifth frame ordinary image 5), and the fisheye video and the ordinary video have the same time period, as shown in
In some embodiments, the above initialization process includes: obtaining initial pose information of the camera system in the first frame, and extracting feature points of fisheye image 1 and ordinary image 1 to obtain fisheye feature points 1 and ordinary feature points 1, respectively; at this time, it is determined that the initialization is completed. In addition, in the case that the feature point extraction of fisheye image 1 and ordinary image 1 fails, the initialization continues, and feature point extraction is performed on fisheye image 2 and ordinary image 2. If this extraction also fails, feature point extraction continues on the next frame fisheye image and the next frame ordinary image, until the feature point extraction is successful and the initialization is determined to be completed. It should be noted here that if the feature point extraction of fisheye image 3 and ordinary image 3 is successful, it indicates that the initialization is completed at the third frame.
Furthermore, based on the successful initialization mentioned above, the third frame fisheye image 3 can be used to perform inter-frame feature tracking on the fourth frame fisheye image 4 to obtain fisheye feature points 4 of the fisheye image 4, and the third frame ordinary image 3 can be used to perform inter-frame feature tracking on the fourth frame ordinary image 4 to obtain ordinary feature points 4 of the ordinary image 4, and then the fisheye feature points 4, fisheye feature points 3, ordinary feature points 4 and ordinary feature points 3 are subjected to pose estimation processing to obtain the pose information of the camera system in the fourth frame.
Next, feature point post-processing is performed on the ordinary image 4 and the fisheye image 4 to obtain ordinary feature points 41 and fisheye feature points 41 respectively. Then, inter-frame feature point tracking after the successful initialization is performed again; that is, inter-frame feature point tracking is performed on the fifth frame ordinary image 5 through the ordinary feature points 41, and inter-frame feature point tracking is performed on the fifth frame fisheye image 5 through the fisheye feature points 41, and then pose estimation processing is performed for the fifth frame to obtain the pose information of the camera system in the fifth frame. If both the ordinary video and the fisheye video include more frames, the other frames can be processed according to the above steps to obtain the pose information of the camera system in each frame.
It should be noted here that if both the fisheye video and the ordinary video include only one frame of image, it is sufficient to obtain only the initial pose information of the camera system, and there is no need to extract feature points from the first frame image.
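For illustration only, the per-frame flow of the above example (initialize on the first frame whose feature extraction succeeds, then track, estimate the pose and post-process each later frame) can be sketched as follows; the callables are hypothetical stand-ins for the corresponding steps of the method:

```python
def run_pose_pipeline(frames, extract, track, estimate_pose, post_process, initial_pose):
    """frames: iterable of (fisheye_image, ordinary_image) pairs covering the same time period."""
    poses, prev_feats = [], None
    for fisheye_img, ordinary_img in frames:
        if prev_feats is None:                               # still initializing
            feats = extract(fisheye_img, ordinary_img)
            if feats is not None:                            # extraction succeeded: initialization done
                poses.append(initial_pose)
                prev_feats = feats
            continue
        feats = track(prev_feats, fisheye_img, ordinary_img)              # inter-frame feature tracking
        poses.append(estimate_pose(prev_feats, feats))                    # pose of the camera system in this frame
        prev_feats = post_process(feats, fisheye_img, ordinary_img)       # feature point supplementation
    return poses
```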
The technical solution in one embodiment of the present application can respectively construct the non-depth residual functions of different non-depth feature points, thereby providing basic information for constructing different non-depth information observation models. At the same time, the method does not require complex algorithms, and the process is relatively simple, thereby improving the construction speed of the non-depth residual functions. In addition, the method can obtain the position coordinates corresponding to the non-depth feature points, and on this basis, the finally determined pose of the camera system still shows high accuracy and good robustness when facing scenes such as weak textures and rapid camera motion.
In one embodiment, the present application also provides a method for determining the pose of a camera system, the method comprising the following process:
Alternatively, if the number of feature points tracked on the previous frame fisheye image is less than the preset number of fisheye features, the number of compensating feature points of the previous frame fisheye image is determined based on the number of feature points tracked on the previous frame fisheye image and the preset number of fisheye features; based on the number of compensating feature points of the previous frame fisheye image, feature point compensation extraction is performed on the previous frame fisheye image to obtain the previous frame fisheye feature points of the previous frame fisheye image.
Alternatively, if the number of candidate ordinary feature points of the previous frame ordinary image is less than the preset number of ordinary features, the number of compensating feature points of the previous frame ordinary image is determined based on the total number of candidate ordinary feature points of the previous frame ordinary image and the preset number of ordinary features; based on the number of compensating feature points of the previous frame ordinary image, feature point compensation extraction is performed on the previous frame ordinary image; based on the feature point compensation extraction result of the previous frame ordinary image and the candidate ordinary feature points, the previous frame ordinary feature points of the previous frame ordinary image are determined.
It should be understood that, although the steps in the flowcharts involved in the above embodiments are displayed in sequence according to the indication of the arrows, these steps are not necessarily executed in sequence according to the order indicated by the arrows. Unless there is a clear explanation in this disclosure, the execution of these steps is not strictly limited in order, and these steps can be executed in other orders. Moreover, at least a part of the steps in the flowcharts involved in the above embodiments may include multiple steps or multiple stages, and these steps or stages are not necessarily executed at the same time, but can be executed at different times, and the execution order of these steps or stages is not necessarily carried out in sequence, but can be executed in turn or alternately with other steps or at least a part of the steps or stages in other steps.
Based on the same inventive concept, one embodiment of the present application also provides a camera system pose determination device for implementing the above-mentioned camera system pose determination method. The implementation solution provided by the device to solve the problem is similar to the implementation solution recorded in the above-mentioned method; therefore, for the specific limitations in the one or more embodiments of the camera system pose determination device provided below, reference can be made to the limitations of the camera system pose determination method above, which will not be repeated here.
In one embodiment,
The device for determining the pose of a camera system provided in the embodiment of the present application can be used to execute the technical solution in the embodiment of the method for determining the pose of a camera system described above in the present application. The implementation principle and technical effects are similar and will not be repeated here.
In one embodiment, the feature point tracking module 12 includes: a first acquisition unit or structure and a first feature point tracking unit or structure, wherein:
The device for determining the pose of a camera system provided in the embodiment of the present application can be used to execute the technical solution in the embodiment of the method for determining the pose of a camera system described above in the present application. The implementation principle and technical effects are similar and will not be repeated here.
In one embodiment, the camera system includes at least one fisheye camera. If the historical frame fisheye image includes a previous frame fisheye image of the current frame fisheye image, and the previous frame fisheye image is the initial frame fisheye image, the first acquisition unit includes: a first acquisition subunit, a first extraction subunit and a second extraction subunit, wherein:
The device for determining the pose of a camera system provided in one embodiment of the present application can be configured to execute the technical solution in the embodiment of the method for determining the pose of a camera system described above in the present application. The implementation principle and technical effects are similar and will not be repeated here.
In one embodiment, the second extraction subunit is specifically used for:
The device for determining the pose of a camera system provided in the embodiment of the present application can be used to execute the technical solution in the embodiment of the method for determining the pose of a camera system described above in the present application. The implementation principle and technical effects are similar and will not be repeated here.
In one embodiment, the second extraction subunit is further configured to:
The device for determining the pose of a camera system provided in the embodiment of the present application can be used to execute the technical solution in the embodiment of the method for determining the pose of a camera system described above in the present application. The implementation principle and technical effects are similar and will not be repeated here.
In one embodiment, the feature point tracking module 12 further includes: a second acquisition unit, a second feature point tracking unit and a first determination unit, wherein:
The device for determining the pose of a camera system provided in the embodiment of the present application can be used to execute the technical solution in the embodiment of the method for determining the pose of a camera system described above in the present application. The implementation principle and technical effects are similar and will not be repeated here.
In one embodiment, the camera system includes at least one fisheye camera, if the historical frame fisheye image includes a previous frame fisheye image of the current frame fisheye image, the historical frame ordinary image includes a previous frame ordinary image of the current frame ordinary image, and the previous frame fisheye image is an initial frame fisheye image, and the previous frame ordinary image is an initial frame ordinary image, then the second acquisition unit includes: a second acquisition subunit, a first tracking subunit and a first determination subunit, wherein:
The device for determining the pose of a camera system provided in the embodiment of the present application can be used to execute the technical solution in the embodiment of the method for determining the pose of a camera system described above in the present application. The implementation principle and technical effects are similar and will not be repeated here.
In one embodiment, the second acquisition unit further includes: a second determination subunit, a feature point extraction subunit and a third determination subunit, wherein:
The device for determining the pose of a camera system provided in the embodiment of the present application can be used to execute the technical solution in the embodiment of the method for determining the pose of a camera system described above in the present application. The implementation principle and technical effects are similar and will not be repeated here.
In one embodiment, the feature point tracking module 12 further includes: a third feature point tracking unit, wherein:
The device for determining the pose of a camera system provided in the embodiment of the present application can be used to execute the technical solution in the embodiment of the method for determining the pose of a camera system described above in the present application. The implementation principle and technical effects are similar and will not be repeated here.
In one embodiment, the determination module 13 includes: a feature point acquisition unit, an observation model construction unit, an optimization function construction unit and a pose information determination unit, wherein:
The device for determining the pose of a camera system provided in the embodiment of the present application can be used to execute the technical solution in the embodiment of the method for determining the pose of a camera system described above in the present application. The implementation principle and technical effects are similar and will not be repeated here.
In one embodiment, the current pose observation model includes a depth information observation model and a non-depth information observation model; the observation model construction unit includes: a first construction subunit and a second construction subunit, wherein:
The device for determining the pose of a camera system provided in the embodiment of the present application can be used to execute the technical solution in the embodiment of the method for determining the pose of a camera system described above in the present application. The implementation principle and technical effects are similar and will not be repeated here.
In one embodiment, the first construction subunit is specifically used for:
The depth information observation model is determined according to the depth residual function of each depth feature point.
The device for determining the pose of a camera system provided in the embodiment of the present application can be used to execute the technical solution in the embodiment of the method for determining the pose of a camera system described above in the present application. The implementation principle and technical effects are similar and will not be repeated here.
In one embodiment, the second construction subunit includes: a non-depth feature point acquisition subunit, a residual function construction subunit and an observation model determination subunit, wherein:
The device for determining the pose of a camera system provided in the embodiment of the present application can be used to execute the technical solution in the embodiment of the method for determining the pose of a camera system described above in the present application. The implementation principle and technical effects are similar and will not be repeated here.
In one embodiment, the residual function construction subunit is specifically used for:
The device for determining the pose of a camera system provided in the embodiment of the present application can be used to execute the technical solution in the embodiment of the method for determining the pose of a camera system described above in the present application. The implementation principle and technical effects are similar and will not be repeated here.
In one embodiment, the residual function construction subunit is further used for:
The device for determining the pose of a camera system provided in the embodiment of the present application can be used to execute the technical solution in the embodiment of the method for determining the pose of a camera system described above in the present application. The implementation principle and technical effects are similar and will not be repeated here.
In one embodiment, the residual function construction subunit is further used for:
For the specific definition of the pose determination device of the camera system, please refer to the definition of the pose determination method of the camera system above, which will not be repeated here. Each module in the pose determination device of the above-mentioned camera system can be implemented in whole or in part by software, hardware and a combination thereof. The above-mentioned modules can be embedded in or independent of the processor in the computer device in the form of hardware, or can be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
In one embodiment, a movable platform is further provided. Referring to
When the at least one processor executes the computer program, the steps of the method in any of the above embodiments are implemented.
The movable platform provided in one embodiment of the present application can be used to execute the technical solution in the embodiment of the method for determining the pose of the camera system mentioned above in the present application. Its implementation principle and technical effect are similar and will not be repeated here.
In one embodiment, a computer device is also provided, and the internal structural diagram of the computer device can be shown in
Those skilled in the art will understand that the structure shown in
In one embodiment, a computer device is provided, including at least one memory and at least one processor. The at least one memory stores a computer program, and the at least one processor implements the steps of the method in any of the above embodiments when executing the computer program.
In one embodiment, a computer-readable storage medium is further provided, on which a computer program is stored. When the computer program is executed by a processor, the steps of the method in any of the above embodiments are implemented.
In one embodiment, a computer program product is also provided, including a computer program, which implements the steps of the method in any of the above embodiments when executed by a processor.
In one embodiment, the pose information of the camera system can be used for anti-shake. The pose information of the camera system can be used to identify image jitter and correct the image jitter to achieve anti-shake processing. For example, the position of the image frame can be adjusted based on the pose information of the camera system to reduce or eliminate the impact of jitter to obtain a stable image or video output.
In one embodiment, the pose information of the camera system can be used for navigation. In a navigation system, the pose information of the camera system helps the device (such as a drone) understand its orientation in the environment to achieve navigation. For example, based on the pose information of the camera system, the current position of the device relative to a pre-built map can be calculated in real time, thereby generating an accurate path and achieving navigation.
In one embodiment, based on the pose information of the camera system carried by the drone, the drone can be assisted in completing different aspects of control such as perception (such as tracking, obstacle avoidance, etc.), flight control, and flight planning.
Those of ordinary skill in the art can understand that all or part of the processes in the above-mentioned embodiment methods can be completed by instructing the relevant hardware through a computer program, and the computer program can be stored in a non-volatile computer-readable storage medium. When the computer program is executed, it can include the processes of the embodiments of the above-mentioned methods. Among them, any reference to memory, storage, database or other media used in the embodiments provided in this application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), tape, floppy disk, flash memory or optical memory, etc. Volatile memory can include random access memory (RAM) or external cache memory. As an illustration and not limitation, RAM can be in various forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the above embodiments may be combined arbitrarily. To make the description concise, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.
The above-mentioned embodiments only express several implementation methods of the present application, and the descriptions thereof are relatively specific and detailed, but they cannot be understood as limiting the scope of the invention patent. It should be pointed out that, for a person of ordinary skill in the art, several variations and improvements can be made without departing from the concept of the present application, and these all belong to the protection scope of the present application. Therefore, the protection scope of the patent of the present application shall be subject to the attached claims.