The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
The present invention will become more fully understood from the detailed description given herein below, which is given by way of illustration only and thus is not limitative of the present invention, and wherein:
Referring to
The detailed method flow is illustrated with reference to
First, when each image frame in the image sequence 10 is input according to its time point, the time series analysis 110 stage executes the processes described in
(1) The state initializing process 111 mainly includes two situations. In the first situation, when the first image frame is input, i.e., at the time t=0, the initializing process directly generates a feature point group {Y0} of the first image frame (Step 200), and generates a three-dimensional state collection {X0} corresponding to the feature point group (Step 210) as the feature point information thereof. As shown by the triangular pattern in
{X0} is obtained by initializing a three-dimensional state for each feature point y0 of {Y0} through the Kalman Filter time series analysis model. The three-dimensional state x0 includes the horizontal position, vertical position, and depth position of the feature point in the three-dimensional space. The depth position can be generated by first computing {Y0} and {Y1} in a three-dimensional vision manner and then projecting through a camera model, wherein {Y1} is obtained by finding, in the second image frame, the points corresponding to {Y0} in the first image frame.
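For illustration only, the following is a minimal sketch of how the depth of a feature point might be initialized from a pair of corresponding image points, assuming a rectified pinhole camera model with image coordinates measured from the principal point, a focal length f, and a horizontal translation b between the two frames; the function name, the parameter values, and the rectified-camera assumption are hypothetical and are not part of the disclosed embodiment, which only requires some three-dimensional vision computation and a camera model.

```python
import numpy as np

def initialize_state(y0, y1, f=700.0, b=0.1):
    """Sketch: initialize a three-dimensional state x0 = (horizontal, vertical, depth)
    from a feature point y0 in the first frame and its corresponding point y1 in the
    second frame, assuming a rectified pinhole camera (hypothetical f and b)."""
    u0, v0 = y0
    u1, _ = y1
    disparity = u0 - u1
    depth = f * b / disparity if abs(disparity) > 1e-6 else np.inf
    # Back-project the image point through the camera model to obtain the
    # horizontal and vertical positions in space at that depth.
    horizontal = u0 * depth / f
    vertical = v0 * depth / f
    return np.array([horizontal, vertical, depth])

# Example: build {X0} from corresponding point groups {Y0} and {Y1}.
Y0 = [(320.5, 240.2), (100.0, 80.0)]
Y1 = [(310.1, 240.0), (95.5, 80.3)]
X0 = [initialize_state(y0, y1) for y0, y1 in zip(Y0, Y1)]
```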
In the other situation of the initializing process, when the input image frame is not the first image frame, i.e., the image frame input at the time t, the reliable feature point information 50 (i.e., the feature point information including the feature point group and the three-dimensional state collection) retained after processing the former image frame has already been updated into the feature point information of the former image frame, so the newly input image frame takes the updated feature point information as the initialization result (Step 260). This situation is shown as the triangular pattern in
(2) After the state initializing process is finished, the feature point information of the image frame being processed is transferred into the system modeling 112 for predicting the state, in which the feature point information ({Yt+1}, {Xt+1}) at the next time point is predicted from the initialized feature point group {Yt} and three-dimensional state collection {Xt} through the established analysis model (Step 220). As described above, the analysis model used in the embodiment of the present invention is the Kalman Filter time series analysis model, wherein Yt and Xt are described by the following expressions:
Xt+1 = FtXt + Ut + Qt;

Yt = HtXt + Rt;
Ft simulates the linear variation of the state Xt over time, Ut is a known translation amount applied to the state Xt, Ht simulates the relationship between Xt and Yt, and Qt and Rt simulate the interference of noise, wherein Qt can also be represented as Qt˜N(0, qt) and Rt as Rt˜N(0, rt). Therefore, the prediction value of each Xt+1 is represented as
wherein
As shown in
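Purely as an illustrative aid, the state prediction of Step 220 may be sketched as follows, using the notation Ft, Ut, and qt defined above; the propagation of a state covariance P is a textbook Kalman Filter assumption and is not quoted from the expressions of this embodiment, and the numeric values are hypothetical.

```python
import numpy as np

def predict_state(x_t, P_t, F_t, u_t, q_t):
    """Predict the next three-dimensional state and its covariance,
    following Xt+1 = FtXt + Ut + Qt with Qt ~ N(0, qt)."""
    x_pred = F_t @ x_t + u_t            # predicted state value
    P_pred = F_t @ P_t @ F_t.T + q_t    # predicted state covariance (textbook form)
    return x_pred, P_pred

# Example with a 3-dimensional state (horizontal, vertical, depth).
F = np.eye(3)                        # linear state variation (identity here)
u = np.array([0.01, 0.0, 0.02])      # known translation amount Ut (hypothetical)
q = 1e-3 * np.eye(3)                 # process noise covariance qt (hypothetical)
x, P = np.array([1.0, 0.5, 4.0]), 0.1 * np.eye(3)
x_hat, P_hat = predict_state(x, P, F, u, q)
```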
(3) After the state is predicted, the three-dimensional state collection {Xt+1} in the prediction result must be properly corrected (Step 230). This is mainly achieved by the correcting model of the Kalman Filter time series analysis model, which is represented by the following expression.
Xt+1 ˜ N(X̂t+1, Pt+1);
wherein
In the Kalman Filter time series analysis model, the error E and gain K are respectively defined as Et+1 = (Yt+1 − Ht+1X̂t+1) and
As shown in
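For illustration, the correction of Step 230 can be sketched with the textbook Kalman Filter correction formulas; the expressions for the gain K and the corrected covariance Pt+1 below are standard assumptions rather than quotations from the embodiment, and the observation model Ht+1 and noise covariance rt+1 values are hypothetical.

```python
import numpy as np

def correct_state(x_pred, P_pred, y_obs, H, r):
    """Correct the predicted state using the observation Yt+1,
    with error E and gain K in the textbook Kalman form."""
    E = y_obs - H @ x_pred                           # error Et+1 = (Yt+1 - Ht+1 * predicted X)
    S = H @ P_pred @ H.T + r                         # innovation covariance (assumed form)
    K = P_pred @ H.T @ np.linalg.inv(S)              # Kalman gain Kt+1 (assumed form)
    x_corr = x_pred + K @ E                          # corrected state
    P_corr = (np.eye(len(x_pred)) - K @ H) @ P_pred  # corrected covariance Pt+1
    return x_corr, P_corr

# Example: a 2D image observation of a 3D state.
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])       # observation model Ht+1 (hypothetical)
r = 1e-2 * np.eye(2)                  # observation noise covariance rt+1 (hypothetical)
x_hat, P_hat = np.array([1.0, 0.5, 4.0]), 0.1 * np.eye(3)
y = np.array([1.05, 0.48])
x_new, P_new = correct_state(x_hat, P_hat, y, H, r)
```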
The above correcting model can be properly adjusted according to the employed analysis model, and can further be properly adjusted according to the motion mode of the input image sequence 10 in the three-dimensional space. The related adjusting method differs according to the different analysis models, and the present invention retains design flexibility for this adjustment.
(4) The corrected feature point information ({Yt+1}, {Xt+1}) is transferred from the time series analysis 110 stage to the feature evolving 120 stage. In this stage, the reliable feature point information ({Ỹt+1}, {X̃t+1}) to be retained is screened by the evolving operation on the feature points (Step 240), which mainly includes the two parts of screening given below in detail.
The first part is generating a new feature point, which includes the following steps.
(a) New feature points are found according to the method of corresponding the feature points between {Yt} and {Yt+1}, and are added into {Ỹt+1}.
(b) A weight is set for initializing the three-dimensional state collection of the collection {X̃t+1}, wherein the weight is determined by the state values of the neighboring feature points. The weight value can be represented by the existing time of the neighboring feature points in the whole image sequence 10, or by the distance from the neighboring feature points.
The definition of the neighboring feature points is represented by the following expression:
X′t+1 = {xt+1 ∈ Xt+1 | ∥yt+1 − ỹt+1∥ < η};
And the expression of the weight is:
Therefore, the three-dimensional state of each xt+1 after being initialized can be further represented by the following expression:
After the process of generating the new feature point, the feature point information is shown by the hollow circular pattern connected by dashed line in
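As a sketch of step (b) above, the three-dimensional state of a newly generated feature point can be initialized as a weighted combination of the states of the neighboring feature points, the neighborhood being the feature points whose image distance is smaller than η; the inverse-distance weights used below are a hypothetical choice, and weights based on the existing time of the neighbors in the image sequence 10 would follow the same pattern.

```python
import numpy as np

def init_new_state(y_new, Y_existing, X_existing, eta=15.0):
    """Initialize the 3D state of a new feature point y_new from the states of
    neighboring existing feature points (image distance smaller than eta).
    Inverse-distance weights are a hypothetical choice."""
    y_new = np.asarray(y_new)
    weights, states = [], []
    for y, x in zip(Y_existing, X_existing):
        d = np.linalg.norm(np.asarray(y) - y_new)
        if d < eta:                       # neighboring feature point (distance < eta)
            weights.append(1.0 / (d + 1e-6))
            states.append(np.asarray(x))
    if not states:
        return None                       # no neighbor found; handle separately
    w = np.array(weights) / np.sum(weights)
    return np.sum(w[:, None] * np.array(states), axis=0)

# Example: one new point and two existing points with known 3D states.
Y = [(100.0, 80.0), (104.0, 78.0)]
X = [np.array([1.0, 0.8, 5.0]), np.array([1.1, 0.78, 5.2])]
x_new = init_new_state((102.0, 79.0), Y, X)
```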
The second part is deleting the feature point, which includes the following steps.
(c) The feature point generating an error larger than the threshold during the feature matching process is deleted from the existing feature point collection {Yt+1}. Errors may be generated in the feature matching process when the feature points are found by the feature matching method, thus the feature points with large errors must be deleted.
(d) The feature point generating an error larger than the threshold during the feature corresponding is deleted from the newly generated feature point collection {Ỹt+1}. This part is the same as the former process, and is used to delete the feature points generated by feature corresponding errors.
Steps (c) and (d) mainly define Pt(yt) as a rectangular region taking the feature point yt as the center at the time t, and thus at the time t+1, the feature matching error Et+1(yt, yt+1) of each feature point yt+1 in the existing feature point collection {Yt+1} is defined as the following expression:
Et+1(yt, yt+1) = ∥Pt(yt) − Pt+1(yt+1)∥;
and the feature corresponding error Et+1(ỹt+1, ỹt) of each feature point ỹt+1 in the newly generated feature point collection {Ỹt+1} is defined as the following expression:
Et+1(ỹt+1, ỹt) = ∥Pt+1(ỹt+1) − Pt(ỹt)∥;
and when Et+1(yt, yt+1) and Et+1(ỹt+1, ỹt) are respectively larger than the preset threshold, the feature points yt+1 and ỹt+1 are deleted.
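The screening of steps (c) and (d) can be illustrated by the following sketch, which compares the rectangular region centered on a feature point at the time t with the region centered on its corresponding point at the time t+1 and keeps only the points whose patch difference does not exceed a preset threshold; the patch half-size and threshold values are hypothetical.

```python
import numpy as np

def patch(image, y, half=4):
    """Return the rectangular region P_t(y) of the image centered on the point y = (u, v)."""
    r, c = int(round(y[1])), int(round(y[0]))
    return image[r - half:r + half + 1, c - half:c + half + 1].astype(float)

def matching_error(img_t, y_t, img_t1, y_t1, half=4):
    """Feature matching error Et+1(yt, yt+1) = ||Pt(yt) - Pt+1(yt+1)||."""
    return np.linalg.norm(patch(img_t, y_t, half) - patch(img_t1, y_t1, half))

def screen_points(img_t, img_t1, pairs, threshold=50.0):
    """Keep only the point pairs whose matching error does not exceed the threshold."""
    return [(y_t, y_t1) for y_t, y_t1 in pairs
            if matching_error(img_t, y_t, img_t1, y_t1) <= threshold]

# Example with random images and one candidate correspondence (hypothetical data).
rng = np.random.default_rng(0)
img_a = rng.integers(0, 255, (120, 160), dtype=np.uint8)
img_b = rng.integers(0, 255, (120, 160), dtype=np.uint8)
kept = screen_points(img_a, img_b, [((50.0, 60.0), (52.0, 60.0))])
```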
(e) The feature point whose error, calculated by the system model analysis during the prediction of {Xt+1}, is larger than the threshold is to be deleted. This part is mainly directed to deleting the feature point with a large error when the three-dimensional state is transferred through the Kalman Filter time series analysis model.
The error is defined as the difference between the state X̂t+1 (obtained by correcting each xt+1 in the collection {Xt+1} through the Kalman Filter time series analysis model) multiplied by Ht+1 and each yt in the feature point collection {Yt}, which can be represented as
After the feature point is deleted, the deleted feature point information is represented by the part marked with "X" connected by dashed lines, and the surviving feature points include the feature point group {Ỹt+1} 330 after the deletion at the time t+1 and the three-dimensional state collection {X̃t+1} 430 after the deletion at the time t+1.
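Step (e) may be sketched as follows: each corrected state X̂t+1 is multiplied by Ht+1 and compared against the corresponding feature point position, and the feature point is deleted when the resulting error exceeds a threshold; the observation model and threshold value below are hypothetical.

```python
import numpy as np

def model_error(x_hat, y, H):
    """Error between Ht+1 multiplied by the corrected state and the feature point y."""
    return np.linalg.norm(H @ x_hat - np.asarray(y))

def delete_by_model_error(X_hat, Y, H, threshold=2.0):
    """Keep only the (state, point) pairs whose system-model error is within the threshold."""
    return [(x, y) for x, y in zip(X_hat, Y) if model_error(x, y, H) <= threshold]

# Example: an observation model that drops the depth component (hypothetical).
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
X_hat = [np.array([1.0, 0.5, 4.0]), np.array([3.0, 2.0, 6.0])]
Y = [(1.02, 0.51), (0.5, 0.4)]        # the second point disagrees strongly with its state
kept = delete_by_model_error(X_hat, Y, H)
```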
After the Step 240, the generated feature point information ({Ỹt+1}, {X̃t+1}) is the so-called reliable feature point information, and the feature point groups in the feature point information all have a strong corresponding relationship. At this point, whether other images in the image sequence 10 still need to be processed is determined (Step 250). If yes, the process must return to the Step 220 and be executed in a recursive manner; however, before that, the Step 260 of transferring the newly obtained reliable feature point information back to the time series analysis 110 stage to be updated must be performed. That is, during the state updating process 113, {Yt+1} = {Yt+1} + {Ỹt+1} and {Xt+1} = {Xt+1} + {X̃t+1} are taken as the state ({Yt+1}, {Xt+1}) of the reliable feature point information 50 when the next image frame in the image sequence 10 is processed.
Now the reliable feature point information 50 ({Yt+1},{Xt+1}) is the triangular pattern connected by dashed lines as in
On the contrary, if all the image frames in the image sequence 10 have been processed, the finally remaining reliable feature point information 50 (i.e., the so-called three-dimensional feature point information) is output to the state updating process 113 (Step 270), and {Yt+1} = {Yt+1} + {Ỹt+1} and {Xt+1} = {Xt+1} + {X̃t+1} are taken as the final three-dimensional scene information of the whole image sequence 10.
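For readability only, the overall flow of Steps 200 through 270 described above can be summarized by the following schematic loop; the helper functions named here are placeholders for the processes of the embodiment rather than a definitive implementation.

```python
def reconstruct_scene(frames, initialize, predict, correct, evolve, update):
    """Schematic flow of the embodiment, with placeholder helper functions:
    initialize on the first frame, then recursively predict, correct, and
    evolve the feature point information frame by frame."""
    info = initialize(frames[0])                      # Steps 200 and 210 (t = 0)
    for next_frame in frames[1:]:                     # Step 250 decides whether to continue
        predicted = predict(info)                     # Step 220: system modeling 112
        corrected = correct(predicted, next_frame)    # Step 230: correcting model
        reliable = evolve(corrected, next_frame)      # Step 240: feature evolving 120
        info = update(info, reliable)                 # Step 260: state updating 113
    return info                                       # Step 270: final 3D scene information
```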
As mentioned before, the input object is a group of space points having a three-dimensional track or motion on the time axis. Since the space point itself already has the content of the three-dimensional feature point information, when the above-mentioned state initializing process 111 executes the first situation of the initializing process, the process of generating the three-dimensional state collection (i.e., the part of Step 210) executed when the first space point is input can be omitted.
The reconstruction result of the three-dimensional scene will become more accurate by means of the operation of the present invention, as demonstrated in the comparison between
With the above description of the invention, it will be obvious that the same may be varied in many ways. Such variations are not to be regarded as a departure from the principle and scope of the invention, and all such modifications as would be obvious to one skilled in the art are intended to be included within the scope of the following claims.
Number | Date | Country | Kind |
---|---|---|---
095136372 | Sep 2006 | TW | national |