Hereinafter, preferred embodiments of the invention will be described with reference to the appended drawings.
E(x,y,t)=E(x+Δx,y+Δy,t+Δt) (1)
the spatial gradient of the brightness is defined by the following Equation (5); and
the temporal gradient of the brightness is defined by the following Equation (6).
E_x·u + E_y·v + E_t = 0 (7)
Equation (7) is a brightness gradient constraint equation. Parameters u and v that satisfy Equation (7) define the optical flow, i.e., the motion vector to be obtained.
As shown in
E_x = (E(i+1,j,t) − E(i,j,t)) + (E(i+1,j,t+Δt) − E(i,j,t+Δt)) + (E(i+1,j+1,t) − E(i,j+1,t)) + (E(i+1,j+1,t+Δt) − E(i,j+1,t+Δt)) (8)
E_y = (E(i,j+1,t) − E(i,j,t)) + (E(i,j+1,t+Δt) − E(i,j,t+Δt)) + (E(i+1,j+1,t) − E(i+1,j,t)) + (E(i+1,j+1,t+Δt) − E(i+1,j,t+Δt)) (9)
E_t = (E(i,j,t+Δt) − E(i,j,t)) + (E(i,j+1,t+Δt) − E(i,j+1,t)) + (E(i+1,j,t+Δt) − E(i+1,j,t)) + (E(i+1,j+1,t+Δt) − E(i+1,j+1,t)) (10)
In the present invention, in order to accurately estimate the time to contact, the positional relationship between the object (a planar surface 1 in the figures) and the image sensor (a camera 2 in the figures) is classified into four cases shown in
Note that, in the following description, the term “the degree of freedom of the camera 2” is defined by whether or not the camera 2 moves along the optical axis 2a thereof: the degree of freedom of the camera 2 is deemed to be constrained when the camera 2 moves along the optical axis 2a, and is deemed not to be constrained when the camera 2 does not necessarily move along the optical axis 2a. Similarly, the term “the degree of freedom of the planar surface 1” is defined by whether or not the planar surface 1 is oriented perpendicular to the optical axis 2a: the degree of freedom of the planar surface 1 is deemed to be constrained when the planar surface 1 is oriented perpendicular to the optical axis 2a, and is deemed not to be constrained when the planar surface 1 is not necessarily oriented perpendicular to the optical axis 2a.
In Case 1, as shown in
u=x(W/Z), v=y(W/Z) C1-(1)
and the time to contact TTC is expressed as follows,
TTC=Z/W C1-(2)
where W is the speed of the camera 2 along the Z axis shown in
By referring to Equation (7) above, the brightness gradient constraint equation shown below is satisfied,
rE_r·C + E_t = 0 C1-(3)
where C(=W/Z) is the inverse of the TTC.
In addition, the following Equations C1-(4) are satisfied, where r is the radial distance from the principal point.
rE_r = xE_x + yE_y, r = √(x² + y²) C1-(4)
The unknown parameter C can be found based on Equation C1-(3) using the least squares method. By squaring both sides of Equation C1-(3) and taking the sum, the following summed-up term C1-(5) is obtained.
Σ(rE_r·C + E_t)² C1-(5)
Summed-up term C1-(5) is over all pixels of a region of interest (ROI), or possibly over the whole image. To find the best fit value of C, summed-up term C1-(5) is differentiated with respect to C, and the result is set to zero.
2Σ(rE_r·C + E_t)·rE_r = 0 C1-(6)
C·Σ(rE_r)² = −Σ(rE_r·E_t) C1-(7)
C = −Σ(rE_r·E_t) / Σ(rE_r)² C1-(8)
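For concreteness, the closed-form solution C1-(8) can be sketched in a few lines of Python/NumPy. This is an illustrative sketch only, not part of the claimed method: the array names (Ex, Ey, Et for the brightness derivatives; x and y for pixel coordinates measured from the principal point) are assumptions, and the derivatives are presumed already computed, for example by Equations (8) to (10).

```python
import numpy as np

def ttc_case1(Ex, Ey, Et, x, y):
    """Case 1: translation along the optical axis toward a perpendicular
    planar surface. Returns the time to contact in frame intervals."""
    G = x * Ex + y * Ey                  # rE_r = xE_x + yE_y, Equation C1-(4)
    C = -np.sum(G * Et) / np.sum(G * G)  # Equation C1-(8)
    return 1.0 / C                       # TTC = Z/W = 1/C, Equation C1-(2)
```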
With reference to
The computation of the spatial and temporal derivatives preferentially uses a 2×2×2 cube of pixel brightness values (Horn & Schunck 1981). However, other methods of estimating derivatives can be used also, such as central differences, including methods that depend on more than two frames of the image sequence (although this may increase latency).
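As an illustration, the cube estimator of Equations (8) to (10) might be sketched as follows, assuming two consecutive frames held as 2-D arrays E0 = E(·,·,t) and E1 = E(·,·,t+Δt); the names and the index convention (first index i, second index j) are assumptions, and the 1/4 averaging factor of Horn & Schunck is omitted since it cancels in the ratio C1-(8).

```python
import numpy as np

def cube_derivatives(E0, E1):
    """Estimate Ex, Ey, Et per Equations (8)-(10) at the 2x2x2 cube centers;
    the outputs are one row and one column smaller than a frame."""
    def di(E):   # differences along i, summed over columns j and j+1
        return (E[1:, :-1] - E[:-1, :-1]) + (E[1:, 1:] - E[:-1, 1:])
    def dj(E):   # differences along j, summed over rows i and i+1
        return (E[:-1, 1:] - E[:-1, :-1]) + (E[1:, 1:] - E[1:, :-1])
    Ex = di(E0) + di(E1)                                       # Equation (8)
    Ey = dj(E0) + dj(E1)                                       # Equation (9)
    dE = E1 - E0                                               # temporal differences
    Et = dE[:-1, :-1] + dE[:-1, 1:] + dE[1:, :-1] + dE[1:, 1:] # Equation (10)
    return Ex, Ey, Et
```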
No significant amount of storage for intermediate results is needed, since the derivatives computed in step (1) are immediately used in step (2), and the terms computed in step (2) can be added into the accumulators in the immediately following step and so need not be retained.
Contributions to the accumulated totals can be weighted, so that, for example, more trustworthy information contributes more heavily to the final result. If this is done, the same weight factor should be applied to contributions to both accumulators. Information from an area of the image that has no significant texture, for example, may not provide a useful contribution to the accumulated totals. Also, areas in which specular highlights are observed may be suppressed in order to reduce error introduced by specular reflections.
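One possible weighting scheme is sketched below, under the assumption that local gradient energy is a reasonable proxy for texture (an illustrative choice, not one prescribed here); the same per-pixel weight multiplies the contributions to both accumulators, and an optional mask suppresses specular highlights.

```python
import numpy as np

def ttc_case1_weighted(Ex, Ey, Et, x, y, highlight_mask=None):
    G = x * Ex + y * Ey
    w = Ex * Ex + Ey * Ey                        # textureless areas get near-zero weight
    if highlight_mask is not None:
        w = np.where(highlight_mask, 0.0, w)     # suppress specular regions
    C = -np.sum(w * G * Et) / np.sum(w * G * G)  # same weight in both accumulators
    return 1.0 / C
```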
It should be clear that the computation is very straightforward, with a number of arithmetic steps proportional to the number of pixels in the region of interest; no search or iteration is involved, and there is no need to solve complex equations.
Further, no detection of image “features” such as edges or grey-level corners is required. This makes it possible to implement the method on a small microprocessor (such as an Analog Devices Blackfin microprocessor) or even in special-purpose circuitry such as an FPGA, PLD, or ASIC. In fact, the circuitry for performing the computation could even be integrated with the imaging sensor on the same chip. The same observations apply to the other cases studied below.
In Case 2, as shown in
u = −f·U/Z + x·W/Z, v = −f·V/Z + y·W/Z C2-(1)
−A·E_x − B·E_y + rE_r·C + E_t = 0 C2-(2)
G = rE_r = xE_x + yE_y, r = √(x² + y²) C2-(3)
Σ(−A·E_x − B·E_y + rE_r·C + E_t)² C2-(4)
As in Case 1, summed-up term C2-(4) is over all pixels of a region of interest (ROI), or possibly over the whole image. To find the best fit values of A, B, and C, summed-up term C2-(4) is differentiated with respect to A, B, and C, and the three results are set to zero.
Σ(−A·E_x − B·E_y + rE_r·C + E_t)·E_x = 0 C2-(5)
Σ(−A·E_x − B·E_y + rE_r·C + E_t)·E_y = 0 C2-(6)
Σ(−A·E_x − B·E_y + rE_r·C + E_t)·rE_r = 0 C2-(7)
These three linear equations containing the three unknown parameters A, B, and C can be solved easily using known methods such as Gaussian elimination.
With reference to
Note that the six distinct quantities in the symmetric 3×3 coefficient matrix, and the three quantities in the right hand side vector are all sums of products of various combinations of brightness derivatives Ex, Ey, and Et and image coordinates x and y.
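A sketch of the Case 2 computation follows, reusing the assumed arrays from the Case 1 sketch. Writing the residual of C2-(2) as a·E_x + b·E_y + c·rE_r + E_t with a = −A, b = −B, c = C makes the normal equations C2-(5) to C2-(7) a symmetric 3×3 system; its six distinct entries and the three right-hand-side sums are the nine accumulators mentioned above.

```python
import numpy as np

def ttc_case2(Ex, Ey, Et, x, y):
    G = x * Ex + y * Ey                                  # rE_r, Equation C2-(3)
    S = np.array([[np.sum(Ex*Ex), np.sum(Ex*Ey), np.sum(Ex*G)],
                  [np.sum(Ex*Ey), np.sum(Ey*Ey), np.sum(Ey*G)],
                  [np.sum(Ex*G),  np.sum(Ey*G),  np.sum(G*G)]])
    rhs = -np.array([np.sum(Ex*Et), np.sum(Ey*Et), np.sum(G*Et)])
    a, b, c = np.linalg.solve(S, rhs)                    # e.g. Gaussian elimination
    A, B, C = -a, -b, c
    return 1.0 / C, A, B     # TTC = 1/C; A = f*U/Z and B = f*V/Z if desired
```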
The time to contact is the inverse of C = (W/Z). The focal length f (principal distance) need not be known to compute the time to contact. As noted, Case 1 discussed above is a special case of Case 2, with A = B = 0. In that special case, only the third of the three equations above needs to be considered, and two of its terms drop out.
In addition to the time to contact, if desired, one can calculate the direction of translational motion by noting that
U/W=(A/C)/f and V/W=(B/C)/f.
The results can be computed using essentially the same sequence of eight steps as in Case 1 above, except that now 9 (6 plus 3) accumulators are needed instead of two, and nine terms per picture cell need to be computed instead of just two.
If desired, the stability of the solution can be analyzed by determining the eigenvalues and eigenvectors of the symmetric 3×3 coefficient matrix above. The result is well constrained if none of the eigenvalues are particularly small. Conversely, the accuracy can be expected to be low if at least one of the eigenvalues is small.
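A brief sketch of such a check follows, assuming the symmetric coefficient matrix S from the Case 2 sketch above; the relative tolerance is an illustrative tuning choice.

```python
import numpy as np

def well_constrained(S, rel_tol=1e-6):
    """True if no eigenvalue of the symmetric matrix S is tiny relative to
    the largest, i.e. the least-squares solution is well constrained."""
    lam = np.linalg.eigvalsh(S)          # eigenvalues in ascending order
    return lam[0] > rel_tol * lam[-1]
```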
In Case 3, as shown in
When p and q are defined to be the slopes of the planar surface 1 in the X and Y directions, measured in the imaging system coordinate system, the planar surface 1 is expressed by the following Equation C3-(1).
Z = Z_0 + pX + qY C3-(1)
Z·(1 − p·(x/f) − q·(y/f)) = Z_0 C3-(2)
rE_r·(1 − p·x/f − q·y/f)·(W/Z_0) + E_t = 0 C3-(3)
rE_r·(C − Px − Qy) + E_t = 0 C3-(4)
As in Cases 1 and 2, the unknown parameters P, Q, and C can be found using the least squares method, where the summed-up term is as follows.
Σ(rE_r·(C − Px − Qy) + E_t)² C3-(5)
As in Cases 1 and 2, summed-up term C3-(5) is over all pixels of a region of interest (ROI), or possibly over the whole image. To find the best fit values of P, Q, and C, summed-up term C3-(5) is differentiated with respect to P, Q, and C, and the three results are set to zero.
Σ(rE_r·(C − Px − Qy) + E_t)·rE_r·x = 0 C3-(6)
Σ(rE_r·(C − Px − Qy) + E_t)·rE_r·y = 0 C3-(7)
Σ(rE_r·(C − Px − Qy) + E_t)·rE_r = 0 C3-(8)
These three linear equations containing the three unknown parameters P, Q, and C can be solved easily using known methods such as Gaussian elimination.
With reference to
Note that the six distinct quantities in the symmetric 3×3 coefficient matrix, and the three quantities in the right hand side vector, are all sums of products of brightness derivatives E_x, E_y, and E_t and image coordinates x and y.
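Case 3 can be sketched just like Case 2, with basis functions x·rE_r, y·rE_r, and rE_r in place of E_x, E_y, and rE_r (same assumed arrays as before); the solution vector of the linear system is then (−P, −Q, C).

```python
import numpy as np

def ttc_case3(Ex, Ey, Et, x, y):
    G = x * Ex + y * Ey                                  # rE_r
    Gx, Gy = G * x, G * y
    S = np.array([[np.sum(Gx*Gx), np.sum(Gx*Gy), np.sum(Gx*G)],
                  [np.sum(Gx*Gy), np.sum(Gy*Gy), np.sum(Gy*G)],
                  [np.sum(Gx*G),  np.sum(Gy*G),  np.sum(G*G)]])
    rhs = -np.array([np.sum(Gx*Et), np.sum(Gy*Et), np.sum(G*Et)])
    mP, mQ, C = np.linalg.solve(S, rhs)                  # yields (-P, -Q, C)
    return 1.0 / C, -mP, -mQ                             # TTC, P, Q
```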
The time to contact is the inverse of C = (W/Z_0). The focal length f (principal distance) need not be known to compute the time to contact. It should be noted again that Case 1 discussed above is a special case of this one, with P = Q = 0. In this case, only the third of the three equations above need be considered, and two of its terms drop out.
If desired, one can also calculate the orientation of the planar surface by noting that
p=f(P/C) and q=f(Q/C).
The results can be computed using essentially the same sequence of eight steps as in Case 1 above, except that now 9 (6 plus 3) accumulators are needed instead of two, and nine terms per picture cell need to be computed instead of just two.
If desired, the stability of the solution can be analyzed by determining the eigenvalues and eigenvectors of the symmetric 3×3 coefficient matrix above. The result is well constrained if none of the eigenvalues are particularly small. Conversely, the accuracy can be expected to be low if at least one of the eigenvalues is small.
The general approach should be clear from the three cases worked out in detail above. Described below is an additional extension to a more general case.
In Case 4, as shown in
Combining the equations for the motion field from Case 2 and the equation for the planar surface from Case 3 in the brightness change constraint equation, the following Equation C4-(1) is obtained.
(A′E_x + B′E_y − rE_r)·(−Px − Qy + C) + E_t = 0 C4-(1)
where A′ = f·(U/W), B′ = f·(V/W), P = (p/f)·(W/Z_0), Q = (q/f)·(W/Z_0).
As before, C = (W/Z_0) is the inverse of the time to contact. This can also be written in the alternate form below.
(A·E_x + B·E_y − C·rE_r)·(−P′x − Q′y + 1) + E_t = 0 C4-(2)
where A = f·(U/Z_0), B = f·(V/Z_0), P′ = p/f, Q′ = q/f.
Note that Case 1 is just a special case of Case 4, with U = V = 0 and p = q = 0 (so that A = B = P = Q = 0). Correspondingly, Cases 2 and 3 are special cases of this general case, with p = q = 0 and U = V = 0, respectively.
The least-squares method is applied to find the five unknown parameters A′, B′, C, P, and Q, where the summed-up term is as follows.
Σ((A′E_x + B′E_y − rE_r)·(−Px − Qy + C) + E_t)² C4-(3)
Similarly, the least-squares method is applied to find the five unknown parameters A, B, C, P′, and Q′, where the summed-up term is as follows.
Σ((A·E_x + B·E_y − C·rE_r)·(−P′x − Q′y + 1) + E_t)² C4-(4)
Summed-up terms C4-(3) and C4-(4) are over all pixels of a region of interest (ROI), or possibly over the whole image. Note that A, B, P, Q are related to A′, B′, P′ and Q′ by the following equations.
A = A′C, B = B′C, P = P′C, Q = Q′C
To find the best fit values of the five unknown parameters, we can differentiate either of the two sums with respect to the five parameters and set the results equal to zero. This leads to five equations in five unknowns. All of the coefficients of these equations are sums of products of image brightness derivatives and image coordinates, as before. Unlike Cases 1 to 3 described above, however, the equations are no longer linear and so are not quite as easy to solve; two solutions are discussed below.
In Solution 1, an iteration process is applied to find the unknown parameters defining the time to contact (TTC). First, parameters P′ and Q′ are treated as constants; the equations derived from summed-up term C4-(4) are then linear in the remaining unknown parameters A, B, and C, and known linear solution methods can be applied to obtain their best fit values. Next, parameters A′ and B′ are treated as constants; the equations derived from summed-up term C4-(3) are then linear in the remaining unknown parameters P, Q, and C, and known linear solution methods can be applied to obtain their best fit values. By iterating these two calculations, an approximation of the TTC can be obtained.
The specific procedure is now explained. First, parameters P′ and Q′ are treated as constants. To find the best fit values of A, B, and C, summed-up term C4-(4) is differentiated with respect to A, B, and C, and the three results are set to zero.
A·ΣE_x²F² + B·ΣE_xE_yF² − C·Σ(rE_r)E_xF² = −ΣE_xE_tF C4-(5)
A·ΣE_xE_yF² + B·ΣE_y²F² − C·Σ(rE_r)E_yF² = −ΣE_yE_tF C4-(6)
A·Σ(rE_r)E_xF² + B·Σ(rE_r)E_yF² − C·Σ(rE_r)²F² = −Σ(rE_r)E_tF C4-(7)
where F = 1 − (p/f)·x − (q/f)·y = 1 − P′x − Q′y.
Next, parameters A′ and B′ are treated as constants. To find the best fit values of P, Q, and C, summed-up term C4-(3) is differentiated with respect to P, Q, and C, and the three results are set to zero.
P·ΣD²x² + Q·ΣD²xy − C·ΣD²x = ΣDxE_t C4-(8)
P·ΣD²xy + Q·ΣD²y² − C·ΣD²y = ΣDyE_t C4-(9)
P·ΣD²x + Q·ΣD²y − C·ΣD² = ΣDE_t C4-(10)
where D = A′E_x + B′E_y − rE_r.
Thus, given an initial guess, one can alternately solve for A, B, and C, assuming P and Q are fixed, and then, using the new estimates of A, B, solve for P, Q and C, assuming that A and B are fixed. A few iterations of this pair of steps typically yields a close enough approximation to the exact solution.
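The alternation can be sketched as follows, again with the assumed arrays of the earlier sketches; the initial guess (a frontoparallel surface, P′ = Q′ = 0) and the fixed iteration count are illustrative choices.

```python
import numpy as np

def ttc_case4_solution1(Ex, Ey, Et, x, y, iterations=5):
    G = x * Ex + y * Ey                      # rE_r
    P = Q = C = 0.0                          # illustrative frontoparallel start
    for _ in range(iterations):
        # A, B, C step: hold P' = P/C and Q' = Q/C fixed (Equations C4-(5)-(7)).
        Pp, Qp = (P / C, Q / C) if C != 0.0 else (0.0, 0.0)
        F = 1.0 - Pp * x - Qp * y
        b1, b2, b3 = Ex * F, Ey * F, -G * F  # residual = A*b1 + B*b2 + C*b3 + Et
        S = np.array([[np.sum(b1*b1), np.sum(b1*b2), np.sum(b1*b3)],
                      [np.sum(b1*b2), np.sum(b2*b2), np.sum(b2*b3)],
                      [np.sum(b1*b3), np.sum(b2*b3), np.sum(b3*b3)]])
        A, B, C = np.linalg.solve(S, -np.array([np.sum(b1*Et),
                                                np.sum(b2*Et),
                                                np.sum(b3*Et)]))
        # P, Q, C step: hold A' = A/C and B' = B/C fixed (Equations C4-(8)-(10)).
        Ap, Bp = A / C, B / C
        D = Ap * Ex + Bp * Ey - G
        c1, c2, c3 = -D * x, -D * y, D       # residual = P*c1 + Q*c2 + C*c3 + Et
        S = np.array([[np.sum(c1*c1), np.sum(c1*c2), np.sum(c1*c3)],
                      [np.sum(c1*c2), np.sum(c2*c2), np.sum(c2*c3)],
                      [np.sum(c1*c3), np.sum(c2*c3), np.sum(c3*c3)]])
        P, Q, C = np.linalg.solve(S, -np.array([np.sum(c1*Et),
                                                np.sum(c2*Et),
                                                np.sum(c3*Et)]))
    return 1.0 / C                           # time to contact
```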
With reference to
By using Solution 2, an approximation to the TTC can be obtained.
First, the following terms C4-(11) and C4-(12) are derived from Equations C4-(1) and C4-(2).
(E_x E_y −rE_r)·(A B C)^T·(−P′ −Q′ 1)·(x y 1)^T + E_t C4-(11)
(E_x E_y −rE_r)·(A′ B′ 1)^T·(−P −Q C)·(x y 1)^T + E_t C4-(12)
Suffix T denotes the transpose, which turns a row vector into a column vector (and vice versa). These expressions in turn can be rewritten using the product of a vector, a 3×3 matrix M, and another vector:
(Ex Ey −rEr)M(x y 1)T+Et C4-(13)
where M is the dyadic product of the vectors (A B C)^T and (−P′ −Q′ 1)^T, or equivalently of (A′ B′ 1)^T and (−P −Q C)^T, namely:
M = [−AP′ −AQ′ A; −BP′ −BQ′ B; −CP′ −CQ′ C] C4-(14)
A linear least squares method may now be used to find the nine unknown coefficients of the 3×3 matrix M, if these parameters are treated as independent variables for the moment. The term above can be written in the form (e^T·m + E_t), and so the sum to be minimized is
Σ(e^T·m + E_t)² C4-(15)
where m is a vector obtained by “flattening” the matrix M, that is, by lining up the nine coefficients of the matrix, and e is a vector obtained by flattening the dyadic product of the vectors (E_x E_y −rE_r)^T and (x y 1)^T.
m = Flatten((A B C)^T·(−P′ −Q′ 1)) C4-(16)
m = Flatten((A′ B′ 1)^T·(−P −Q C)) C4-(17)
e = Flatten((E_x E_y −rE_r)^T·(x y 1)) C4-(18)
From Equations C4-(16), C4-(17), and C4-(18), the following Equations C4-(19), C4-(20), and C4-(21) are obtained.
m = (−AP′ −AQ′ A −BP′ −BQ′ B −CP′ −CQ′ C) C4-(19)
m = (−A′P −A′Q A′C −B′P −B′Q B′C −P −Q C) C4-(20)
e = (xE_x yE_x E_x xE_y yE_y E_y −x·rE_r −y·rE_r −rE_r) C4-(21)
2Σ(e^T·m + E_t)·e = 0 C4-(22)
Σ(e·e^T)·m = −ΣE_t·e C4-(23)
This is a set of nine linear equations in the nine coefficients of the matrix, which can be solved using known methods such as Gaussian elimination. The coefficients of the symmetric 9×9 matrix and the vector on the right-hand side of the equation are all sums of products of derivatives of brightness and image coordinates. These terms can be accumulated as before, except that now there are 54 (45 plus 9) accumulators, and 54 terms per picture cell need to be computed for addition into these accumulators.
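A sketch of the Solution 2 computation follows, with the assumed arrays of the earlier sketches. Because of the identity −xE_x − yE_y + rE_r = 0 noted below, the components of e are linearly dependent and the 9×9 system is rank deficient, so the sketch uses a minimum-norm least-squares solve in place of plain Gaussian elimination; the resulting diagonal ambiguity is then resolved as discussed below.

```python
import numpy as np

def case4_solution2_matrix(Ex, Ey, Et, x, y):
    """Solve Equation C4-(23) for the flattened vector m and return M."""
    G = x * Ex + y * Ey                            # rE_r
    # e = Flatten((Ex Ey -rEr)^T (x y 1)), Equation C4-(21):
    # one row per component, one column per pixel.
    e = np.stack([x*Ex, y*Ex, Ex,
                  x*Ey, y*Ey, Ey,
                  -x*G, -y*G, -G]).reshape(9, -1)
    S = e @ e.T                                    # Σ e e^T (9x9, rank deficient)
    rhs = -(e @ Et.ravel())                        # -Σ Et e
    m, *_ = np.linalg.lstsq(S, rhs, rcond=None)    # minimum-norm solution
    return m.reshape(3, 3)                         # the matrix M
```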
Finally, the five unknown parameters A, B, C, P′, and Q′ or equivalently A′, B′, C, P and Q can be recovered using singular value decomposition of the matrix M.
Alternatively, in this special case, they can be found by noting that each column of the matrix should be proportional to (A, B, C), and each row of the matrix should be proportional to (−P, −Q, C). In fact, the four unknown parameters A, B, P, and Q can be read off from the last row and the last column of the matrix M, reconstituted from the flattened version, the vector m.
However, the 3×3 matrix resulting from this computation may have an arbitrary amount lambda (λ) added to each of the three diagonal terms, since −xE_x − yE_y + rE_r = 0.
The correct diagonal can be determined using singular value decomposition or, in this particular case, more easily from relationships with the off-diagonal terms:
M_11 = M_12·M_31/M_32 = M_21·M_13/M_23 C4-(24)
M_22 = M_12·M_23/M_13 = M_21·M_32/M_31 C4-(25)
M_33 = M_32·M_13/M_12 = M_23·M_31/M_21 C4-(26)
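Continuing the sketch, the diagonal correction C4-(24) to C4-(26) and the read-off of the parameters from the last row and column of M might look as follows (zero denominators, which arise when some parameters vanish, are ignored here for brevity).

```python
def recover_case4_parameters(M):
    """Fix the diagonal of M per Equations C4-(24)-(26), then read off the
    parameters: the last column is (A, B, C), the last row is (-P, -Q, C)."""
    M = M.copy()
    M[0, 0] = M[0, 1] * M[2, 0] / M[2, 1]   # M11 = M12*M31/M32
    M[1, 1] = M[0, 1] * M[1, 2] / M[0, 2]   # M22 = M12*M23/M13
    M[2, 2] = M[2, 1] * M[0, 2] / M[0, 1]   # M33 = M32*M13/M12
    A, B, C = M[:, 2]
    P, Q = -M[2, 0], -M[2, 1]
    return A, B, C, P, Q, 1.0 / C           # including TTC = 1/C
```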
Solution 2 does not solve exactly the same least squares problem as Solution 1 described above, but it does produce a result in convenient closed form. In the absence of measurement noise, the results will be the same. With noise in the measurement of image brightness, and hence in E_x, E_y, and E_t, the result of Solution 2 will be an approximation to the original solution, because the nine elements of the matrix M are not independent but depend on the five unknown parameters. This approximate solution may be accurate enough to be used directly, or it can be used as an initial guess for the non-linear numerical solution of the original least-squares problem using iterative methods (i.e., for Solution 1).
As before, the time to contact is the inverse of the parameter C. This calculation can be performed without knowing the principal distance f.
If desired, the direction of translational motion, given by
U/W=(A/C)/f C4-(27)
V/W=(B/C)/f C4-(28)
can also be calculated, as can the orientation of the surface specified by
p=f(P/C) and q=f(Q/C).
With reference to
The specific embodiments described above are exemplary only and it will be apparent to the person of ordinary skill in the art that numerous alternatives exist as to the exact details of how the time to contact is computed using derivatives of brightness. Furthermore, other combinations of translational and rotational motion and other surface shapes can be treated using this method.
If there are multiple moving objects, the images can be segmented into regions of interest corresponding to images of individual objects.
The time to contact is often considered to be the time until the center of projection of the sensor intersects the surface of the object being viewed. If the sensor is mounted on an extended object of known shape, then the time when that shape first intersects with the object being viewed can be determined.
In the computation of the time to contact it is typically assumed that the motion parameters do not change. If the sensor or the object on which it is mounted has known dynamics and known external disturbances to its motion, then a dynamic model may be created and updated using, for example, a Kalman filter.
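As one hedged illustration of such a dynamic model, successive least-squares TTC estimates could be smoothed with a scalar Kalman filter whose prediction step encodes the constant-motion assumption (the TTC then simply decreases by one frame interval per frame); the noise variances are illustrative tuning parameters.

```python
class TTCKalmanFilter:
    """Minimal scalar Kalman filter for a time-to-contact estimate."""

    def __init__(self, ttc0, p0=1.0, q=0.01, r=1.0):
        self.ttc, self.p = ttc0, p0     # state estimate and its variance
        self.q, self.r = q, r           # process and measurement noise variances

    def step(self, ttc_measured, dt=1.0):
        self.ttc -= dt                  # predict: TTC falls by dt per frame
        self.p += self.q
        k = self.p / (self.p + self.r)  # Kalman gain
        self.ttc += k * (ttc_measured - self.ttc)   # update with new measurement
        self.p *= 1.0 - k
        return self.ttc
```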
The image sensor need not be one for visible light, but could, for example, be a sensor of infrared (IR), ultraviolet (UV) or other electromagnetic radiation.
The imaging system need not be one that takes discrete snap-shots or “frames”. It could, for example, have detectors that operate continuously in time, such as photo diodes. Time derivatives can be determined from such continuous signals using electrical circuits known in the art.
The imaging system need not produce an actual image output, but instead may produce only the derivatives of brightness in the image needed in the computation of the time to contact, since the image brightness itself does not enter directly into the computation.
The image sensor need not have picture cells on a rectangular grid, but could, for example, be based on a hexagonal or a triangular grid, or even be continuous in space.
The fact that only brightness derivatives are needed may be of importance for some image sensors that are responsive only to changes in radiation and are not able to measure the absolute irradiance itself.
The calculation of brightness derivatives and sums of products of derivatives and image coordinates need not be performed digitally, but could instead be done in the analog domain, or in a hybrid fashion. Multiplications, for example, could be accomplished using four-quadrant analog multipliers or by using multiplying digital-to-analog (D/A) converters (McQuirk et al. 1998).
The summation of sums of products of derivatives and coordinates could also be accomplished in the analog domain, for example by summing currents flowing through a resistor or summing charges deposited on a capacitor (McQuirk et al. 1998). Analog and hybrid methods for performing the computation may have advantages in terms of component count or fabrication costs.