This application claims priority from Korean Patent Application No. 2002-72695, filed on 21 Nov. 2002, in the Korean Intellectual Property Office, the disclosure of which is incorporated herein in its entirety by reference.
1. Field of the Invention
The present invention relates to visual servoing technology, and more particularly, to a hand/eye calibration method using a projective invariant shape descriptor of a two-dimensional image.
2. Description of the Related Art
Hand/eye calibration denotes a procedure for determining the spatial transformation between a robot hand and a camera mounted on the robot hand, so that a desired image can be obtained in the robot's coordinate frame using visual servoing technology and the robot can be controlled accordingly. One of the most frequently used hand/eye calibration methods is to provide a priori motion information and to obtain the desired information from image transformations generated based on the provided motion information. In this method, correctly selecting corresponding points between the transformed images is very important for extracting the transformation information of the robot hand easily and correctly.
The present invention provides a hand/eye calibration method which makes it possible to easily and correctly extract transformation information between a robot hand and a camera.
The present invention also provides a computer readable medium having embodied thereon a computer program for the hand/eye calibration method.
According to an aspect of the present invention, there is provided a hand/eye calibration method. The method includes (a) calculating a projective invariant shape descriptor from at least two images consecutively obtained through a camera mounted on a robot hand; (b) extracting corresponding points between the images by using the projective invariant shape descriptor; (c) calculating a rotation matrix for the corresponding points from translation of the robot; (d) calculating a translation vector for the corresponding points from translation and rotation of the robot; and (e) finding a relation between the robot hand and the camera based on the rotation matrix calculated in step (c) and the translation vector calculated in step (d).
According to another aspect of the present invention, there is provided a method of extracting corresponding points between images. The method includes (a) defining errors for a projective invariant shape descriptor for a two-dimensional image from at least two images obtained at a predetermined interval and calculating noisy invariance; (b) calculating a threshold to be used to set corresponding points according to the noisy invariance; (c) extracting boundary data from the images and presenting the extracted boundary data by subsampling N data; (d) minimizing the projective invariant shape descriptor; (e) transforming a following image into a previous image according to the minimized projective invariant shape descriptor; (f) resetting the distance between boundary data in consideration of the ratio of the distance between boundary data before the transformation to the distance between boundary data after the transformation; and (g) finding similarities between the boundary data and extracting corresponding points between the previous image and the following image.
The above and other features and advantages of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings in which:
The present invention will now be described more fully with reference to the accompanying drawings, in which preferred embodiments of the invention are shown. In the drawings, like reference numerals refer to like elements throughout.
where (u, v, 1) denotes the coordinates of a point q defined on the image plane, (X, Y, Z, 1) denotes the coordinates of a point P in an object coordinate system, and tij denotes the ij element of a transformation matrix between an object plane and the image plane.
Here, if an object is projected onto a two-dimensional plane, i.e., Z=0, Equation 1 is transformed as follows.
As shown in Equations 1 and 2, the process for obtaining an image is performed in a nonlinear environment. However, a linearized projective transformation, rather than a nonlinear projective transformation as in Equation 2, is commonly applied to a two-dimensional image obtained through the CCD camera 110.
A Fourier descriptor is a linearized shape descriptor which satisfies Equations 1, 2, and 3. The Fourier descriptor represents an image of the object with Fourier coefficients which are obtained by a two-dimensional Fourier transformation of the image contour of a two-dimensional object. However, this method can be applied only to a case where linearity of the camera is guaranteed, that is, where the distance between the CCD camera 110 and the object is sufficiently long. Therefore, to overcome this restriction, the image obtained from the CCD camera 110 is analyzed by using a projective invariant shape descriptor I in the present invention. As a result, even in a case where the linearity of the camera is not guaranteed, that is, where the distance between the CCD camera 110 and the object is not long, the image can be analyzed correctly without being affected by noise, slant angles, or the nonlinearity of the CCD camera 110 occurring when images are obtained.
In the hand/eye calibration method according to the present invention, the pinhole camera model is used, in which distortion of the lens 114 and misalignment of the optical axis can be ignored. The relation between the robot hand 100 and the CCD camera 110 can be expressed as follows.
Xh=RXc+t (4)
where Xh denotes the world coordinate system, i.e., the coordinate system of the robot hand 100, Xc denotes a coordinate system of the CCD camera 110, R denotes a rotation matrix, and t denotes a translation vector.
The relation between the CCD camera 110 and the image can be expressed as follows.
where u and u0 denote X coordinates on an image plane, and v and v0 denote Y coordinates on the image plane. In addition, f denotes a focal length between the lens 114 and the CCD array 112, and Sx and Sy denote scale factors of the CCD camera 110. The focal length f and the scale factors Sx and Sy are characteristic values of the CCD camera 110, and they are fixed according to the specification of the CCD camera 110.
If robot motion information already known to the user, Xp1=Rp1Xp2+tp1, is substituted into Equation 4, the following Equation is obtained.
RXc1+t=Rp1(RXc2+t)+tp1 (6)
The motion of the CCD camera 110, Xc1, can be expressed as follows by using Equation 6.
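As an illustrative derivation only (it is worked out from Equation 6 rather than reproduced from the original drawing of the equation), Equation 6 can be solved for Xc1 by moving t to the right-hand side and multiplying by R−1. The label R_{c1} for the rotation part is a hypothetical name introduced here; the translation part agrees with the expression for tc1 that accompanies Equation 14.

```latex
\begin{aligned}
R X_{c1} + t &= R_{p1}\,(R X_{c2} + t) + t_{p1} \\
R X_{c1} &= R_{p1} R X_{c2} + R_{p1} t + t_{p1} - t \\
X_{c1} &= \underbrace{R^{-1} R_{p1} R}_{R_{c1}}\, X_{c2}
          + \underbrace{R^{-1}\,(R_{p1} t + t_{p1} - t)}_{t_{c1}}
\end{aligned}
```

Setting R_{p1} to the identity in the last line leaves t_{c1} = R^{-1} t_{p1}, which is the pure-translation case.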
If the rotation of the robot hand is excluded from Equation 7 and only its translation is considered, the rotation matrix R satisfies the following relation.
tc1=R−1tp1 (8)
Applying Equation 8 to three motion vectors of the robot, tp1, tp2, and tp3, gives the following.
(tc′1, tc′2, tc′3)=R−1(tp1, tp2, tp3) (9)
Here, image vectors corresponding to three motion vectors of the robot hand, tp1, tp2, and tp3 are OF1, OF2, and OF3, and each image vector is defined by the following Equation 10,
where
and
Intrinsic parameters can be calculated as follows.
Equation 11 can be expressed as follows,
u0(u2−u3)+s1(v2−v3)−s2v1(v2−v3)=u1(u2−u3)
u0(u1−u3)+s1(v1−v3)−s2v2(v1−v3)=u2(u1−u3) (12)
where
and
Equation 12 can be expressed in a matrix form as follows.
In consideration of rotation and translation of the robot hand 100, the translation vector t between the robot hand 100 and the CCD camera 110 can be expressed as follows,
tc1=R−1(Rp1t+tp1−t)
t=(Rp1−I)−1(Rtc1−tp1) (14)
where (Rp1, tp1) denotes motion information already known by the user, R denotes the rotation matrix calculated from three translational motions of the robot hand 100, tc1 denotes an image vector, and I in Equation 14 denotes the identity matrix, which is not to be confused with the projective invariant shape descriptor I calculated from a two-dimensional image.
In order to improve the precision of the hand/eye calibration, it is very important to correctly set points corresponding to coordinates which are predetermined within the field of view of the CCD camera 110. Therefore, in the present invention, corresponding points are obtained by the CCD camera 110 by using the projective invariant shape descriptor, which does not vary under nonlinear transformation. Then, the corresponding points are used as calibration targets to conduct hand/eye calibration. The projective invariant shape descriptor I, which is used as a fundamental factor of the hand/eye calibration, can be defined as follows.
where P denotes points of the object, q denotes corresponding points of the image as shown in
The projective invariant shape descriptor I expressed in Equations 15 and 16 represents information which does not vary under a nonlinear transformation such as that shown in Equation 2, and therefore does not vary even when images obtained by the CCD camera 110 are transformed.
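The invariance property can be illustrated with a minimal sketch. It assumes that the descriptor takes the standard five-point form for coplanar points, a ratio of products of 3x3 determinants of homogeneous image points; this form and the names `det3`, `invariant`, and `apply_homography` are assumptions made for illustration and are not taken from Equations 15 and 16 as filed.

```python
# Illustrative sketch only. ASSUMPTION: the descriptor is the standard
# five-point projective invariant of coplanar points,
#   I = det(q5,q1,q4) * det(q5,q2,q3) / (det(q5,q1,q3) * det(q5,q2,q4)).
# All function names are hypothetical.

def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))


def invariant(p1, p2, p3, p4, p5):
    """Five-point invariant of 2-D points given as (x, y) tuples."""
    q1, q2, q3, q4, q5 = [(x, y, 1.0) for x, y in (p1, p2, p3, p4, p5)]
    num = det3(q5, q1, q4) * det3(q5, q2, q3)
    den = det3(q5, q1, q3) * det3(q5, q2, q4)
    return num / den


def apply_homography(H, p):
    """Map a 2-D point through a 3x3 projective transformation H."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


# Five points with no three collinear, and an arbitrary projective map.
points = [(0.0, 0.0), (2.0, 0.5), (3.0, 2.0), (1.0, 3.0), (-1.0, 1.5)]
H = [[1.1, 0.2, 3.0],
     [-0.1, 0.9, 1.0],
     [0.001, 0.002, 1.0]]

warped = [apply_homography(H, p) for p in points]
i_before = invariant(*points)
i_after = invariant(*warped)
print(i_before, i_after)  # the two values agree up to rounding error
```

Each point appears the same number of times in the numerator and the denominator, so both the determinant of H and the per-point projective scale factors cancel in the ratio.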
where X1(k)=(X(k), Y(k),1),
and 1≦k≦N, and X(k) and Y(k) denote the X-axis and Y-axis coordinate functions of the contour.
An example of projective invariant shape descriptor calculated from a 2-dimensional shape of
Corresponding points of the images obtained by the CCD camera 110 by using the projective invariant shape descriptor I can be extracted as follows.
In order to extract corresponding points of the images, errors in the projective invariant shape descriptor I have to be defined. In the present invention, the errors are defined using a Gaussian noise model. In order to use the Gaussian noise model, Equation 17 can be expressed as follows.
where Xi=(xi, yi, 1)T or I=I(x1, y1, x2, y2, x3, y3, x4, y4, x5, y5).
Here, if $(x_i, y_i)$ is the true data and $(\tilde{x}_i, \tilde{y}_i)$ is a noisy observation, the noisy observation can be expressed as follows,
$\tilde{x}_i = x_i + \xi_i,\quad \tilde{y}_i = y_i + \eta_i$ (19)
where $\xi_i$ and $\eta_i$ are distributed noise terms with mean 0 and variance $\sigma_i^2$. The noise terms can be expressed as follows.
The noisy invariant can be expressed as follows after noisy measurements on the image are observed.
$\tilde{I}(\tilde{x}_1, \tilde{y}_1, \tilde{x}_2, \tilde{y}_2, \tilde{x}_3, \tilde{y}_3, \tilde{x}_4, \tilde{y}_4, \tilde{x}_5, \tilde{y}_5)$ (21)
In order to calculate an expected value and a variance of the noisy invariant $\tilde{I}$, the noisy invariant $\tilde{I}$ can be expanded about $(x_1, y_1, x_2, y_2, x_3, y_3, x_4, y_4, x_5, y_5)$ by using a Taylor series.
Here, the variance can be expressed as follows.
A threshold of the noisy invariant can be defined as follows.
$\Delta I = 3 \times \sqrt{E[(\tilde{I} - I)^2]}$ (24)
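The threshold of Equation 24 can be approximated numerically, as in the following minimal sketch. Instead of the Taylor-series expansion described above, it estimates E[(Ĩ−I)2] by Monte Carlo sampling under the Gaussian noise model of Equation 19. The five-point determinant form of the invariant, the use of a single standard deviation `sigma` for all coordinates, and all function names are assumptions made for illustration.

```python
import math
import random

# Illustrative sketch only. ASSUMPTIONS: the invariant has the standard
# five-point determinant form, every coordinate has the same noise level
# sigma, and the expectation in Equation 24 is estimated by sampling
# rather than by the Taylor-series expansion. Names are hypothetical.


def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))


def invariant(pts):
    """Five-point invariant of a list of five (x, y) tuples."""
    q1, q2, q3, q4, q5 = [(x, y, 1.0) for x, y in pts]
    num = det3(q5, q1, q4) * det3(q5, q2, q3)
    den = det3(q5, q1, q3) * det3(q5, q2, q4)
    return num / den


def noisy_threshold(pts, sigma, trials=2000, seed=0):
    """Estimate Delta I = 3 * sqrt(E[(I_noisy - I)^2]) by sampling."""
    rng = random.Random(seed)  # fixed seed keeps the estimate repeatable
    i_true = invariant(pts)
    sq_err = 0.0
    for _ in range(trials):
        # Equation 19: add zero-mean Gaussian noise to every coordinate.
        noisy = [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
                 for x, y in pts]
        sq_err += (invariant(noisy) - i_true) ** 2
    return 3.0 * math.sqrt(sq_err / trials)


points = [(0.0, 0.0), (2.0, 0.5), (3.0, 2.0), (1.0, 3.0), (-1.0, 1.5)]
threshold = noisy_threshold(points, sigma=0.01)
print(threshold)
```

The factor of 3 corresponds to a three-standard-deviation band around the noise-free value of the invariant.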
The corresponding points are found from the images obtained by the CCD camera 110 of the robot hand 100 by repeating calculation of the projective invariant shape descriptor, and boundary data between a previous image and a following image consecutively obtained through the CCD camera 110 can be expressed as follows.
OkIn={XIkIn, YIkIn}, k=1˜nIn
OkMo={XIkMo, YIkMo}, k=1˜nMo (25)
where nIn and nMo denote the numbers of boundary points of a scene and of a model, respectively. The boundary data are presented by subsampling N data, and this subsampling can be expressed as follows.
qiIn={Xτ1(i)In, Yτ1(i)In}, qiMo={Xτ2(i)Mo, Yτ2(i)Mo}, i=1˜N (26)
where
and N denotes the number of points on a normalized contour.
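The subsampling of Equation 26 can be sketched as follows. The index functions τ1(i) and τ2(i) are not reproduced in the text above, so this sketch assumes uniform index spacing along the boundary; that choice and the name `subsample_contour` are hypothetical.

```python
# Illustrative sketch only. ASSUMPTION: the index function tau(i) selects
# boundary points at uniform index spacing. The function name is
# hypothetical.

def subsample_contour(boundary, n_samples):
    """Pick n_samples points from an ordered list of (x, y) boundary points."""
    n = len(boundary)
    if n == 0 or n_samples <= 0:
        raise ValueError("boundary must be non-empty and n_samples positive")
    # floor(i * n / n_samples) is always a valid index for 0 <= i < n_samples.
    return [boundary[(i * n) // n_samples] for i in range(n_samples)]


# A 10-point boundary reduced to N = 5 samples (every second point).
boundary = [(float(k), float(k * k)) for k in range(10)]
samples = subsample_contour(boundary, 5)
print(samples)
```

Sampling both the previous image and the following image with the same N gives two point lists of equal length, which is what allows them to be compared index by index.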
Then, a projective invariant shape descriptor is calculated by using qiIn and qiMo defined in Equation 26 when a value of the following Equation is minimized,
where A, b, c, and d denote parameters defining the transformation between the previous image and the following image and can be expressed as follows.
The weight wi of Equation 27 is calculated by the variance defined in Equation 23 as follows.
The projective invariant shape descriptors can be minimized by Equations 27 through 30 as follows.
After the projective invariant shape descriptors are minimized by using Equation 31, the following image obtained by the CCD camera 110 is transformed into the previous image. This transformation can be expressed as follows.
qi′Mo′=(c·qiMo+d)−1(A qiMo+b), i=1˜N (32)
where A, b, c, and d denote the parameters defining the transformation between the previous image and the following image.
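The transformation of Equation 32 can be sketched for a single point as follows. The sketch assumes that A is a 2x2 matrix, b and c are 2-vectors, and d is a scalar, which is the usual reading of a planar projective transformation; these shapes and the name `transform_point` are assumptions made for illustration.

```python
# Illustrative sketch only. ASSUMPTION: A is 2x2, b and c are 2-vectors,
# and d is a scalar, so that Equation 32 reads
#   q' = (A q + b) / (c . q + d).
# The function name is hypothetical.

def transform_point(A, b, c, d, q):
    """Apply the projective transformation of Equation 32 to a point q."""
    x, y = q
    w = c[0] * x + c[1] * y + d  # scalar projective denominator
    if w == 0:
        raise ZeroDivisionError("point is mapped to infinity")
    return ((A[0][0] * x + A[0][1] * y + b[0]) / w,
            (A[1][0] * x + A[1][1] * y + b[1]) / w)


# With c = 0 and d = 1 the map reduces to an affine transformation.
identity = transform_point([[1, 0], [0, 1]], (0, 0), (0, 0), 1, (4.0, -2.0))
scaled = transform_point([[2, 0], [0, 2]], (1, 1), (0, 0), 2, (1.0, 3.0))
print(identity, scaled)
```

Applying the function to each of the N subsampled points of the following image yields the transformed boundary.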
After the transformation between the previous image and the following image is completed, the ratio of the distance between boundary data before the transformation to the distance between boundary data after the transformation, i.e., τ2(i)′, is calculated by using the following Equation, and then the length between data is reset by using the ratio.
By using the ratio τ2(i)′ calculated by Equation 33, the following image can be resampled as follows.
qiIn′=Oτ2(i)In, i=1˜N (34)
After that, in order to bring the errors between the previous image and the following image within a predetermined range, Equations 29 through 34 are repeated. The errors are expressed by errors of the corresponding points and by similarities between the projective invariant shape descriptors (Im, Ii) of the boundary data. A similarity value of the projective invariant shape descriptors (Im, Ii) of the boundary data can be expressed as follows.
If the maximum value of the similarity values is greater than a predetermined threshold, it is determined that the corresponding points of the previous image and the following image are the same as each other. The value of ΔI in Equation 35 and the predetermined threshold are selected according to an environment to which the present invention is applied and a required precision.
Then, a rotation matrix R for a coordinate, i.e., the extracted corresponding points, is calculated from a translation of the robot hand 100 (step 230). Here, the rotation matrix R is calculated by Equation 8. A translation vector t for the coordinate, i.e., the extracted corresponding points, is calculated from translation and rotation of the robot hand 100 (step 240). Here, the translation vector t is calculated by Equation 14. After completion of steps 230 and 240, the hand/eye calibration, which defines the relation between the robot hand 100 and the CCD camera 110, that is, obtains a calculation result of Xh=RXc+t, is completed (step 250). Here, Xh denotes a coordinate system of the robot hand 100, and Xc denotes a coordinate system of the CCD camera 110 mounted on the robot hand 100.
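Steps 230 and 240 can be sketched numerically as follows, assuming NumPy is available. The sketch solves Equation 9 for R from three pure translations and then solves Equation 14 for t. Because Rp1−I is singular for any single three-dimensional rotation (a rotation leaves its own axis unchanged), the sketch stacks two rotations with different axes and solves the stacked system by least squares; that stacking, the projection of R onto the nearest rotation, the synthetic test values, and all function names are assumptions made for illustration rather than the procedure of the present invention.

```python
import numpy as np

# Illustrative sketch only; names and synthetic values are hypothetical.


def rot(axis, angle):
    """Rotation matrix about `axis` by `angle` radians (Rodrigues formula)."""
    a = np.asarray(axis, dtype=float) / np.linalg.norm(axis)
    K = np.array([[0.0, -a[2], a[1]],
                  [a[2], 0.0, -a[0]],
                  [-a[1], a[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def rotation_from_translations(Tp, Tc):
    """Equation 9: Tc = R^-1 Tp, hence R = Tp Tc^-1 (columns are motions)."""
    return Tp @ np.linalg.inv(Tc)


def nearest_rotation(M):
    """Project M onto the closest rotation matrix (guards against noise)."""
    U, _, Vt = np.linalg.svd(M)
    R = U @ Vt
    if np.linalg.det(R) < 0:  # avoid returning a reflection
        U[:, -1] *= -1
        R = U @ Vt
    return R


def translation_from_motions(R, motions):
    """Equation 14 rearranged as (Rp - I) t = R tc - tp, stacked over motions."""
    A = np.vstack([Rp - np.eye(3) for Rp, _, _ in motions])
    rhs = np.concatenate([R @ tc - tp for _, tp, tc in motions])
    t, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return t


# Synthetic ground truth for the hand/camera relation Xh = R Xc + t.
R_true = rot([1, 2, 3], 0.4)
t_true = np.array([0.05, -0.02, 0.10])

# Step 230: three pure translations of the hand and the camera motions they cause.
Tp = 0.1 * np.eye(3)
Tc = R_true.T @ Tp
R_est = nearest_rotation(rotation_from_translations(Tp, Tc))

# Step 240: two hand motions that include rotation about different axes.
motions = []
for axis, angle in (([1, 0, 0], 0.3), ([0, 1, 0], 0.5)):
    Rp, tp = rot(axis, angle), np.array([0.01, 0.02, 0.03])
    tc = R_true.T @ (Rp @ t_true + tp - t_true)
    motions.append((Rp, tp, tc))
t_est = translation_from_motions(R_est, motions)
print(R_est, t_est)
```

In practice the camera motion vectors would come from the corresponding points extracted in the images rather than from known ground truth, and more than the minimum number of motions can be stacked in the same way to reduce the effect of noise.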
In order to accurately perform the hand/eye calibration, errors of the projective invariant shape descriptor I of images are defined (step 2200), and the noisy invariant is calculated by analyzing the amount of noise in the images (step 2210). Then, a threshold is calculated according to the noisy invariant calculated in step 2210 (step 2220).
Boundary data of a previous image and a following image obtained by the CCD camera 110 are extracted (step 2230). The extracted boundary data are presented by subsampling N data (step 2240). Then, a projective invariant shape descriptor is minimized in accordance with Equation 31 (step 2250), and the following image is transformed into the previous image in accordance with the minimized projective invariant shape descriptor (step 2260). After that, the distance between boundary data is reset by using the ratio of the distance between boundary data before the transformation to the distance between boundary data after the transformation (step 2270).
After the distance between the boundary data is reset in step 2270, similarities between the boundary data of the previous image and the following image are found (step 2280), and corresponding points between the previous image and the following image are extracted by using the found similarities (step 2290).
The hand/eye calibration method according to the present invention extracts corresponding points between the previous image and the following image by using the projective invariant shape descriptor I. Thus, it is possible to accurately perform the hand/eye calibration without being affected by noise or nonlinearity of a camera.
In the present invention, the hand/eye calibration method is applied to the robot hand. However, the hand/eye calibration method can be applied to various kinds of visual servoing devices which control motions of an object by using images obtained through at least one camera.
The present invention may be embodied as computer readable code in a computer readable medium. The computer readable medium includes all kinds of recording devices in which computer readable data are stored. The computer readable medium includes, but is not limited to, ROMs, RAMs, CD-ROMs, magnetic tapes, floppy disks, optical data storage devices, and carrier waves such as transmissions over the Internet. In addition, the computer readable medium may be distributed among computer systems which are connected via a network, and may be stored and embodied there as the computer readable code.
While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the present invention as defined by the following claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2002-0072695 | Nov 2002 | KR | national |
Number | Name | Date | Kind |
---|---|---|---|
5227985 | DeMenthon | Jul 1993 | A |
5550376 | Gupta et al. | Aug 1996 | A |
Number | Date | Country | |
---|---|---|---|
20040102911 A1 | May 2004 | US |