1. Field of the Invention
The present invention relates to an apparatus that estimates a distance or the like to a subject by using a range image sensor and two or more image sensors.
2. Description of the Related Art
There is proposed a method of obtaining a three-dimensional image by combining a distance image obtained through a range image sensor and a distance image obtained through a stereo camera (see Japanese Patent Application Laid-Open No. H09-005050).
There is proposed a method of generating a space map for a legged mobile robot by approximating each of a plurality of scattered local regions of a floor face by one type of patch among a plurality of types of known shaped patches (generating a patch map), and collecting the patch maps (see "Curved Surface Contact Patches with Quantified Uncertainty", Marsette Vona and Dimitrios Kanoulas, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, Pages: 1439-1446).
However, since the contact state between a foot and the ground immediately after the foot, having been lifted from the ground, comes back into contact with the ground is not estimated, there is a possibility that it is difficult to stably control the posture of the robot in some grounding states after the foot actually comes into contact with the ground.
Thus, it is an object of the present invention to provide an apparatus capable of precisely estimating a future contact state between different objects such as a foot of a legged mobile robot and a floor face.
The present invention relates to a contact state estimating apparatus and a method for estimating a contact state between a surface of an actual object and a virtual face which is a surface of a virtual object having a designated shape by using a pair of image sensors configured to acquire each of a pair of images composed of a plurality of pixels having a designated physical quantity as a pixel value by imaging the actual object, and a range image sensor configured to measure a distance up to the actual object and to allocate the measured distance with respect to each of a plurality of pixels composing a target region of one of the images obtained by one of the image sensors among the pair of image sensors.
The contact state estimating apparatus of the present invention comprises: a first processing element configured to calculate a sum of a value of a first cost function, having a first deviation allocated to each pixel in the target region as a main variable, as a first cost; a second processing element configured to calculate a sum of a value of a second cost function, having a second deviation allocated to each pixel in the target region or both of the first deviation and the second deviation as a main variable, as a second cost; and a third processing element configured to search for a position and a posture of the virtual face in a case of contacting the surface of the actual object so as to approximate a total cost of the first cost and the second cost to a smallest value or a minimum value.
“The first deviation” is defined as a variable such that a magnitude of an absolute value is determined according to a length of an interval between an actual point whose position is determined by a distance to the actual object obtained by the range image sensor and a virtual point which is a result of projecting the actual point to the virtual face in a line of sight direction of the one of the image sensors, and being a positive value in a case where the virtual point is positioned farther than the actual point with reference to the one of the image sensors, while being a negative value in a case where the virtual point is positioned nearer than the actual point with reference to the one of the image sensors. “The second deviation” is defined as a variable such that a magnitude is determined according to a magnitude of a difference between the designated physical quantity possessed by the pixels of the one of the images and the designated physical quantity possessed by the pixels of the other image obtained by the other image sensor corresponding to the pixels of the one of the images in a form according to the position and the posture of the virtual face.
Each of “the first cost function” and “the second cost function” is defined as a function which shows a smallest value or a minimum value in a case where a value of the main variable is 0 and which is an increasing function in a positive definition domain.
According to the contact state estimating apparatus of the present invention, in a case where a state in which at least a part of the virtual object enters into the actual object is set, the first deviation becomes a positive value. Therefore, the deeper inside and the farther away from the surface of the actual object the virtual point is located according to this setting, the higher the first cost, or both of the first and the second costs, is evaluated, and the higher the total cost is also evaluated.
Accordingly, by searching for the position and the posture of the virtual face so as to approximate the total cost to the smallest value or the minimum value, the position and the posture of the virtual face determined by a virtual point group, in which none of the virtual points is inside the actual object and in which the virtual points are positioned on the surface of the actual object, can be estimated as the contact state of the virtual face with respect to the actual object.
Thereby, a situation in which an infeasible state, where a common real space is occupied at the same time by both one actual object which is the imaging target and another actual object corresponding to the virtual object, is found as the contact state of the two actual objects can be reliably avoided. The other actual object corresponding to the virtual object means an actual object having a surface of the same shape and size as the virtual face, or an actual object capable of adaptively forming such a surface. Further, a future contact state of the virtual object with respect to the actual object can be estimated precisely.
According to the contact state estimating apparatus of the present invention, it is preferable that the first cost function is defined by a product of a positive exponentiation of the absolute value of the first deviation and a first coefficient function, the first coefficient function being a function which has the first deviation as the main variable, whose value range is 0 or more, and whose value in the positive definition domain is larger than its value in the negative definition domain even in a case where the absolute values of the main variable are the same; and that the second cost function is defined by a product of a positive exponentiation of the absolute value of the second deviation and a second coefficient function, the second coefficient function being a function which has the first deviation or the second deviation as the main variable, whose value range is 0 or more, and whose value in the positive definition domain is larger than its value in the negative definition domain even in a case where the absolute values of the main variable are the same.
According to the thus configured contact state estimating apparatus, the first and the second coefficient functions are defined so as to calculate one or both of the first and the second costs according to one or both of the first and the second cost functions, which are asymmetric in the positive and negative definition domains. The first cost corresponds to the total elastic energy, with the first deviation as the deformation amount, of a virtual spring group having the value of the first coefficient function at each pixel included in the target region of one of the images as the spring coefficient. Similarly, the second cost corresponds to the total elastic energy, with the second deviation as the deformation amount, of a virtual spring group having the value of the second coefficient function at each pixel included in the target region of one of the images as the spring coefficient.
Therefore, the position and posture of the virtual face can be searched in a form which gives priority to bringing a virtual point located inside the actual object closer to the surface of the actual object over bringing a virtual point located outside the actual object closer to that surface. Thereby, a situation in which an infeasible state, where a common real space is occupied at the same time by both the actual object which is the imaging target and another actual object corresponding to the virtual object, is found as the contact state of the two actual objects can be reliably avoided. Further, a future contact state of the virtual object with respect to the actual object can be estimated precisely.
According to the contact state estimating apparatus of the present invention, it is preferable that one or both of the first and the second coefficient functions is defined as an increasing function at least in the positive definition domain.
According to the thus configured contact state estimating apparatus, the value of the spring coefficient of the virtual spring becomes a larger value as the deformation amount of the spring is larger. Therefore, as the virtual point is at a deeper location inside and away from the surface of the actual object according to the setting of the position and posture of the virtual face, the first cost or both of the first and the second cost corresponding to the elastic energy of the virtual spring is evaluated even higher, and the total cost is also evaluated higher.
As a result, the position and posture of the virtual face can be searched so as to approximate a virtual point located inside the actual object to the surface of the actual object faster or more strongly. Further, a future contact state of the virtual object with respect to the actual object can be estimated precisely and at a high speed.
According to the contact state estimating apparatus of the present invention, it is preferable that one or both of the first and the second coefficient functions is defined as a function which is 0 in the negative definition domain or in a definition domain less than a negative designated value.
According to the thus configured contact state estimating apparatus, the cost of the virtual point outside the actual object or the virtual point which is away from the surface of the actual object to some extent is uniformly estimated as “0”. Thereby, the calculation load required for searching processing of a combination of coordinate values of virtual points that brings the total cost closer to the smallest value or the minimum value can be reduced.
According to the contact state estimating apparatus of the present invention, the second deviation as the main variable of the second coefficient function may be defined as a parallax residual of the pair of image sensors, obtained by converting the difference of the designated physical quantity imaged by each of the pair of image sensors according to a restraint condition that the designated physical quantity of a same part in the real space is the same, or may be defined as a distance residual corresponding to the parallax residual.
According to the contact state estimating apparatus of the present invention, the range image sensor may be configured to obtain a distance image composed of a plurality of pixels having a distance to the actual object as a pixel value, and the first processing element is configured to allocate a distance according to the pixel value of each pixel composing the distance image as a pixel value with respect to the plurality of pixels composing the target region of the one of the images, according to a relative arrangement relation of the one of the image sensors and the range image sensor.
According to the contact state estimating apparatus of the present invention, it is preferable that one or both of the first and the second coefficient functions is defined as a function which is a positive designated value in the negative definition domain or in a definition domain less than a negative designated value.
According to the contact state estimating apparatus of the present invention, it is preferable that the first processing element determines a standard point in a region where the target region is projected to the surface of the virtual object, selects an actual point within a predetermined range from the standard point or an actual point corresponding to the standard point, and determines the first deviation.
(Configuration) A contact state estimating apparatus illustrated in
The contact state estimating apparatus may comprise as its element, three or more image sensors which are able to obtain an image having a same designated physical quantity as a pixel value, and the computer 20 may be configured to select two among the three or more image sensors as the pair of image sensors.
As the configuration of the legged mobile robot, the configuration proposed by the present applicant in Japanese Patent No. 3674788, or the like can be adopted. Furthermore, the contact state estimating apparatus may be used to estimate a contact state between a palm of a robot arm or a part of the arm and a target object. Moreover, the contact state estimating apparatus may be installed in a vehicle and be used for estimation of a contact state between a tire of the vehicle and a road face.
The range image sensor 10 is, for example, a Time-of-Flight (TOF) range image sensor, and obtains a primary distance image in which each pixel has a distance measurement value Ds′ (refer to
The standard image sensor 11 is one of the cameras (for example, the right side camera) of a visible light stereo camera, and obtains a standard image in which each pixel at least has a luminance (designated physical quantity) as the pixel value. An optical axis direction of the standard image sensor 11 is defined as a Z axis direction (refer to
The reference image sensor 12 is the other camera (for example, the left side camera) of the visible light stereo camera, and obtains a reference image in which each pixel at least has a luminance as the pixel value similar to the standard image. “A reference image coordinate system” is defined by an imaging elements group arranged on an imaging area or a plane of the reference image sensor 12.
Camera parameters (intrinsic parameters and extrinsic parameters) of each of the range image sensor 10, the standard image sensor 11, and the reference image sensor 12 are known and stored in a memory of the computer 20 composing the contact state estimating apparatus. For example, a rotational matrix, a translational matrix, or a quaternion equivalent thereto, expressing a coordinate transformation between the primary distance image coordinate system and the standard image coordinate system, is stored in the memory. Similarly, a rotational matrix, a translational matrix, or a quaternion equivalent thereto, expressing a coordinate transformation between the standard image coordinate system and the reference image coordinate system, is stored in the memory.
In a case where the contact state estimating apparatus is mounted on a robot, a position and posture of each of the primary distance image coordinate system, the standard image coordinate system, and the reference image coordinate system with respect to a robot coordinate system is calculated according to a forward kinematics model expressing a behavior of the robot, and then stored in the memory.
In the robot coordinate system, a center of mass (for example, included in a body) of the robot is defined as an origin, an upward of the robot is defined as +x direction, a right direction is defined as +y direction, and a forward direction is defined as +z direction. A position and posture of the robot coordinate system in a world coordinate system are, for example, defined by an action plan of the robot.
The programmable computer 20 composing the contact state estimating apparatus includes a first processing element 21 which is configured to execute arithmetic processing described later having image signals from each of the range image sensor 10, the standard image sensor 11, and the reference image sensor 12 as a processing target, a second processing element 22, and a third processing element 23. A single processor (an arithmetic processing unit) may function as these three arithmetic processing elements 21 to 23, or a plurality of processors may function as these three arithmetic processing elements 21 to 23 in a coordinated fashion through mutual communication.
Each arithmetic processing element being "configured" to execute the arithmetic processing in its charge means being "programmed" so that an arithmetic processing unit such as a CPU composing each arithmetic processing element reads software, in addition to necessary information, from a memory such as a ROM, a RAM, or the like, or from a recording medium, and executes the arithmetic processing on the information in accordance with the software.
(Functions)
The following describes a contact state estimating method executed by the thus configured contact state estimating apparatus. In the present embodiment, a future contact state between a rear face (the virtual face) of a foot of the robot and a floor face (the surface of the actual object) is estimated. In addition, in a case where the robot has a hand at the end of an arm, a contact state between the palm face of the hand or a surface of a finger portion and an actual object such as a cup or the like, which is an object of a task to be executed by the robot, may be estimated.
"A secondary distance image" composed of a plurality of pixels having a measurement value Ds(s) obtained by the range image sensor 10 as the pixel value is acquired (FIG. 3/STEP02). "s" means a pixel position (a position of a quantized point), among the pixel positions in the standard image coordinate system, which is included in a region of interest (ROI) serving as the target region in the standard image and is allocated with the distance measurement value Ds; to be precise, it means a coordinate value thereof.
In particular, "the primary distance image (distance image)" is acquired through the range image sensor 10. The plurality of pixels s′ composing the primary distance image (refer to
Next, point position s″ which is a result of coordinate transformation of pixel position s′ of the primary distance image coordinate system to the standard image coordinate system, is acquired (refer to
Based on the rotational matrix R and the translational matrix T expressing the coordinate transformation of the primary distance image coordinate system to the standard image coordinate system, a vector ̂p″=R̂p′+T which expresses a position of the observation point Ps based on the standard image coordinate system is calculated. The rotational matrix R and the translational matrix T are stored in the memory in advance. The rotational matrix R and the translational matrix T may be defined by a quaternion which is mathematically equivalent thereto.
Based on the vector ̂p″ and Zs″ which is a depth direction component thereof (a component in the Z direction perpendicular with respect to the standard image coordinate system, which is an X-Y coordinate system), the coordinate position s″=(1/Zs″)̂p″ which corresponds to the observation point Ps in the standard image coordinate system is obtained.
The coordinate position s″ (refer to the white circle of
Then, the norm Ds(s) of a vector ̂p = Ds(s)̂e(s) expressing the position of the observation point Ps based on the secondary distance image coordinate system is allocated as the pixel value with respect to each pixel position s of the standard image coordinate system. ̂e(s) is a unit vector indicating the line of sight direction toward the observation point Ps which passes the pixel position s of the secondary distance image coordinate system.
However, taking into account the difference in resolution or the like between the range image sensor 10 and the standard image sensor 11, it is not necessary to allocate the distance measurement value to all of the pixel positions of the standard image coordinate system. In this way, the secondary distance image is acquired.
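The construction of the secondary distance image described above can be sketched as follows. This is a minimal illustrative implementation, not the apparatus itself: the function name, the camera matrices K_range and K_std, and the zero-valued "no measurement" convention are assumptions introduced for the sketch; R and T are the stored rotational and translational matrices of the coordinate transformation from the primary distance image coordinate system to the standard image coordinate system.

```python
import numpy as np

def build_secondary_distance_image(primary, K_range, R, T, K_std, out_shape):
    """Re-project the primary distance image into the standard image
    coordinate system to obtain a secondary distance image (a sketch)."""
    h, w = primary.shape
    secondary = np.zeros(out_shape)            # pixel value 0 = no measurement
    K_range_inv = np.linalg.inv(K_range)
    for v in range(h):
        for u in range(w):
            Ds_prime = primary[v, u]
            if Ds_prime <= 0.0:
                continue                       # no distance measured here
            # ^p' : position of the observation point Ps in the primary
            # distance image coordinate system (back-projected ray * distance)
            ray = K_range_inv @ np.array([u, v, 1.0])
            p_prime = Ds_prime * ray / np.linalg.norm(ray)
            # ^p'' = R ^p' + T : the same point in the standard image
            # coordinate system
            p2 = R @ p_prime + T
            Zs2 = p2[2]
            if Zs2 <= 0.0:
                continue                       # behind the standard camera
            # s'' = (1/Zs'') K_std ^p'' : corresponding pixel position
            s2 = K_std @ (p2 / Zs2)
            us, vs = int(round(s2[0])), int(round(s2[1]))
            if 0 <= vs < out_shape[0] and 0 <= us < out_shape[1]:
                secondary[vs, us] = np.linalg.norm(p2)   # Ds(s) = |^p|
    return secondary
```

As in the text, pixels of the standard image coordinate system that receive no re-projected measurement simply keep no distance value.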
Furthermore, “a standard image” composed of a plurality of pixels having at least the luminance (designated physical quantity) as the pixel value is acquired through the standard image sensor 11 (FIG. 3/STEP04).
Moreover, “a reference image” composed of a plurality of pixels having at least the luminance as the pixel value similar to the standard image is acquired through the reference image sensor 12 (FIG. 3/STEP06).
A sequence of processing explained next is executed with respect to the secondary distance image, the standard image, and the reference image of the same time, which were acquired at the same time and stored in the memory. In a case where the range image sensor 10, the standard image sensor 11, and the reference image sensor 12 are not completely synchronized, images acquired at slightly different but approximately the same times may be treated as the secondary distance image, the standard image, and the reference image of the same time.
First, the foot rear face (the surface of the virtual object) of the robot for the pixel position s of the standard image coordinate system is set as the "virtual face" (FIG. 3/STEP08). Particularly, by setting a standard position (vector) ̂qi of an i-th virtual point on the virtual face, a standard position and a standard posture of the foot rear face in a standard coordinate system are set. In a case where the foot rear face is a planar shape, the standard position ̂qi is defined by expression (001).
̂qi = t(xi, yi, 1) (001)
Here, t denotes transposition. (xi, yi) denotes a pixel position having distance measurement value zi as the pixel value in the secondary distance image coordinate system. The area and the shape of the range defined by the standard position group Q = (̂q1, . . . , ̂qi, . . . , ̂qn) are determined unambiguously according to the distance, in addition to the area and the shape of the foot rear face stored in the storage unit.
On the other hand, the standard position and the standard posture of the foot rear face can be arbitrarily changed. For example, by uniformly adding or subtracting a given value to or from the x component of each standard position ̂qi, the standard position of the foot rear face in the standard coordinate system may be changed by the given value in the x direction. Furthermore, by uniformly multiplying each of the x component and the y component of each standard position ̂qi by factors cos θ and sin θ expressing a rotation about the z axis for a given angle θ, the standard posture of the foot rear face in the xy plane of the standard coordinate system can be changed.
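The uniform shift and the rotation about the z axis described above can be sketched as follows; the function names are illustrative, and Q is assumed to be the 3 x n array whose i-th column is the standard position ̂qi = t(xi, yi, 1).

```python
import numpy as np

def shift_standard_positions(Q, dx=0.0, dy=0.0):
    """Uniformly add given values to the x and y components of every ^q_i,
    changing the standard position of the foot rear face."""
    Q2 = Q.copy()
    Q2[0, :] += dx
    Q2[1, :] += dy
    return Q2

def rotate_standard_positions(Q, theta):
    """Rotate every ^q_i about the z axis by the given angle theta,
    changing the standard posture in the xy plane."""
    c, s = np.cos(theta), np.sin(theta)
    Rz = np.array([[c, -s, 0.0],
                   [s,  c, 0.0],
                   [0.0, 0.0, 1.0]])
    return Rz @ Q
```

The homogeneous third component of each ̂qi is left unchanged by both operations.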
In addition, a plane parameter ̂m(s)=t(m1, m2, m3) expressing the position and the posture of the foot rear face in the standard coordinate system is set. The initial value of the plane parameter ̂m(s) may be arbitrary, however, as will be described later, a current value of the plane parameter ̂m(s) is set by correcting a previous value.
A three-dimensional orthogonal coordinate system having a direction orthogonal with respect to the secondary distance image coordinate system (two dimensional) as the z direction is adopted as the standard coordinate system (refer to
As the standard coordinate system, the robot coordinate system or the world coordinate system may be used. For instance, when the range image sensor 10 is mounted on the robot, the position and the posture of the distance image coordinate system with respect to the robot coordinate system is calculated in accordance with a forward kinematic model representing the behavior of the robot, and then stored in the storage device. The position and posture of the distance image coordinate system with reference to the robot coordinate system is defined by a translational matrix and a rotational matrix or a quaternion equivalent thereto.
In the robot coordinate system, the mass center (e.g., included in a body) of the robot is defined as the origin, the upward of the robot is defined as +x direction, the right direction is defined as +y direction, and the forward is defined as +z direction (refer to
A coordinate value in the three-dimensional orthogonal coordinate system with reference to the secondary distance image coordinate system undergoes coordinate transformation using the matrix stored in the storage device, whereby a coordinate value in the robot coordinate system can be calculated. A coordinate value in the three-dimensional orthogonal coordinate system with reference to the secondary distance image coordinate system undergoes coordinate transformation using the matrix and the action plan stored in the storage device, whereby a coordinate value in the world coordinate system can be calculated.
The plane parameter ̂m is defined according to expression (002) based on the shape Q of the foot rear face, the actual point coordinate value group Z = t(z1, . . . , zi, . . . , zn), and the n-th unit matrix I.
̂m = (QItQ)⁻¹QIZ (002)
Then, each component of vector ̂Z calculated according to expression (003) is assumed as i-th virtual point coordinate value Pc(s) (to be precise, the z component Zc(s) thereof) (refer to
̂Z = tQ̂m (003).
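Expressions (002) and (003) amount to a least-squares plane fit. The following is a minimal numpy sketch under the assumption that Q is the 3 x n matrix whose i-th column is ̂qi = t(xi, yi, 1); with I the unit matrix, (QItQ)⁻¹QIZ reduces to solving the normal equations. The function names and the sample data are illustrative, not taken from the source.

```python
import numpy as np

def fit_plane_parameter(Q, Z):
    """Expression (002): ^m = (Q I tQ)^-1 Q I Z; the n-th unit matrix I
    drops out, leaving the ordinary least-squares normal equations."""
    return np.linalg.solve(Q @ Q.T, Q @ Z)

def virtual_point_depths(Q, m):
    """Expression (003): ^Z = tQ ^m, the z components Zc(s) of the
    virtual points Pc(s)."""
    return Q.T @ m

# usage: points lying exactly on the plane z = 2x + 3y + 1 recover
# the plane parameter ^m = t(2, 3, 1)
Q = np.array([[0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 1.0],
              [1.0, 1.0, 1.0, 1.0]])
Z = np.array([1.0, 3.0, 4.0, 6.0])
m = fit_plane_parameter(Q, Z)
```

For actual points that do not lie on a common plane, the same expressions give the best-fit plane in the least-squares sense.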
The surface shape of the virtual object can be changed, by a user of the apparatus of the present invention, from a plane to a curved face of any shape. For instance, a curved face parameter m = t(m11, m12, m21, m22, m3) is set, whereby the shape of the virtual object may be set so as to have a curved face represented by expression (022) as the surface thereof:
m11x² + m12x + m21y² + m22y + m3z = α (022)
In this case, for instance, a contact state between a palm whose surface shape is represented by the curved face parameter or the expression (022) and an object such as a handrail can be estimated.
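Given the curved face parameter of expression (022), the z coordinate of the virtual face at a point (x, y) follows by solving the expression for z, assuming m3 is nonzero. This small sketch is illustrative; the function name and the constant α as an explicit argument are assumptions.

```python
def curved_face_z(m, alpha, x, y):
    """Solve expression (022), m11*x^2 + m12*x + m21*y^2 + m22*y + m3*z = alpha,
    for the z coordinate of the curved virtual face (m3 != 0 assumed)."""
    m11, m12, m21, m22, m3 = m
    return (alpha - m11 * x**2 - m12 * x - m21 * y**2 - m22 * y) / m3
```

Setting m11 and m21 to zero recovers a plane, consistent with the planar case of expression (001) and the plane parameter of expression (002).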
The first processing element 21 calculates a first coefficient w1(s) in accordance with a first coefficient function w1(e1) based on a first deviation e1(s) determined according to the length of the interval |Ds(s)−Dc(s)| between the actual point Ps(s) and the virtual point Pc(s) (refer to FIG. 3/STEP12).
For example, distance residual |Zs(s)−Zc(s)| between an actual point distance Zs(s) and a virtual point distance Zc(s) is used as the first deviation e1(s) (refer to
"Actual point" means the point Ps(s) = Zs(s)̂s, the real space position of which is determined according to the pixel value Ds(s) of the pixel position s = (u, v) in the secondary distance image coordinate system (corresponding to the target region which is a part of the standard image coordinate system) (refer to
Various distances defined unambiguously from a geometrical relation according to the size of the distance residual |Ds(s)−Dc(s)| may be used as the first deviation e1(s). For instance, instead of the interval |Ds(s)−Dc(s)| itself, an interval between a point as a result of projecting the actual point Ps(s) to the virtual face in the Z direction and the virtual point Pc(s) in a designated direction may be used as the first deviation e1(s). Moreover, an interval between a point as a result of projecting the virtual point Pc(s) on a plane corresponding to the object to be imaged in the z direction and the actual point Ps(s) in the designated direction may be used as the first deviation e1(s).
The first coefficient w1(e1(s)) is calculated according to the first coefficient function w1(e1) defined by expression (101), having the first deviation e1 as the main variable. This first coefficient function w1(e1) is an increasing function of the first deviation e1 as shown in
w1(e1) = log(1+exp(αe1−β)), (α>0, β>0) (101)
The first coefficient w1(e1(s)) may be calculated according to the first coefficient function w1(e1) defined by expression (102). This first coefficient function w1(e1) has a value range of 0 or more and less than a positive designated value ε1 in the negative definition domain, while having a value range of the positive designated value ε1 or more in the definition domain of 0 or more, as shown in
w1(e1) = 0 (if e1≦−ε2), or w1(e1) = (ε1/ε2)e1+ε1 (if −ε2<e1) (102)
The first coefficient w1(e1(s)) may be calculated according to the first coefficient function w1(e1) defined by expression (103). This first coefficient function w1(e1) is 0 in the definition domain less than 0, while being an increasing function in the definition domain equal to or more than 0, as shown in
w1(e1) = 0 (if e1≦0), or w1(e1) = ε0e1, (0<ε0) (if 0<e1) (103)
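The three coefficient functions of expressions (101) to (103) can be sketched as follows. This is an illustrative implementation: the function names are assumptions, and α, β, ε0, ε1, ε2 are the positive design constants of the expressions with arbitrary example defaults. In each variant, a positive first deviation e1 (a virtual point inside the actual object) yields a larger coefficient than a negative one of the same magnitude.

```python
import numpy as np

def w1_smooth(e1, alpha=1.0, beta=1.0):
    """Expression (101): a smooth increasing function of e1,
    asymmetric about 0 for beta > 0."""
    return np.log(1.0 + np.exp(alpha * e1 - beta))

def w1_piecewise(e1, eps1=1.0, eps2=1.0):
    """Expression (102): 0 for e1 <= -eps2, then linear, passing
    through (0, eps1)."""
    return np.where(e1 <= -eps2, 0.0, (eps1 / eps2) * e1 + eps1)

def w1_rectified(e1, eps0=1.0):
    """Expression (103): 0 in the negative domain, linear with slope
    eps0 in the positive domain."""
    return np.where(e1 <= 0.0, 0.0, eps0 * e1)
```

With (102) or (103), pixels whose virtual points lie outside the actual object (or sufficiently far outside it) contribute a coefficient of 0, which is what reduces the search load as described for the apparatus.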
The first processing element 21 calculates a first cost E1 in accordance with a first cost function E1(e1) (which corresponds to “a first cost function” of the present invention) on the basis of the first coefficient w1(e1(s)) and the first deviation e1(s) (FIG. 3/STEP12).
The first cost function E1(e1) is defined as a product of the first coefficient function w1(e1) and the square of the absolute value |e1| of the first deviation, for example, indicated by expression (110). Therefore, the first cost E1 corresponds to a total of elastic energy of a virtual spring group shown in
E1(e1) = w1(e1)|e1|² (110)
The first coefficient w1 is calculated according to the first coefficient function w1(e1) based on the first deviation e1 (refer to FIG. 3/STEP10), and apart from this, the value of the power (square) of the absolute value |e1| of the first deviation may be separately calculated, and then such calculation results may be multiplied for the calculation of the first cost E1 (refer to FIG. 3/STEP12). Alternatively, the first cost E1 may be calculated according to the first cost function E1(e1) based on the first deviation e1.
The first cost E1 is calculated according to expression (120), which shows the sum of the values of the first cost function E1(e1) for the pixels s belonging to the target region (secondary distance image coordinate system) in the standard image coordinate system. Σs denotes the sum over the pixels s belonging to the target region (secondary distance image coordinate system) in the standard image coordinate system.
E1 = Σs E1(e1(s)) (120)
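Expressions (110) and (120) can be sketched together as follows; the names are illustrative, and the inline coefficient function is expression (103) with ε0 = 1, purely as an example.

```python
import numpy as np

def w1(e1):
    """Expression (103) with eps0 = 1: zero outside the actual object,
    linear inside it (an example choice of coefficient function)."""
    return np.where(e1 <= 0.0, 0.0, e1)

def first_cost(e1, w1):
    """Expressions (110) and (120): E1 = sum over pixels s in the target
    region of w1(e1(s)) * |e1(s)|^2, i.e. the total elastic energy of the
    virtual spring group with spring coefficient w1(e1(s))."""
    return float(np.sum(w1(e1) * np.abs(e1) ** 2))
```

Because w1 grows with the first deviation, a pixel whose virtual point lies deep inside the actual object dominates the sum, so the total cost penalizes infeasible interpenetrating configurations.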
Since the first deviation e1(s) is a function having the plane parameter ̂m(s) of the virtual face as a variable, the first cost E1 calculated based on the first deviation e1(s) is a function having the plane parameter ̂m(s) of the virtual face as the variable.
In a case where the first coefficient function w1(e1) is defined according to expression (101) (refer to
In a case where the first coefficient function w1(e1) is defined according to expression (102) (refer to
In a case where the first coefficient function w1(e1) is defined according to expression (103) (refer to
The second processing element 22 generates a transformed image by a coordinate transformation of the reference image to the standard image coordinate system under the assumption that each of the standard image sensor 11 and the reference image sensor 12 is imaging the same virtual face (FIG. 3/STEP14). Particularly, an image in which the luminance of a pixel position Sref of the reference image coordinate system, in a case where a parallax according to the position and the posture of the virtual face exists, is allocated with respect to the pixel position s of the standard image coordinate system is obtained as the transformed image.
The second processing element 22 calculates a second cost E2 in accordance with a second cost function E2(e1, e2) on the basis of a second coefficient w2(e1(s)) and a second deviation e2(s) (FIG. 3/STEP16). A deviation of the designated physical quantity which is the pixel value of the same pixel position s in the standard image coordinate system and the transformed image coordinate system, for example, the luminance residual ΔI(s), is used as the second deviation e2(s). A physical quantity other than the luminance obtained through the visible light camera, such as a color (RGB value) obtained through the visible light camera or a temperature obtained through an infrared light camera or the like, may be used as the designated physical quantity.
The second cost function E2(e1, e2) is defined as a product of the second coefficient function w2(e1) and the square of the absolute value |e2| of the second deviation, for example as shown by expression (210). Therefore, the second cost E2 corresponds to the elastic energy of a virtual spring, having the second coefficient w2(e1) as its spring coefficient, which draws the virtual point toward the actual point according to the restraint condition that "the designated physical quantity (in this case, luminance) of the same location imaged by each of the standard image sensor 11 and the reference image sensor 12 at the same time is the same".
E2(e1, e2)=w2(e1)|e2|² (210)
The second coefficient w2 is calculated according to the second coefficient function w2(e1) based on the first deviation e1 (refer to FIG. 3/STEP10), and apart from this, the value of the power (square) of the absolute value |e2| of the second deviation is separately calculated, and then such calculation results are multiplied for the calculation of the second cost E2 (refer to FIG. 3/STEP16). Alternatively, the second cost E2 may be calculated according to the second cost function E2(e1, e2) based on the first deviation e1 and the second deviation e2.
The second cost E2 is calculated according to expression (220) showing a sum total of the value of the second cost function E2(e1, e2) for pixels s belonging to the target region (secondary distance image coordinate system) in the standard image coordinate system.
E2=ΣsE2(e1(s), e2(s)) (220)
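The cost sums (120) and (220) can be sketched as follows. This is an illustration under stated assumptions: the binary asymmetric weight w_hyp is a hypothetical stand-in for the coefficient functions (101) to (103), whose exact forms are not reproduced here, and a quadratic per-pixel form is assumed for the first cost function.

```python
import numpy as np

def w_hyp(e, w_pos=1.0, w_neg=0.1):
    # Hypothetical coefficient function: penetration (e > 0) is weighted
    # more heavily than free space (e < 0), cf. expressions (101)-(103)
    return np.where(e > 0, w_pos, w_neg)

def first_cost(e1):
    # Expression (120): E1 = Σs E1(e1(s)); a quadratic per-pixel cost
    # E1(e1) = w1(e1)|e1|^2 is assumed here
    return np.sum(w_hyp(e1) * np.abs(e1) ** 2)

def second_cost(e1, e2):
    # Expressions (210) and (220): E2 = Σs w2(e1(s))|e2(s)|^2
    return np.sum(w_hyp(e1) * np.abs(e2) ** 2)
```

Note that the second cost's spring coefficient depends on the first deviation e1, so a pixel whose virtual point penetrates the actual object contributes more to both sums.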
Similar to the first deviation e1(s), since the second deviation e2(s) is a function having the plane parameter ̂m(s) of the virtual face as a variable, the second cost E2 calculated based on the first deviation e1(s) and the second deviation e2(s) is a function having the plane parameter ̂m(s) of the virtual face as the variable.
The second coefficient function w2(e1) may be the same as the first coefficient function w1(e1), or may be different. For example, both of the first coefficient function w1(e1) and the second coefficient function w2(e1) may be defined according to expression (101). The first coefficient function w1(e1) may be defined according to expression (101), while the second coefficient function w2(e1) may be defined according to expression (102). The second deviation e2 may be used as the main variable of the second coefficient function w2 instead of the first deviation e1.
The luminance residual ΔI(s) is obtained according to the aforementioned restraint condition. For simplicity, consider a case in which the standard image sensor 11 and the reference image sensor 12 are of a parallel stereo type, having the same internal parameters and arranged so that the optical axes thereof are parallel to each other.
In a case where the standard image sensor 11 and the reference image sensor 12 are not in a parallel stereo relation, the epipolar line direction is taken into account as in expression (202) below. In the parallel stereo case, the parallax residual Δu is expressed by expression (201).
Δu=L{(1/Zs)−(1/Zc)} (201)
“u” is a coordinate value expressing a position in the lateral direction in the standard image coordinate system (or the reference image coordinate system). In the case of a parallel stereo, the direction of the u axis is parallel to the epipolar line. The luminance residual ΔI between the standard image and the reference image under the assumption that a parallax residual Δu exists is expressed by expression (202), taking into consideration the epipolar restraint condition.
ΔI=(∂I/∂u)·êep Δu (202)
êep is a unit vector denoting the epipolar line direction in the standard image coordinate system, and (∂I/∂u) is a vector denoting the luminance gradient. In the case of a parallel stereo, only the luminance gradient in the u axis direction is effective.
Expression (202) shows that the parallax residual Δu(s) is converted to the luminance residual ΔI(s) according to the restraint condition. The parallax residual Δu(s) is the interval between the position obtained by projecting the pixel s of the standard image coordinate system (secondary distance image coordinate system) onto the reference image coordinate system according to the distance measurement value Ds, and the position obtained by projecting the pixel s of the standard image coordinate system onto the reference image coordinate system according to the distance candidate value Dc.
Instead of the luminance residual ΔI(s), the parallax residual Δu(s) calculated by converting the luminance residual ΔI(s) according to expression (202), or the distance residual ΔZ(s) calculated by converting the luminance residual ΔI(s) according to expressions (202) and (203), may be used as the second deviation e2(s). Expression (203) is obtained by transforming expression (201) according to the relations Δu=(du/dZc)ΔZ and ΔZ=Zs−Zc.
Δu(s)=−(L/Zc²)ΔZ(s) (203)
The luminance residual ΔI(s) in a case where the distance residual ΔZ(s)=Zs(s)−Zc(s) exists may be calculated according to expressions (202) and (203).
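The residual conversions of expressions (201) to (203) can be written directly. The function names are hypothetical, and the parallel stereo case is assumed, where only the u-direction luminance gradient is effective.

```python
def parallax_residual(L, Zs, Zc):
    # Expression (201): Δu = L{(1/Zs) - (1/Zc)}
    return L * (1.0 / Zs - 1.0 / Zc)

def parallax_from_distance_residual(L, Zc, dZ):
    # Expression (203): Δu ≈ -(L/Zc²) ΔZ, the linearization of (201)
    # around Zc, with ΔZ = Zs - Zc
    return -(L / Zc ** 2) * dZ

def luminance_residual(grad_u, du):
    # Expression (202), parallel stereo case: ΔI = (∂I/∂u) Δu
    return grad_u * du
```

For small ΔZ the linearized parallax residual (203) approaches the exact value given by (201), as the sample values below illustrate.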
The third processing element 23 calculates a linear sum of the first cost E1 and the second cost E2 expressed by expression (301) or (302) as a total cost E. As described above, since both of the first cost E1 and the second cost E2 are functions having the plane parameter ̂m of the virtual face as the variable, the total cost E is also defined as a function E(̂m) of the plane parameter ̂m.
E(̂m)=E1(̂m)+E2(̂m) (301)
E(̂m)=χE1(̂m)+(1−χ)E2(̂m), (0<χ<1) (302)
Then, the third processing element 23 searches for a plane parameter ̂m which minimizes the total cost E, according to a least-squares method or a gradient method which sequentially changes the plane parameter of the virtual face by an amount according to (∂E(̂m)/∂̂m) (FIG. 3/STEP18). By this, the position and the posture of the virtual face (foot rear face) are estimated.
According to whether or not the plane parameter ̂m satisfies a certain convergence condition, such as the difference between the previous value and the current value of the total cost E being equal to or less than a threshold value, it is determined whether or not the search for the virtual face is terminated (FIG. 3/STEP 20). In a case where the determination result is negative (FIG. 3/STEP 20 . . . NO), a current plane parameter ̂m(k+1), obtained by updating the previous plane parameter ̂m(k) (k denotes an index expressing the number of times the plane parameter has been updated) according to the gradient method, is set (FIG. 3/STEP08). Then, the aforementioned sequence of processing is repeated (FIG. 3/STEP 10 to STEP 20).
On the other hand, in a case where the determination result is positive (FIG. 3/STEP 20 . . . YES), the third processing element 23 estimates the position and the posture of the virtual face defined by the plane parameter ̂m at that time point as the position and the posture of the foot rear face, or of a contact face between the foot rear face and the floor face (FIG. 3/STEP22).
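The search loop of STEP08 to STEP20 can be sketched as a gradient descent over the plane parameter. This is a simplified illustration, not the apparatus's implementation: only the first cost is minimized, the virtual face is assumed to be the plane Z = m0·x + m1·y + m2, a central-difference numerical gradient stands in for (∂E(̂m)/∂̂m), and all names are hypothetical.

```python
import numpy as np

def plane_depth(m, xs, ys):
    # Virtual face modeled (as an assumption) as Z = m0*x + m1*y + m2
    return m[0] * xs + m[1] * ys + m[2]

def cost(m, xs, ys, Zs, w_pos=1.0, w_neg=0.1):
    # First deviation: positive when the virtual point lies beyond
    # (inside) the measured surface; the asymmetric weight penalizes
    # penetration more heavily (cf. expressions (101)-(103))
    e1 = plane_depth(m, xs, ys) - Zs
    w = np.where(e1 > 0, w_pos, w_neg)
    return np.sum(w * e1 ** 2)

def search_plane(m0, xs, ys, Zs, step=0.05, tol=1e-10, max_iter=1000):
    """Gradient-method search over the plane parameter (STEP08 to STEP20)."""
    m = np.asarray(m0, dtype=float)
    prev = cost(m, xs, ys, Zs)
    for _ in range(max_iter):
        grad = np.zeros(3)
        for i in range(3):
            d = np.zeros(3)
            d[i] = 1e-6
            grad[i] = (cost(m + d, xs, ys, Zs)
                       - cost(m - d, xs, ys, Zs)) / 2e-6
        m = m - step * grad                 # update m(k) -> m(k+1)
        cur = cost(m, xs, ys, Zs)
        if abs(prev - cur) <= tol:          # convergence condition (STEP20)
            break
        prev = cur
    return m
```

Starting from a plane in front of a flat measured surface, the search raises the plane until it rests on the surface, mimicking the virtual face settling onto the floor.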
According to this, for example, a state in which the foot sole rear face is contacting the floor face can be estimated.
(Effects)
According to the contact state estimating apparatus and method of the present invention, the first deviation e1(s) is defined so that its magnitude is determined according to the magnitude of the distance residual |Zc(s)−Zs(s)| between the actual point Pc and the virtual point Ps.
The second deviation e2 is defined so that its magnitude is determined according to the magnitude of the luminance residual (the difference of the designated physical quantity) between the pixel s of the standard image and the corresponding pixel of the reference image, in a form according to the provisionally set position and posture of the virtual face.
The first cost function E1(e1) is defined as a dependent variable which takes its smallest value or a minimum value in a case where the value of the main variable e1 is 0, and which is an increasing function in the positive definition domain (refer to expression (110)).
The first cost E1 corresponds to the total elastic energy, with the first deviation e1(s) as the deformation amount, of a group of virtual springs having the value of the first coefficient function w1(e1(s)) at each pixel s included in the target region (secondary distance image) of the standard image (one of the images) as the spring coefficient.
In a case where the virtual face is set such that at least a part of the virtual object enters into the actual object, the first deviation e1(s) becomes a positive value at some or all of the pixels s of the target region of the standard image, and the second deviation e2(s) becomes a positive or a negative value other than 0. Therefore, as a virtual point is located deeper inside and farther away from the surface of the actual object under this setting, the first cost E1 and the second cost E2 are evaluated higher, and the total cost E is also evaluated higher.
Accordingly, by searching for the position and the posture of the virtual face so as to bring the total cost E close to its smallest value or a minimum value, a position and a posture of the virtual face in which none of the group of virtual points on the virtual face is inside the actual object and at least a part of the group of virtual points is positioned on the surface of the actual object can be estimated as the contact state of the virtual face with respect to the surface of the actual object.
Accordingly, a situation in which an infeasible state, where a common real space is occupied at the same time by the actual object (floor) which is the imaging target and the other actual object (foot) corresponding to the virtual object, is found by the search as the contact state of the two actual objects can be reliably avoided.
The first coefficient function w1(e1) is defined so that, even if the absolute value of the main variable e1 is the same, the value in the case where the main variable e1 is a positive value is larger than that in the case where the main variable e1 is a negative value (refer to expressions (101) to (103)).
According to this, the first cost E1 is calculated according to the first cost function E1(e1), which is asymmetric between the positive and negative definition domains, and the second cost E2 is calculated according to the second cost function E2(e1, e2), which is likewise asymmetric between the positive and negative definition domains.
Therefore, the position and the posture of the virtual face can be searched for in such a manner that bringing a virtual point located inside the actual object closer to the surface of the actual object is given priority over bringing a virtual point located outside the actual object closer to that surface. Thereby, a situation in which an infeasible state, where a common real space is occupied at the same time by the one actual object which is the imaging target and the other actual object corresponding to the virtual object, is found by the search as the contact state of the two objects can be reliably avoided. Further, a future contact state of the surface of the other actual object with respect to the surface of the one actual object can be estimated precisely.
One or both of the first coefficient function w1(e1) and the second coefficient function w2(e1) may be defined as an increasing function at least in the positive definition domain. According to this, the spring coefficient of the virtual spring becomes larger as the deformation amount of the spring becomes larger. Therefore, as a certain virtual point is located deeper inside and farther away from the surface of the actual object under the set position and posture of the virtual face, both the first cost E1 and the second cost E2, which correspond to the elastic energy of the virtual springs, are evaluated higher, and the total cost E is also evaluated higher.
Therefore, it is possible to search for the position and the posture of the virtual face so as to draw a virtual point located inside the actual object toward the surface of the actual object faster or more strongly. By this, a future contact state of the other actual object with respect to the one actual object can be estimated precisely and at a high speed.
One or both of the first coefficient function w1(e1) and the second coefficient function w2(e1) may be defined as a function which is 0 in the negative definition domain, or in the definition domain less than a designated negative value (refer to expressions (102) and (103)).
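Hypothetical coefficient functions having the stated properties, asymmetric, increasing in the positive domain, and optionally zero in all or part of the negative domain, might look as follows; the exact forms of expressions (101) to (103) are not reproduced here, and these names are illustrative only.

```python
import numpy as np

def w_like_101(e):
    # Larger and increasing for e > 0, small but nonzero for e <= 0
    return np.where(e > 0, 1.0 + e, 0.2)

def w_like_102(e):
    # Zero in the entire negative definition domain
    return np.where(e > 0, 1.0 + e, 0.0)

def w_like_103(e, e0=-0.5):
    # Zero only below a designated negative value e0
    return np.where(e < e0, 0.0, 1.0 + np.maximum(e, 0.0))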
Specifically, an actual point which gives the pixel value Ds of a representative point, such as the center or the gravity center, of an ROI which is a part of the primary distance image coordinate system is adopted as a standard point.
Then, a virtual face is set so that the position of a representative point, such as the center or the gravity center, of the virtual face coincides with the position of the standard point, and a solid encompassing the virtual face is defined as the mask.
Then, according to extrapolation or interpolation processing, the actual points in the other region encompassed by the mask are supplemented. As a result, even if the standard point exists in a lower level portion and an occlusion derived from the level difference is generated in the proximity of the standard point, the contact state of the foot rear face and the floor face can be estimated with high accuracy.
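The supplementation of occluded actual points inside the mask can be sketched as a nearest-valid-value fill along each image row. This crude row-wise extrapolation is a hypothetical stand-in for the extrapolation/interpolation processing described above, and all names are illustrative.

```python
import numpy as np

def fill_mask(depth, mask):
    """Supplement missing actual points (NaN) inside the mask by copying
    the nearest measured value along each image row."""
    out = depth.copy()
    h, w = depth.shape
    for v in range(h):
        for u in range(w):
            if mask[v, u] and np.isnan(depth[v, u]):
                # search outward along the row for the nearest measured value
                for r in range(1, w):
                    if u - r >= 0 and not np.isnan(depth[v, u - r]):
                        out[v, u] = depth[v, u - r]
                        break
                    if u + r < w and not np.isnan(depth[v, u + r]):
                        out[v, u] = depth[v, u + r]
                        break
    return out
```

Occluded pixels near a level difference thus receive depths extrapolated from the nearest visible floor points inside the mask.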
The first coefficient function w1(e1) defined by expression (102) may be employed in this case.
Further, a plurality of combinations of a standard position and a standard posture of the foot rear face and a plane parameter or a curved face parameter may be assumed, and in accordance with each of the combinations, a plurality of future contact states between the actual object and the other actual object corresponding to the virtual object may be estimated.
Priority claim: Japanese Patent Application No. 2012-082013, filed March 2012 (JP, national).