HARDWARE-IN-THE-LOOP SIMULATION SYSTEM AND METHOD FOR COMPUTER VISION

Information

  • Patent Application
  • Publication Number: 20080050042
  • Date Filed: November 20, 2006
  • Date Published: February 28, 2008
Abstract
The disclosure relates to a hardware-in-the-loop simulation system and method for computer vision. An embodiment of the disclosed system comprises a software simulation and a hardware simulation. The software simulation includes a virtual scene and an observed object that are generated by virtual reality software, and virtual scene images are obtained at different viewpoints. In the hardware simulation, the virtual scene images are projected onto a screen by a projector, the projected scene images are shot by a camera, and the direction of the camera is controlled by a pan-tilt.
Description

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate an exemplary embodiment of the present invention and, together with the general description given above and the detailed description of the preferred embodiment given below, serve to explain the principles of the present invention.



FIG. 1 illustrates a hardware configuration of a hardware-in-the-loop simulation system according to an embodiment of the present invention;



FIG. 2 illustrates three projection transformation models of a hardware-in-the-loop simulation system for computer vision;



FIG. 3 illustrates an embodiment of a linear pinhole camera model with a camera coordinate system and a world coordinate system;



FIG. 4 is a flowchart of stereo vision simulation processing using the hardware-in-the-loop simulation system in accordance with an embodiment of the present invention;



FIG. 5 illustrates the geometry of the pinhole camera with perspective projection;



FIG. 6 illustrates an initial default state of a virtual camera, i.e. the initial default relationship between the virtual world coordinate system and the virtual camera coordinate system;



FIG. 7 illustrates a calibration pattern of an embodiment.





DETAILED DESCRIPTION

While the claims are not limited to the illustrated embodiments, an appreciation of various aspects of the present invention is best gained through a discussion of various examples thereof. Referring now to the drawings, illustrative embodiments will be described in detail. Although the drawings represent the embodiments, the drawings are not necessarily to scale and certain features may be exaggerated to better illustrate and explain an innovative aspect of an embodiment. Further, the embodiments described herein are not intended to be exhaustive or otherwise limiting or restricting to the precise form and configuration shown in the drawings and disclosed in the following detailed description.


The system is used for hardware-in-the-loop simulation of stereo vision. First, a description will be given of the configuration of a hardware-in-the-loop simulation system for computer vision. FIG. 1 illustrates the hardware configuration of the hardware-in-the-loop simulation system. The system comprises a projector 1, a camera 2, a pan-tilt 3 for the camera 2, a projection screen 4, and computers 5 and 6. The projector 1 is fixed to a ceiling 7 such as, for example, with a hanger. The pan-tilt 3 and the projection screen 4 are fixed to the wall 8 by steady brackets or mounted on the wall 8 directly. The projector 1 may be a common off-the-shelf product and is configured to communicate with the computer 5, in which virtual reality, hereinafter "VR", software is installed for generating virtual 3D scene images. The virtual scene image is projected onto the screen 4. The camera 2 may be an off-the-shelf CCD camera fixed on the pan-tilt 3. The camera 2 is configured to communicate with the computer 6 through a frame grabber board. The computer 6 is also used to control the pan-tilt 3 so that the camera 2 is oriented in a suitable direction and a proper field of view ("FOV") of the camera 2 is selected. After the position of the camera 2 is determined, the pan-tilt 3 should not rotate any more during use of the system. The lens of the camera 2 is selected so that the FOV of the camera 2 matches the size of the image on the projection screen 4 as closely as possible without being larger than the image. The parameters of the system should be calculated and calibrated before the simulation system is used; for example, the camera 2 should be calibrated as in computer vision. While the system is in the parameter calibration state or the working state, the positions and parameters of the projector 1, the camera 2, the pan-tilt 3 and the screen 4 must remain fixed to ensure high simulation precision.


Referring to FIG. 2, in one embodiment, the system comprises three projection transformation models: a virtual camera model M1, a projector imaging model M2 and a camera model M3. The three models are operatively connected in series, so that the output of one model is the input of the next. The virtual camera model M1 is realized by VR software such as, for example, OpenGL. Virtual camera images under various conditions are acquired by setting the corresponding parameters. The virtual camera model is a linear transformation, and the parameters of the virtual camera can be calculated from the function parameters of the VR software. The projector imaging model M2 and the camera model M3 are realized by actual apparatuses; they are nonlinear plane-to-plane projection transformations. The hardware parts of the hardware-in-the-loop simulation system should therefore be calibrated.
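By way of illustration only, the series connection of the three models can be sketched as a composition of projective maps. The following Python/NumPy sketch is not part of the disclosed embodiment; the matrices are placeholder values rather than calibrated parameters, and the real M2 and M3 are nonlinear (lens distortion is ignored here).

```python
import numpy as np

def to_homogeneous(p):
    """Append a 1 to a coordinate vector."""
    return np.append(p, 1.0)

def from_homogeneous(p):
    """Divide by the last coordinate and drop it."""
    return p[:-1] / p[-1]

# M1: virtual camera model (3D world point -> virtual image pixel), a 3x4 projection.
M_v = np.array([[800.0, 0.0, 320.0, 0.0],
                [0.0, 800.0, 240.0, 0.0],
                [0.0, 0.0, 1.0, 0.0]])   # placeholder values

# M2: projector imaging model (projector image -> screen), approximated by a 3x3 homography.
H_p = np.eye(3)   # placeholder

# M3: actual camera model (screen -> camera image), approximated by a 3x3 homography.
H_c = np.eye(3)   # placeholder

# The output of one model is the input of the next:
# world point -> virtual image -> screen -> camera image.
X_w = np.array([1.0, 0.5, 5.0])                                  # a virtual world point
m_virtual = from_homogeneous(M_v @ to_homogeneous(X_w))          # virtual camera pixel
m_screen = from_homogeneous(H_p @ to_homogeneous(m_virtual))     # point on the screen
m_camera = from_homogeneous(H_c @ to_homogeneous(m_screen))      # pixel of the actual camera
print(m_camera)
```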


In this embodiment, the pinhole camera model, on which most computer vision algorithms are based, is used. The virtual camera model is an ideal linear pinhole camera model. Referring to FIG. 3, the linear transformation from the world coordinate system to the camera coordinate system is described by the following equation,










[Equation 1]

$$
s_v \begin{bmatrix} u_v \\ v_v \\ 1 \end{bmatrix}
= \begin{bmatrix} \alpha_{vx} & 0 & u_{v0} & 0 \\ 0 & \alpha_{vy} & v_{v0} & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} R_v & T_v \\ 0^T & 1 \end{bmatrix}
\begin{bmatrix} X_{vw} \\ Y_{vw} \\ Z_{vw} \\ 1 \end{bmatrix}
= M_{1v} M_{2v} \begin{bmatrix} X_{vw} \\ Y_{vw} \\ Z_{vw} \\ 1 \end{bmatrix}
= M_v \begin{bmatrix} X_{vw} \\ Y_{vw} \\ Z_{vw} \\ 1 \end{bmatrix}
\qquad (1)
$$







Where (Xvw, Yvw, Zvw, 1)T is the 3D world homogeneous coordinate, with the subscript v indicating the virtual camera model, (uv, vv, 1)T is the computer image homogeneous coordinate, αvx=fv/dxv and αvy=fv/dyv are the scale factors in the directions of the X and Y axes of the virtual camera image, fv is the effective focal length of the virtual camera, dxv and dyv are the distances between adjacent sensor elements in the X and Y directions respectively, (uv0, vv0) is the coordinate of the principal point, sv is an arbitrary scale factor, and Rv and Tv are the 3×3 rotation matrix and the translation vector which relate the world coordinate system to the camera coordinate system. M1v, called the camera intrinsic matrix, is determined by the intrinsic parameters αvx, αvy, uv0 and vv0, and M2v, called the camera extrinsic matrix, is determined by the extrinsic parameters Rv and Tv. Mv is called the projection matrix, and its elements can be acquired from the parameters of the VR software.
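As a concrete illustration of equation (1), the following Python/NumPy sketch assembles the intrinsic matrix M1v and the extrinsic matrix M2v and projects one world point; all numeric values are assumed for illustration and are not parameters of the disclosed system.

```python
import numpy as np

# Assumed intrinsic parameters of the virtual camera (illustrative values).
alpha_vx, alpha_vy = 800.0, 800.0        # scale factors in X and Y
u_v0, v_v0 = 320.0, 240.0                # principal point

M1_v = np.array([[alpha_vx, 0.0,      u_v0, 0.0],
                 [0.0,      alpha_vy, v_v0, 0.0],
                 [0.0,      0.0,      1.0,  0.0]])   # 3x4 intrinsic matrix

# Assumed extrinsic parameters: rotation R_v and translation T_v (illustrative values).
R_v = np.eye(3)
T_v = np.array([0.0, 0.0, 2.0])

M2_v = np.eye(4)
M2_v[:3, :3] = R_v
M2_v[:3, 3] = T_v                        # 4x4 extrinsic matrix [R_v T_v; 0^T 1]

M_v = M1_v @ M2_v                        # projection matrix of equation (1)

X_vw = np.array([0.1, 0.2, 3.0, 1.0])    # homogeneous world point (X_vw, Y_vw, Z_vw, 1)^T
s_v_m = M_v @ X_vw                       # equals s_v * (u_v, v_v, 1)^T
u_v, v_v = s_v_m[:2] / s_v_m[2]          # divide out the scale factor s_v
print(u_v, v_v)
```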


Like the virtual camera model, the projector imaging model is described by the following equation.









[Equation 2]

$$
s_p \begin{bmatrix} u_p \\ v_p \\ 1 \end{bmatrix}
= \begin{bmatrix} \alpha_{px} & 0 & u_{p0} & 0 \\ 0 & \alpha_{py} & v_{p0} & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} R_p & T_p \\ 0^T & 1 \end{bmatrix}
\begin{bmatrix} X_{pw} \\ Y_{pw} \\ Z_{pw} \\ 1 \end{bmatrix}
= M_p \begin{bmatrix} X_{pw} \\ Y_{pw} \\ Z_{pw} \\ 1 \end{bmatrix}
\qquad (2)
$$







The definitions of the parameters in the above equation (2) are the same as those in equation (1) with the subscript p indicating the projector imaging model. Without loss of generality, it can be assumed that Zpw=0 because of the plane-to-plane projection transformation. So from equation (2), we have:









[Equation 3]

$$
s_p m_p
= \begin{bmatrix} \alpha_{px} & 0 & u_{p0} \\ 0 & \alpha_{py} & v_{p0} \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} r_{1p} & r_{2p} & T_p \end{bmatrix}
\begin{bmatrix} X_{pw} \\ Y_{pw} \\ 1 \end{bmatrix}
= H_p \tilde{X}_{pw}
\qquad (3)
$$







Likewise, the equation of the actual camera model can be deduced through the same process as the projector imaging model. It is described by the following equation.









[Equation 4]

$$
s_c \begin{bmatrix} u_c \\ v_c \\ 1 \end{bmatrix}
= s_c m_c
= \begin{bmatrix} \alpha_{cx} & 0 & u_{c0} \\ 0 & \alpha_{cy} & v_{c0} \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} r_{1c} & r_{2c} & T_c \end{bmatrix}
\begin{bmatrix} X_{cw} \\ Y_{cw} \\ 1 \end{bmatrix}
= H_c \tilde{X}_{cw}
\qquad (4)
$$







The definitions of the parameters in the above equation (4) are the same as those in the equation (3) with the subscript c indicating the camera model.
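Equations (3) and (4) reduce the 3×4 projections to 3×3 plane-to-plane homographies by keeping only the first two rotation columns once Z = 0 on the planar screen. The sketch below (Python/NumPy, illustration only; all parameter values are assumed) builds such a homography from intrinsic parameters, two rotation columns and a translation vector.

```python
import numpy as np

def plane_homography(alpha_x, alpha_y, u0, v0, R, T):
    """Build the 3x3 homography K [r1 r2 T] of equations (3)/(4) for a plane with Z = 0."""
    K = np.array([[alpha_x, 0.0, u0],
                  [0.0, alpha_y, v0],
                  [0.0, 0.0, 1.0]])
    return K @ np.column_stack((R[:, 0], R[:, 1], T))

# Assumed projector and camera parameters (illustrative values only).
R_demo = np.eye(3)
H_p = plane_homography(1000.0, 1000.0, 512.0, 384.0, R_demo, np.array([0.0, 0.0, 3.0]))
H_c = plane_homography(900.0, 900.0, 320.0, 240.0, R_demo, np.array([0.1, 0.0, 2.5]))

# A point (X_pw, Y_pw) on the screen plane maps to image pixels through the homography.
X_tilde = np.array([0.2, 0.1, 1.0])
m = H_p @ X_tilde
u_p, v_p = m[:2] / m[2]
print(u_p, v_p)
```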


The lens distortions of the camera and projector are not considered in equations (3) and (4), but a real camera usually exhibits lens distortion. A nonlinear camera calibration technique should therefore be adopted to calibrate the parameters of the hardware part of the system, and the calibration process is expounded in detail in the following section.


Referring to FIG. 4, in one embodiment, five steps achieve the stereo vision hardware-in-the-loop simulation. First, the virtual object or scene is generated by the VR software, and the virtual scene images are acquired at two or more different viewpoints by setting the following parameters: the vertical FOV angle in degrees θ, the aspect ratio of the width to the height of the viewport (computer screen) Aspect, the distances from the viewpoint to the near and far clipping planes NearPlane and FarPlane, the position of the viewpoint, i.e. the 3D coordinates in the world coordinate system XPos, YPos, ZPos, and the yaw, pitch and roll angles θYaw, θPitch, θRoll (an illustrative sketch of such a parameter set is given below). Second, the virtual scene images are projected onto the screen 4, respectively, by the projector 1. Third, the virtual 3D scene images of the different viewpoints on the screen 4 are respectively shot with the actual camera 2 to acquire the output images of the simulation system. Fourth, the parameters of the virtual camera are calculated, and the actual camera and projector models are calibrated for high-accuracy parameter estimation. Fifth, the coordinates of the virtual objects or scenes in the world coordinate system are obtained from the camera images by using the basic principles and relevant algorithms of stereo vision, in order to confirm the correctness of the simulation and to evaluate the performance of the stereo vision algorithm.
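The sketch below (Python, illustration only) shows one way the per-viewpoint parameters of the first step might be organized; the field names mirror the parameters listed above and do not correspond to an actual VR software interface, and the two viewpoint values are assumed.

```python
from dataclasses import dataclass

@dataclass
class ViewpointParams:
    """Parameters set in the VR software for one virtual viewpoint (names follow the text)."""
    theta_fov_deg: float   # vertical FOV angle in degrees
    aspect: float          # width-to-height ratio of the viewport
    near_plane: float      # distance to the near clipping plane
    far_plane: float       # distance to the far clipping plane
    x_pos: float           # viewpoint position in the world coordinate system
    y_pos: float
    z_pos: float
    theta_yaw: float       # yaw angle in degrees
    theta_pitch: float     # pitch angle in degrees
    theta_roll: float      # roll angle in degrees

# Two assumed viewpoints forming a stereo pair (illustrative values).
left = ViewpointParams(45.0, 4 / 3, 0.1, 100.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
right = ViewpointParams(45.0, 4 / 3, 0.1, 100.0, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0)

for vp in (left, right):
    # In the actual system these parameters drive the VR rendering; here they are only printed.
    print(vp)
```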


The third and fourth steps in FIG. 4 are the most important for the simulation, because the correctness of the simulation result depends on the validity of the system parameters that are calculated and calibrated. The process of acquiring the system parameters is described in detail below.


First, the intrinsic parameters of the virtual camera are calculated from the projection transformation parameters of the VR software, namely the FOV angle θ and the image resolution. Referring to FIG. 5, the following trigonometric equation describes the relationship between the projection transformation parameters that are set and the intrinsic parameters of the virtual camera.









[Equation 5]

$$
\tan\!\left(\frac{\theta}{2}\right)
= \frac{\dfrac{h}{2} \cdot dy_v}{f_v}
= \frac{h/2}{\alpha_{vy}}
\qquad (5)
$$







Where h is the height of the image in the Y direction of the virtual camera image plane, expressed in pixels. The scale factor αvy can be calculated according to equation (5). The scale factors αvx and αvy along the image X and Y axes are equal to each other, and the coordinate of the principal point (uv0, vv0) is the center of the image, since the virtual camera is an ideal pinhole camera. The intrinsic parameter matrix of the virtual camera is thereby determined.
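A short sketch of this intrinsic-parameter calculation follows (Python, illustration only; the FOV angle and image resolution are assumed values). It solves equation (5) for αvy and places the principal point at the image center.

```python
import math

theta_deg = 45.0           # assumed vertical FOV angle of the virtual camera, in degrees
width, height = 640, 480   # assumed virtual image resolution in pixels

# Equation (5): tan(theta/2) = (h/2) / alpha_vy  =>  alpha_vy = (h/2) / tan(theta/2)
alpha_vy = (height / 2.0) / math.tan(math.radians(theta_deg) / 2.0)
alpha_vx = alpha_vy                        # equal scale factors for an ideal pinhole camera
u_v0, v_v0 = width / 2.0, height / 2.0     # principal point at the image center

K_v = [[alpha_vx, 0.0, u_v0],
       [0.0, alpha_vy, v_v0],
       [0.0, 0.0, 1.0]]                    # intrinsic parameter matrix of the virtual camera
print(K_v)
```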


The extrinsic parameters of the virtual camera are calculated through coordinate transformations. The extrinsic parameters Rv and Tv are related to the viewpoint position and the yaw, pitch and roll angles θYaw, θPitch, θRoll used for the model-view transformation of the VR software. The initial default state of the virtual camera coordinate system is illustrated in FIG. 6. Referring to FIG. 6, the virtual world coordinate system is (X, Y, Z) and the virtual camera coordinate system is (Xc0, Yc0, Zc0). The viewpoint of the virtual camera is located at the origin of the world coordinate system, looking along the positive X-axis. The rotation matrix Rv1 from the world coordinate system to the initial default state of the virtual camera coordinate system is therefore determined as follows.







$$
R_{v1} = \begin{bmatrix} 0 & 0 & 1 \\ 0 & -1 & 0 \\ 1 & 0 & 0 \end{bmatrix}
$$





The virtual camera rotates from the initial default state in the order of roll, yaw and pitch, with rotation matrix Rv2. The viewpoint then moves from the origin to the set position, which corresponds to the translation vector Tv, to complete the transformation from the virtual world coordinate system to the virtual camera coordinate system. The rotation matrix Rv2 can be obtained from the Euler angles according to the following equation.









[Equation 6]

$$
R_{v2} = \begin{bmatrix}
\cos\beta\cos\gamma & \cos\beta\sin\gamma & -\sin\beta \\
\sin\alpha\sin\beta\cos\gamma - \cos\alpha\sin\gamma & \sin\alpha\sin\beta\sin\gamma + \cos\alpha\cos\gamma & \sin\alpha\cos\beta \\
\cos\alpha\sin\beta\cos\gamma + \sin\alpha\sin\gamma & \cos\alpha\sin\beta\sin\gamma - \sin\alpha\cos\gamma & \cos\alpha\cos\beta
\end{bmatrix}
\qquad (6)
$$







Where α=θPitch, β=θYaw, γ=θRoll. The rotation matrix is obtained as Rv=Rv2*Rv1. Assuming that the viewpoint coordinates in the world coordinate system are TvW, the translation vector Tv can be calculated by the following equation.


[Equation 7]






$$
T_v = -(R_v \cdot T_{vW})
\qquad (7)
$$


With the above solution procedure, the parameters of the virtual camera are acquired. Subsequently, the parameters of the projector imaging model M2 and the actual camera model M3 are acquired using the following calibration method.
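The extrinsic calculation can be sketched as follows (Python/NumPy, illustration only). Rv2 is composed here from elementary rotation matrices in the stated roll-yaw-pitch order; the sign conventions of that composition should be checked against equation (6) for the VR software in use, and the viewpoint pose values are assumed.

```python
import numpy as np

def rot_x(a):
    """Elementary coordinate-frame rotation about the X axis (pitch)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]])

def rot_y(a):
    """Elementary coordinate-frame rotation about the Y axis (yaw)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]])

def rot_z(a):
    """Elementary coordinate-frame rotation about the Z axis (roll)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])

# Rotation from the world frame to the initial default camera frame (see FIG. 6).
R_v1 = np.array([[0.0, 0.0, 1.0],
                 [0.0, -1.0, 0.0],
                 [1.0, 0.0, 0.0]])

# Assumed viewpoint pose (illustrative values; angles in radians).
alpha, beta, gamma = np.radians([5.0, 30.0, 0.0])   # pitch, yaw, roll
T_vW = np.array([1.0, 2.0, 0.5])                    # viewpoint position in world coordinates

R_v2 = rot_x(alpha) @ rot_y(beta) @ rot_z(gamma)    # roll, then yaw, then pitch
R_v = R_v2 @ R_v1                                   # total rotation, R_v = R_v2 * R_v1
T_v = -(R_v @ T_vW)                                 # equation (7)
print(R_v)
print(T_v)
```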


In the application of this simulation system, there is no need to estimate the projector imaging model M2 and the actual camera model M3 separately, i.e. there is no need to determine the transformation from the projector image to the projection screen and the transformation from the projection screen to the camera image individually. Only the transformation relationship between the projector image and the camera image needs to be determined. The projector imaging model and the actual camera model are therefore regarded as a single module for calibration. A nonlinear camera calibration technique is adopted because of the lens distortions of the camera 2.


A calibration pattern is shown in FIG. 7. The image of the model plane contains a pattern of 4×4 squares, whose corners are used for calibration. The calibration steps are as follows. First, the center of the camera image is determined using the varying-focal-length method presented by Lenz and Tsai (Lenz, R. K., Tsai, R. Y., Techniques for Calibration of the Scale Factor and Image Center for High Accuracy 3-D Machine Vision Metrology, IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 10, Issue 5, September 1988, Pages 713-720). Then, assuming the projector imaging model is a linear transformation, every four corresponding collinear points on the planes of the projector computer image, the projection screen image and the camera image have the property of cross-ratio invariance. The lens distortion calibration is achieved using the approach based on cross-ratio invariance presented by Zhang Guangjun ("Machine Vision", 2005, written by Zhang Guangjun and published by Scientific Publishing Center, Beijing). Finally, based on the center of the camera image and the lens distortion, the coordinate (X, Y) of a pattern corner in the camera coordinate system can be calculated from its image coordinate in pixels (uc, vc). The transformation from the camera image coordinate (X, Y) to the projector computer image coordinate (up, vp) is linear and is described by the following equation.









[Equation 8]

$$
\begin{pmatrix} u_p \\ v_p \\ 1 \end{pmatrix}
= H \cdot \begin{pmatrix} X \\ Y \\ 1 \end{pmatrix}
\qquad (8)
$$







Where H is a 3×3 matrix. Based on the corner coordinates of the calibration pattern, systems of linear equations are set up according to equation (8), and the linear transformation matrix H can be estimated using the least-squares method.
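The least-squares estimation of H can be sketched as follows (Python/NumPy, illustration only). One common linearization is used, fixing the lower-right element of H to 1; the patent states only that the equations derived from (8) are solved by the least-squares method, and the corner correspondences below are made up rather than measured from the 4×4-square pattern.

```python
import numpy as np

def estimate_homography(xy, uv):
    """Least-squares estimate of the 3x3 matrix H mapping (X, Y, 1) to (u_p, v_p, 1).

    xy: Nx2 corner coordinates (X, Y) in the camera coordinate system
    uv: Nx2 corresponding projector computer image coordinates (u_p, v_p)
    """
    A, b = [], []
    for (X, Y), (u, v) in zip(xy, uv):
        # u * (h31*X + h32*Y + 1) = h11*X + h12*Y + h13, and similarly for v.
        A.append([X, Y, 1.0, 0.0, 0.0, 0.0, -u * X, -u * Y]); b.append(u)
        A.append([0.0, 0.0, 0.0, X, Y, 1.0, -v * X, -v * Y]); b.append(v)
    h, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

# Made-up corner correspondences generated from a known homography, for demonstration.
xy = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]], float)
H_true = np.array([[1.2, 0.1, 30.0], [0.0, 1.1, 20.0], [1e-4, 2e-4, 1.0]])
uvw = (H_true @ np.column_stack((xy, np.ones(len(xy)))).T).T
uv = uvw[:, :2] / uvw[:, 2:]

H_est = estimate_homography(xy, uv)
print(np.round(H_est, 4))   # should approximately recover H_true
```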


As described above in detail, all parameters of the hardware-in-the-loop simulation system are acquired. The output images of the simulation system are regarded as the images of a camera whose intrinsic and extrinsic parameters are known from the parameters of the simulation system. The correctness of the system may thus be confirmed, and the performance of stereo vision algorithms such as feature extraction, stereo matching and 3D reconstruction may be evaluated based on the output images and parameters. According to simulation experiments, the 3D reconstruction precision of the hardware-in-the-loop simulation system can reach 1%.
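As one example of how output images with known projection parameters could be used for 3D reconstruction, the following Python/NumPy sketch performs a standard linear (DLT) triangulation of a single point from two views; it is not the patent's algorithm, and the projection matrices and point are assumed values.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two views with known 3x4 projections."""
    A = np.array([x1[0] * P1[2] - P1[0],
                  x1[1] * P1[2] - P1[1],
                  x2[0] * P2[2] - P2[0],
                  x2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Assumed projection matrices of two simulated viewpoints (illustrative values).
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
P1 = K @ np.hstack((np.eye(3), np.zeros((3, 1))))                   # left view
P2 = K @ np.hstack((np.eye(3), np.array([[-0.2], [0.0], [0.0]])))   # right view, 0.2 baseline

X_true = np.array([0.3, -0.1, 4.0, 1.0])   # a world point, for demonstration
x1 = P1 @ X_true; x1 = x1[:2] / x1[2]      # its projection in the left image
x2 = P2 @ X_true; x2 = x2[:2] / x2[2]      # its projection in the right image
print(triangulate(P1, P2, x1, x2))          # should recover approximately (0.3, -0.1, 4.0)
```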


In view of the foregoing, it will be seen that advantageous results are obtained. The hardware-in-the-loop simulation system can be used not only for stereo vision but also in the field of computer vision for the hardware-in-the-loop simulation of camera imaging of virtual objects or scenes. Furthermore, the system described herein can be used in other fields for the hardware-in-the-loop simulation of camera imaging.


Although the details of the present invention have been described above with reference to a specific embodiment, it will be obvious to those skilled in the art that various changes and modifications may be made without departing from the scope and spirit of the invention.


It is intended that the following claims be interpreted to embrace all such variations and modifications.


The foregoing description of various embodiments of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise embodiments disclosed. Numerous modifications or variations are possible in light of the above teachings. The embodiments discussed were chosen and described to provide the best illustration of the principles of the invention and its practical application to thereby enable one of ordinary skill in the art to utilize the invention in various embodiments and with various modifications as are suited to the particular use contemplated. All such modifications and variations are within the scope of the invention as determined by the appended claims when interpreted in accordance with the breadth to which they are fairly, legally, and equitably entitled.

Claims
  • 1. A hardware-in-the-loop simulation system for computer vision comprising: a software simulation that includes a virtual scene and an observed object that are generated by a virtual reality software, wherein virtual scene images of the scene at different viewpoints are obtained; and a hardware simulation that includes the virtual scene images being projected onto a screen by a projector, wherein the projected scene images are shot by a camera, and wherein the direction of the camera is controlled by a pan-tilt.
  • 2. The hardware-in-the-loop simulation system according to claim 1, wherein the camera is fixed on the pan-tilt, the projector and the pan-tilt are fixed to a wall.
  • 3. The hardware-in-the-loop simulation system according to claim 2, wherein when the system is in a calibration state, positions and parameters of the projector, the camera, the pan-tilt and the projection screen are fixed.
  • 4. The hardware-in-the-loop simulation system according to claim 2, wherein the pan-tilt and the projector are fixed directly to the wall.
  • 5. The hardware-in-the-loop simulation system according to claim 1, wherein said projector connects to a first computer and projects the virtual scene image generated by the virtual reality software installed in the computer onto the projection screen; and wherein the camera sends gathered image data to a second computer by a frame grabber board and the second computer is configured to control the pan-tilt to select a proper direction for the camera.
  • 6. A hardware-in-the-loop simulation method for computer vision comprising: generating virtual objects or scenes by virtual reality software; setting parameters for the virtual reality software to acquire the image of the virtual objects or scenes; projecting the image onto a screen; taking the image of the virtual objects or scenes on the screen using an actual camera to acquire the result of a simulation; calculating parameters of a virtual camera, and calibrating models for the actual camera and a projector for a high-accuracy parameter estimation; and achieving computer vision experiments based on the result of the simulation.
  • 7. The hardware-in-the-loop simulation method according to claim 6, wherein the step of setting the parameters for the virtual reality software further comprises: setting a vertical field of view angle in degrees, an aspect ratio of a width to height of a viewport, a distance from the viewpoint to near and far clipping planes, coordinates in a world coordinate system of the viewpoint, and yaw, pitch and roll angles.
  • 8. The hardware-in-the-loop simulation method according to claim 6, wherein the step of calculating the parameters of the virtual camera is relative to the virtual reality software parameters; the intrinsic parameters of the virtual camera are calculated according to parameters of a projection transformation of the virtual reality software, and extrinsic parameters of the virtual camera are calculated according to parameters of a model view transformation of the virtual reality software.
  • 9. The hardware-in-the-loop simulation method according to claim 6, wherein in the step of calibrating the actual camera and projector models, the projector model and the actual camera model are regarded as a whole module for calibration.
  • 10. The hardware-in-the-loop simulation method according to claim 9, wherein a nonlinear camera calibration technique is used for high-accuracy parameter estimation of said whole module of hardware, comprising: determining the center of a camera image using a method of varying focal length; achieving lens distortion calibration using a calibration approach based on cross ratio invariability; calculating a coordinate of a corner of a calibration pattern in a camera coordinate system based on the center of the camera image and the lens distortion, according to an image coordinate in pixel; setting up linear equation groups according to a linear transformation model from the camera coordinate system to a projector computer image coordinate system; wherein a linear transformation matrix is estimated using a least-squares method to achieve the parameter estimation.
Priority Claims (2)
Number Date Country Kind
200610083637.7 May 2006 CN national
200610083639.6 May 2006 CN national