The present invention relates to a vehicle surrounding monitoring apparatus which is mounted on a vehicle and which synthesizes a plurality of captured images around the vehicle and provides a synthetic image to a driver.
For example, Patent Literature 1 discloses a conventional vehicle surrounding monitoring apparatus. This apparatus applies processing to images of the vehicle's surroundings captured through wide-angle lenses of a plurality of car-mounted cameras (hereinafter, car-mounted cameras will be simply referred to as “cameras”) arranged such that their image capturing ranges partially overlap. According to this processing, the captured images are converted into overhead images showing an object on the street as seen from the driver's view point or from a view point above the vehicle. The overhead images generated by this conversion are displayed on a monitor and presented to passengers of the vehicle, particularly the driver.
A vehicle surrounding monitoring apparatus of this type generally converts a plurality of images captured by a plurality of cameras attached at different positions into a plurality of overhead images, and synthesizes these overhead images to generate a synthetic image. When overhead images are synthesized, a known image processing method for the overlapping areas determines the brightness of each pixel by weighting and adding the pixels of the respective overhead images. More specifically, the brightness (p) of a target pixel is defined according to the following equations 1 and 2, based on the brightnesses (p1 and p2) and weights (w1 and w2) of the respective overhead images which form the overlapping area.
p=p1×w1+p2×w2 (Equation 1)
w1+w2=1 (Equation 2)
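As an illustration, the following is a minimal sketch of this weighted blending for one pixel of an overlapping area, following equations 1 and 2; the function name and the array-based interface are assumptions and do not appear in the original.

```python
import numpy as np

def blend_pixels(p1, p2, w1):
    """Blend the brightnesses of two overhead images in an overlapping area.

    p1, p2 : brightness values (scalars or NumPy arrays) of the two
             overhead images at the same position
    w1     : weight of the first overhead image; the weight of the second
             is w2 = 1 - w1 (Equation 2)
    """
    w2 = 1.0 - w1
    return p1 * w1 + p2 * w2   # Equation 1
```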
Patent Literature 2 proposes a method of determining the weight of a target pixel according to the distance between a camera and the point corresponding to the target pixel. Hence, in the overlapping area, the image captured by the closer camera is preferentially used, so that an image with little deterioration in quality can be generated.
Technical Problem
However, distortion of a three-dimensional object included in overhead images depends on the view point set to generate the overhead images, the projection plane, and the state (position and orientation) in which the cameras which actually capture images are attached to the vehicle. Therefore, unless these pieces of information are quantified, the magnitude of distortion cannot be taken into account based only on the distance between the camera and the point corresponding to the target pixel. That is, when weighting is performed based on the actual distance between the camera and the point corresponding to the target pixel as disclosed in Patent Literature 2, the roughness of the pixels is taken into account in the weight for synthesis, but the magnitude of distortion of a captured three-dimensional object is not, and pixels of an overhead image including greater distortion may be preferentially used (for example, given a greater weight).
It is therefore an object of the present invention to provide a vehicle surrounding monitoring apparatus which can reduce distortion of a three-dimensional object appearing in a synthetic image when an output image of a monitor is obtained by synthesizing overhead images based on images captured by a plurality of cameras.
A vehicle surrounding monitoring apparatus according to the present invention which is used with a plurality of image capturing sections which capture images of an area around a vehicle, has: an acquiring section which acquires data showing a plurality of images captured by the plurality of image capturing sections; and a synthesizing section which synthesizes a plurality of overhead images generated based on the acquired data, to obtain an output image, and, in overlapping areas of different overhead images corresponding to different image capturing sections, the synthesizing section synthesizes pixels of different overhead images based on a ratio determined according to an angle to look down from the different image capturing sections on a point corresponding to the pixels.
According to the present invention, it is possible to reduce distortion of a three-dimensional object appearing in a synthetic image when an output image of a monitor is obtained by synthesizing overhead images based on images captured by a plurality of cameras.
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
In addition, although four cameras are set with the present embodiment, the number of cameras is by no means limited to four and may be two or more. Further, although cameras are set at the front, back, left and right of the vehicle with the present embodiment, these cameras may be set at arbitrary positions as long as they can adequately capture images around the vehicle (not necessarily the entire surroundings).
Image processing section 2 receives as input the captured image (hereinafter also referred to as “camera image”) from each camera 11, processes these captured images and outputs the processed images to display section 3. This processing involves the arithmetic operations of creating view-point-converted images and of synthesizing illustration images and overhead images.
That is, image processing section 2 has, as an acquiring section, an interface (not illustrated) which is connected to each camera 11 to acquire data showing camera images. Further, the above arithmetic operations in image processing section 2 are realized by a configuration including a computer such as a CPU (Central Processing Unit) which executes a program for the above arithmetic operations, and a storage apparatus which stores information such as the tables required for the above arithmetic operations. The vehicle surrounding monitoring apparatus according to the present embodiment mainly comprises image processing section 2.
For display section 3, a display device such as a liquid crystal display or plasma display is used. The display may also be used in combination with a vehicle-mounted GPS (Global Positioning System) terminal display (a so-called car navigation system display).
Overhead image converting section 21, which is a converting section, performs signal processing of converting camera images taken in from cameras 11 into images which look down on the ground, used as the projection plane, from a specified virtual view point. An overhead image is, for example, an image which looks down vertically on the ground from the virtual view point position. The processing of converting a camera image into an overhead image seen from the virtual view point is performed by referring to mapping table 23. Mapping table 23 defines the correspondence between an input (a pixel coordinate of a camera image) and an output (a pixel coordinate of an overhead image) in advance, and will be described below. When generating an overhead image, overhead image converting section 21 acquires the brightness value of each pixel of the overhead image from the pixels of the corresponding camera image. There is a plurality of camera inputs with the present embodiment, and therefore the overhead image conversion processing is performed separately for each camera image.
Overhead image synthesizing section 22, which is a synthesizing section, performs signal processing of generating one output image by synthesizing the plurality of overhead images generated by overhead image converting section 21, and outputs this output image to display section 3.
Mapping table 23 is a table which describes the correspondence between camera images and overhead images: the pixel coordinates of each camera image and the pixel coordinates of the corresponding overhead image are described as a pair.
Blend ratio table 24 shows in what proportion the pixels of two overlapping overhead images are used to synthesize one output image.
Hereinafter, synthesis of overhead images will be described as an example with reference to the drawings.
With the present embodiment, camera 11d at the front of the vehicle is set at a height h1 from the ground, and camera 11b on the right of the vehicle is set at a height h2 (>h1) from the ground. When cameras 11b and 11d capture images of three-dimensional object 701 of height h3 (<h1) positioned at a point P in vehicle surrounding area 900, three-dimensional object 701 appears as projected three-dimensional objects 911 and 912 projected on the ground, stretched in the respective projection directions. With this example, camera 11d is set at a position lower than camera 11b, and therefore projected three-dimensional object 911 in the captured image of camera 11d is more stretched than projected three-dimensional object 912 in the captured image of camera 11b (L1>L2). Although the length of the three-dimensional object appearing in an image changes when the camera images are converted into overhead images by view point conversion, the relationship between the lengths of projected three-dimensional objects 911 and 912 before conversion generally carries over to the relationship between the lengths of projected three-dimensional objects 711 and 712 after conversion. That is, distortion of three-dimensional object 701 is more significant in the overhead image corresponding to camera 11d, which is set at the lower position.
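To make this stretch relationship concrete, the following is a rough sketch, under the simplifying assumption of a pinhole camera and a flat ground, of the length of the “shadow” that an upright object leaves when projected onto the ground; the function name and the numerical values are hypothetical and only illustrate that the lower camera produces the longer projection.

```python
def projected_length(d, cam_height, obj_height):
    # By similar triangles, an upright object of height obj_height standing
    # at horizontal distance d from the camera is stretched on the ground
    # plane by d * obj_height / (cam_height - obj_height)
    # (valid when obj_height < cam_height).
    return d * obj_height / (cam_height - obj_height)

# Hypothetical values: both cameras 4 m from the object, front camera 11d
# lower (h1 = 0.8 m) than right camera 11b (h2 = 1.2 m), object height 0.5 m.
L1 = projected_length(4.0, 0.8, 0.5)   # ≈ 6.7 m (more stretched)
L2 = projected_length(4.0, 1.2, 0.5)   # ≈ 2.9 m (less stretched)
```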
With the conventional example disclosed in Patent Literature 2, when the distances d1 and d2 between cameras 11d and 11b and the point P are given, the weights w1 and w2 of the overhead images at the point P are given according to the following equations 3 and 4. In addition, the distance d1 can be calculated based on the world coordinate position of camera 11d, the height of camera 11d from the ground and the world coordinate position of the point P. The same applies to the distance d2.
w1=d2/(d1+d2) (Equation 3)
w2=d1/(d1+d2) (Equation 4)
With this conventional example, the weights w1 and w2 are applied according to above equations 3 and 4, so that, of the two overhead images, the pixels of the one whose camera is closer to the point are preferentially utilized. That is, with this conventional example, the weights are derived from the distances. Therefore, a problem arises when the distances d1 and d2 between cameras 11d and 11b and the point P are equal. Generally, the positions at which cameras 11b and 11d are attached are different, so even when the distances d1 and d2 are the same, the heights h1 and h2 differ. Hence, as described above, the sizes (stretches) of projected three-dimensional objects 711 and 712 produced from three-dimensional object 701 positioned at the point P are different. To reduce distortion in the output image, when a synthetic image of the overlapping areas is generated, the image of camera 11b, which is set at the higher position and therefore has the least stretch, should be preferentially used. However, with the conventional example, the weights w1 and w2 are derived from the equal distances d1 and d2, and therefore become equal. When the weights w1 and w2 of the overhead images are determined based on the distances d1 and d2 from cameras 11d and 11b in this way, distortion (stretch) in the projection direction of three-dimensional object 701 cannot be taken into account, and the pixels corresponding to the point P in the two respective overhead images are synthesized using the same weights w1 and w2.
By contrast with this, with the present embodiment, weighting which reduces distortion in the projection direction of the three-dimensional object is performed by taking into account the positions at which the cameras are attached, more particularly, the heights at which they are attached. The weight setting method according to the present embodiment will be described using the drawings.
w1=θ2/(θ1+θ2) (Equation 5)
w2=θ1/(θ1+θ2) (Equation 6)
θ1=arctan(t1/h1) (Equation 7)
θ2=arctan(t2/h2) (Equation 8)
If θ1=θ2 holds, the distortion (stretch in the projection direction) of the three-dimensional object at the point P becomes the same in the two overhead images, so that the weights w1 and w2 at the point P are the same. With this example, θ1>θ2 holds even though the point corresponding to the pixels to be synthesized (point P) is at an equal distance (d1=d2) from cameras 11b and 11d, and the weight w2 of the overhead image corresponding to camera 11b, which looks down on the point P more steeply, becomes greater. That is, when the pixels corresponding to the point P are synthesized, pixels of the overhead image corresponding to camera 11b, which is set at a high position and in which distortion of the three-dimensional object is small, are preferentially utilized.
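The following is a minimal sketch of this angle-based weight calculation, assuming that t1 and t2 in equations 7 and 8 denote the horizontal distances from cameras 11d and 11b to the point P; the function name and the numerical values are hypothetical.

```python
import math

def look_down_weights(t1, h1, t2, h2):
    """Weights for an overlapping pixel from look-down angles (Equations 5-8)."""
    theta1 = math.atan2(t1, h1)        # Equation 7: angle for camera 11d
    theta2 = math.atan2(t2, h2)        # Equation 8: angle for camera 11b
    w1 = theta2 / (theta1 + theta2)    # Equation 5
    w2 = theta1 / (theta1 + theta2)    # Equation 6
    return w1, w2

# With comparable distances to P but a higher mounting position for camera
# 11b (h2 > h1), theta2 < theta1 holds, and the weight w2 becomes the larger
# one, so the less distorted overhead image is preferentially used.
w1, w2 = look_down_weights(t1=3.0, h1=0.8, t2=3.0, h2=1.2)   # w2 > w1
```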
In addition, although only the point P is focused upon in the above example, the conditions (such as distance and angle) at points near the point P resemble the conditions at the point P, so that the same conclusion is reached even when a point near the point P is focused upon.
That is, of projected three-dimensional objects 711 and 712, pixels of projected three-dimensional object 712, which has the smaller stretch, are preferentially used in the overlapping area of the synthetic image.
Next, a case will be described where, in the weight setting method according to the present embodiment, the height of greatest interest is set as a parameter according to the height of the three-dimensional object whose distortion is to be reduced. When the height to focus upon is H, θ1 and θ2 are given according to the following equations 9 and 10.
θ1=arctan(t1/(h1−H)) (Equation 9)
θ2=arctan(t2/(h2−H)) (Equation 10)
Using θ1 and θ2, weighting at the point P is expressed according to equations 5 and 6.
Although the setting of the height H to focus upon depends on the heights at which the cameras are attached, the height H is set to about 50 to 80 cm, and needs to be set lower than the positions at which the cameras are attached. Further, the height H to focus upon is kept constant irrespective of whether or not there is an object at the point P for which the weights are calculated. That is, the weight setting processing according to the present embodiment does not need to detect objects, and is performed uniformly based on the predetermined height H irrespective of whether or not there is an object. When there is no object, that is, when the ground is displayed in an overhead image, the same position on the ground is captured in the overlapping areas, and therefore weighting which assumes the height H to focus upon does not have a negative influence on the synthetic image.
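A sketch of the same weight calculation using the focus height H of equations 9 and 10 could look as follows; the parameter names are assumptions, and the default of 0.6 m is simply a value within the 50 to 80 cm range mentioned above.

```python
import math

def look_down_weights_at_height(t1, h1, t2, h2, focus_h=0.6):
    """Weights per Equations 5-6 using the look-down angles of Equations 9-10.

    focus_h : height H to focus upon; must be lower than both camera
              mounting heights h1 and h2.
    """
    theta1 = math.atan2(t1, h1 - focus_h)   # Equation 9
    theta2 = math.atan2(t2, h2 - focus_h)   # Equation 10
    w1 = theta2 / (theta1 + theta2)         # Equation 5
    w2 = theta1 / (theta1 + theta2)         # Equation 6
    return w1, w2
```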
In step S001, mapping table 23 is referred to in order to acquire the coordinates of the camera image corresponding to coordinates xt and yt of the overhead image. Mapping table 23 holds a list of camera image coordinates associated with the coordinates xt and yt of overhead images, so that the associated camera image coordinates xi and yi can be acquired.
In step S002, the pixel value at coordinates xi and yi of the camera image is acquired and utilized as the pixel value at coordinates xt and yt of the overhead image.
In step S003, it is decided whether or not all pixels required to generate the overhead image have been acquired, and the processing in steps S001 and S002 is repeated until all pixel values of the overhead image are acquired.
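As a rough illustration of steps S001 to S003, the following sketch builds one overhead image from one camera image; the in-memory form of mapping table 23 (a dictionary keyed by overhead coordinates) and all names are assumptions.

```python
def generate_overhead_image(camera_image, mapping_table, height, width):
    """Generate an overhead image from a camera image (steps S001-S003).

    camera_image  : 2-D array of pixel values indexed as [yi][xi]
    mapping_table : dict mapping overhead coordinates (xt, yt) to the
                    corresponding camera-image coordinates (xi, yi)
    """
    overhead = [[0] * width for _ in range(height)]
    for yt in range(height):
        for xt in range(width):
            xi, yi = mapping_table[(xt, yt)]          # step S001
            overhead[yt][xt] = camera_image[yi][xi]   # step S002
    # The double loop ends once every pixel of the overhead image has been
    # filled in, which corresponds to the decision in step S003.
    return overhead
```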
The processing described above is performed separately for each camera image.
Next, the processing of synthesizing an output image from the overhead images will be described.
In step S011, an overhead image which has the pixel value for coordinates xo and yo of the output image to be finally synthesized is selected.
In step S012, it is decided whether or not there is a plurality of overhead images for synthesizing the pixel at coordinates xo and yo of the output image. When there is only one corresponding overhead image, the processing proceeds to step S015. When there are two overhead images, it is decided that this pixel lies in an overlapping area, and the processing proceeds to step S013.
In step S013, the pixel values of the two overhead images corresponding to coordinates xo and yo of the output image are acquired.
In step S014, the weight for synthesizing the pixel values of the overhead images acquired in step S013 is acquired from blend ratio table 24.
In step S015, the pixel at coordinates xo and yo of the output image is synthesized. When there are corresponding pixels in two overhead images, the pixels are synthesized according to equation 1 based on the weight acquired in step S014. When there is only one overhead image, the pixel at coordinates xo and yo of this overhead image is used as is.
In step S016, it is decided whether or not all pixel values required to generate the output image have been acquired, and the processing in steps S011 to S015 is repeated until all pixel values are acquired.
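The following sketch summarizes steps S011 to S016 for one output image; the data structures (a per-pixel list of covering overhead images and a per-pixel weight standing in for blend ratio table 24) are assumed forms chosen only for illustration.

```python
def synthesize_output_image(overhead_images, coverage, blend_ratio_table,
                            height, width):
    """Synthesize the output image from overhead images (steps S011-S016).

    overhead_images   : dict camera_id -> overhead image (2-D array [yo][xo])
    coverage          : dict (xo, yo) -> list of camera_ids whose overhead
                        image covers that output pixel
    blend_ratio_table : dict (xo, yo) -> weight w1 of the first covering
                        image in an overlapping area
    """
    output = [[0] * width for _ in range(height)]
    for yo in range(height):
        for xo in range(width):
            cams = coverage[(xo, yo)]                          # steps S011/S012
            if len(cams) == 1:                                 # no overlap
                output[yo][xo] = overhead_images[cams[0]][yo][xo]       # S015
            else:                                              # overlapping area
                p1 = overhead_images[cams[0]][yo][xo]          # step S013
                p2 = overhead_images[cams[1]][yo][xo]
                w1 = blend_ratio_table[(xo, yo)]               # step S014
                output[yo][xo] = p1 * w1 + p2 * (1.0 - w1)     # S015, Equation 1
    return output   # the loop ends when all pixels are filled (step S016)
```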
By performing the above output image synthesizing processing at, for example, 30 frames per second, the synthesis can be applied to an ordinary moving image.
According to this operation, the present invention can provide a vehicle surrounding monitoring apparatus which can synthesize images with little distortion in the areas in which overhead images overlap.
Hereinafter, Embodiment 2 of the present invention will be described. The basic configuration of the vehicle surrounding monitoring apparatus according to the present embodiment is the same as in Embodiment 1, and therefore the detailed configuration will not be described.
The present embodiment differs from Embodiment 1 in utilizing a three-dimensional space model in processing of converting camera images into overhead images, and this difference will be mainly described.
A conventional three-dimensional space model is illustrated in the drawings as an example. By contrast with this, the present embodiment uses front curved model 1500, in which the projection surface in front of the vehicle is curved, as illustrated in the drawings.
With the present embodiment using the above front curved model 1500, the processing of converting camera images into overhead images includes processing of mapping each camera image onto front curved model 1500, and processing of converting the view point of each mapped image and projecting it again onto the ground (horizontal plane).
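The exact shape of front curved model 1500 is defined by the drawings and is not reproduced here. Purely as a sketch of the idea, the following assumes a hypothetical surface that coincides with the ground near the vehicle and rises parabolically beyond a distance x0 ahead of it, and computes where a viewing ray lands on that surface instead of on the flat ground; every name and parameter is an assumption.

```python
import numpy as np

def hit_ground(cam_pos, ray_dir):
    # Intersection of a downward viewing ray with the ground plane z = 0
    # (the planar projection used behind the vehicle).
    t = -cam_pos[2] / ray_dir[2]
    return cam_pos + t * ray_dir

def hit_front_curved(cam_pos, ray_dir, x0=3.0, r=5.0):
    # Hypothetical front curved surface: z = 0 for x <= x0, and
    # z = (x - x0)^2 / (2 r) for x > x0 (x is the forward direction).
    p = hit_ground(cam_pos, ray_dir)
    if p[0] <= x0:
        return p                            # the ray lands on the flat part
    cx, cz = cam_pos[0], cam_pos[2]
    dx, dz = ray_dir[0], ray_dir[2]
    # Solve cz + t*dz = ((cx + t*dx) - x0)^2 / (2 r), a quadratic in t.
    u = cx - x0
    a = dx * dx / (2.0 * r)
    b = u * dx / r - dz
    c = u * u / (2.0 * r) - cz
    t = (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return cam_pos + t * ray_dir            # landing point on the curve

# A forward-looking, downward ray from a hypothetical front camera lands
# closer to the vehicle on the curved surface than on the flat ground,
# which is consistent with the reduced stretch described below.
p_flat = hit_ground(np.array([0.0, 0.0, 1.0]), np.array([1.0, 0.0, -0.2]))
p_curve = hit_front_curved(np.array([0.0, 0.0, 1.0]), np.array([1.0, 0.0, -0.2]))
```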
When camera 11d captures an image of an area including the point P at which three-dimensional object 701 is positioned, this camera image is mapped onto front curved model 1500 as illustrated in the drawings.
Consequently, with the present embodiment, it is possible to further reduce distortion of a three-dimensional object in overhead images of the front of the vehicle.
Consequently, according to the present embodiment, an output image is formed at the back of the vehicle by planar surface projection, which makes it easy to grasp distances, and an output image is formed at the front of the vehicle by curved surface projection, which makes it easy to check the surroundings. Generally, a synthetic image is displayed when the vehicle is moving backward, for example, into a parking space. Hence, at the back of the vehicle, which is the traveling direction, it is more important to faithfully reproduce the positional relationship between the vehicle and nearby objects than to secure a wide field of view. By contrast with this, at the front or the sides of the vehicle, which are not the traveling direction, the opposite applies. The present embodiment satisfies both of these demands simultaneously.
Embodiments of the present invention have been described above. The above embodiments can be variously changed and implemented. Further, the above embodiments can be adequately combined and implemented.
The disclosure of Japanese Patent Application No. 2009-124877, filed on May 25, 2009, including the specification, drawings and abstract, is incorporated herein by reference in its entirety.
As described above, the vehicle surrounding monitoring apparatus according to the present invention is useful as, for example, a vehicle surrounding monitoring apparatus which is mounted on a vehicle, and which synthesizes a plurality of captured images around the vehicle and provides a synthetic image to, for example, a driver.