The present invention relates to a technology of an image display device. The present invention claims priority from Japanese Patent Application No. 2014-066268 filed on Mar. 27, 2014, the entire contents of which are hereby incorporated by reference for the designated countries allowing incorporation by reference.
As the background art of this technical field, there is Japanese Patent Laid-open Publication No. 2009-289185 (Patent Literature 1). In this laid-open publication, there is disclosed “an image display device, comprising: designation means for designating a K-number (K: integer of 2 or more) of cameras each partially having a field of view in common; combining means for combining K-number of subject images respectively output from the K-number of cameras designated by the designating means by referring to a weighting assigned to each of the K-number of cameras; determination means for determining, in association with the designation processing by the designation means, whether or not a moving three-dimensional object is present in the field of view in common; first control means for controlling the weighting of the K-number of cameras in a fixed manner when a determination result by the determination means is negative; calculation means for calculating an amount of decrease in a distance to the moving three-dimensional object for each of the K-number of cameras when the determination result by the determination means is positive; and second control means for controlling the weighting of the K-number of cameras based on the amount of decrease calculated by the calculation means”.
[PTL 1] Japanese Patent Laid-open Publication No. 2009-289185
In the weighted combining method according to the technology described above, there is disclosed only a method involving choosing, for an image portion of an obstacle, the weighting of the image from one of the cameras as either 0 or 1 in a binary manner. In the method, when separation processing of the image of the obstacle and a background image fails, a very unnatural combined image is produced. Therefore, there is a need to separate the image of the obstacle and the background image with a very high level of accuracy, and hence the processing amount demands and hardware capability demands are high.
It is an object of the present invention to enable the presence of a three-dimensional object to be easily and accurately detected, and reflected in an overhead view image.
This application includes a plurality of means for solving at least a part of the above-mentioned problem. Examples of those means include the following. In order to solve the above-mentioned problem, according to one embodiment of the present invention, there is provided an image display device, including: a feature quantity detection condition specifying unit configured to specify a condition for detecting a predetermined feature quantity for an overhead view image of each image obtained by photographing a region in common from at least two different viewpoints; a feature quantity detecting unit configured to detect, by using the specified feature quantity detection condition, the predetermined feature quantity for each of the overhead view images of the images obtained by photographing the region in common; a blending ratio specifying unit configured to specify, based on the predetermined feature quantity detected by the feature quantity detecting unit, a blending ratio to be used when blending pixels of the overhead view images of the images obtained by photographing the region in common from the at least two different viewpoints; and an overhead view image combining unit configured to produce and output a combined overhead view image by blending the pixels of the overhead view images of the images obtained by photographing the region in common based on the blending ratio specified by the blending ratio specifying unit.
According to the present invention, the presence of a three-dimensional object can be easily and accurately detected, and reflected in the overhead view image. Objects, configurations, and effects other than those described above become apparent from the following descriptions of embodiments of the present invention.
An example of an image display device 100 to which an embodiment of the present invention is applied, and an image display system 1 including the image display device 100, is now described with reference to the drawings.
The control unit 110 is configured to perform basic control of the image display device 100. For example, the control unit 110 is responsible for performing supervisory functions, e.g., overall power management of the image display device 100, and control and task management of various devices by an operating system. The control unit 110 includes a feature quantity detection condition specifying unit 111, a feature quantity detecting unit 112, a blending ratio specifying unit 113, and an overhead view image combining unit 114. The feature quantity detection condition specifying unit 111 is configured to specify a suitable condition in order to detect a feature quantity of an image. The feature quantity detection condition may be, for example, information for specifying in detail a scanning direction of an image in order to more accurately detect the feature quantity. The feature quantity detection condition is described in more detail later.
The feature quantity detecting unit 112 is configured to detect a predetermined feature quantity relating to the image. Specifically, for example, the feature quantity may be a ratio of the surface area of a three-dimensional object on the screen.
The blending ratio specifying unit 113 is configured to specify a weighting of the data among each of the images to be used when producing an overhead view image by combining a plurality of images obtained by photographing a region in common from different viewpoint positions. Specifically, the blending ratio specifying unit 113 is configured to specify a blending ratio based on, for example, whether or not a correlation can be seen in the feature quantity among each of the images, and whether or not the region in which the feature quantity can be seen in each of the images includes the same position in the region in common.
The overhead view image combining unit 114 is configured to output a combined overhead view image by blending the pixels of a plurality of images obtained by photographing a region in common based on the blending ratios specified by the blending ratio specifying unit 113.
The storage unit 120 includes a feature quantity detection condition storing unit 121 and a blend information storing unit 122.
The feature quantity detection condition storing unit 121 is configured to store a condition to be applied when detecting the feature quantity based on a combination of information for specifying the region to be photographed and a viewpoint position for photographing the region. The feature quantity detection condition storing unit 121 is described in more detail in the description of
The blend information storing unit 122 is configured to store the information for specifying the region to be photographed and information on the blending ratios among the viewpoint positions for photographing the region. The blend information storing unit 122 is described in more detail in the description of
The camera control unit 130 is configured to issue various control instructions, including instructions to start photographing and to finish photographing, to a camera capable of providing images to the image display device 100, and to acquire an image output from the camera. An outline of the configuration of the image display device 100 has been described above.
The image display system 1 includes the image display device 100, a camera group 101, and a display 108. The camera group 101 includes a plurality of cameras, from a first camera to an n-th camera (n is an integer). The image display system 1 is typically configured to photograph images of the vehicle surroundings by a plurality (n-number) of cameras mounted on the vehicle, combine the photographed images from each of the cameras by the image display device 100, and display an overhead view image of the surroundings of the vehicle by the display 108. The camera group 101 is not limited to cameras that are capable of mainly capturing visible light. For example, the camera group 101 may include cameras, such as night vision cameras, that are sensitive to infrared light and are configured to output the captured infrared light as an image.
The image display device 100 includes a decoding unit group 102 including one or a plurality of decoding units, a central processing unit (CPU) 104, a memory 105, an auxiliary storage device 106, and an encoding unit 107. Images transmitted from each of the cameras configuring the camera group 101 are decoded by the decoding unit group 102, which includes a decoding unit corresponding to each of the cameras in the camera group 101, and are then stored in the memory 105 via a bus 103. The photographed images that are from each of the cameras and stored in the memory 105 are combined by the CPU 104, and used to produce an overhead view image of the surroundings of the vehicle. The combined overhead view image is encoded by the encoding unit 107, and reproduced by the display 108.
The feature quantity detection condition specifying unit 111, the feature quantity detecting unit 112, the blending ratio specifying unit 113, the overhead view image combining unit 114, and the camera control unit 130 are realized by the CPU 104. Further, the feature quantity detection condition storing unit 121 and the blend information storing unit 122 are realized by the auxiliary storage device 106 and the memory 105.
In addition, the CPU 104 may be configured to produce images to be used to form a part of the overhead view images by using the images from each camera by performing, for example, correction processing of distortion generated by an optical system and perspective transformation processing on an image obtained by a camera having a field of view that is equal to or wider than a predetermined field of view. The CPU 104 is responsible for the processing for producing an overhead view image of the entire surroundings of the vehicle by performing processing such as cutting, combining, and alpha-blending of those overhead view images.
The CPU 104 is also configured to perform processing for detecting the presence of white lines drawn on the road, obstacles, pedestrians, and the like, and for detecting the size of the surface area of those objects shown in an image by performing various types of image processing on the photographed image data, such as edge extraction, contour extraction, Gaussian processing, noise removal processing, and threshold processing.
The encoding unit 107 is configured to encode the produced overhead view image. The display 108 is configured to display the overhead view image output from the image display device 100. The display 108 is, for example, a liquid crystal display (LCD). However, the display 108 is not limited to this, and may be some other type of display, such as a cathode ray tube (CRT), a liquid crystal on silicon (LCOS) display, an organic light-emitting diode (OLED) display, a holographic optical element, or a projector device. Further, the display 108 may be a flat monitor, a head-up display (HUD), a head-mounted display (HMD), or the like.
In this state, the image photographed by the front camera 201, which is mounted on the vehicle 200 facing in the forward direction that the vehicle travels, and converted into an overhead view image, is a front image 206, and the image photographed by the left-side camera 202 and converted into an overhead view image is a left-side image 207. The front camera 201 and the left-side camera 202 are mounted by tilting at a predetermined angle in the vertically downward direction so as to ensure a diagonally-downward (ground direction) field of view. It is to be understood that, although not shown, the image photographed by the rear camera 203 and converted into an overhead view image is produced as a rear image, and the image photographed by the right-side camera 204 and converted into an overhead view image is produced as a right-side image.
In this example, a leg portion of the object (pedestrian) 205 is included as a front leg image 205a in the front image 206 and as a left-side leg image 205b in the left-side image 207, respectively.
In general, in the processing for producing an overhead view image, correction processing of lens distortion occurring at an image edge portion and perspective transformation for changing the magnification ratio based on the depth distance are performed on the images photographed by the cameras. As a result, a three-dimensional object appears in the overhead view image as if the object has been stretched. In the front image 206 photographed by the front camera 201, the three-dimensional object is displayed extending in the direction of the arrow from the front camera 201, like the front leg image 205a. Similarly, in the left-side image 207 photographed by the left-side camera 202, the three-dimensional object is displayed extending in the direction of the arrow from the left-side camera 202, like the left-side leg image 205b. In other words, the object (pedestrian) 205, which originally is the same object, is displayed as the front leg image 205a and the left-side leg image 205b extending in different directions due to differences in the viewpoint positions of the cameras in the overhead view images. This phenomenon occurs due to the fact that the object (pedestrian) 205 is a three-dimensional object. When the object is not a three-dimensional object, for example, in the case of a flat pattern drawn on the road, such as a white line 208, the object is photographed without any differences in shape in the overhead view images, as shown by white lines 208a and 208b in the overhead view images. Further, the white lines 208a and 208b may be superimposed on each other by aligning their positions.
More specifically, when the same object is photographed from different directions and converted into overhead view images, and a feature indicating that the shape of the object extends in different directions is detected by comparing the two images, a three-dimensional object may be considered to be present. When a feature indicating that there is no difference in the shape of the object in the two images is detected, it may be determined that a roughly flat object is present on the road. Further, the direction in which the shape of the three-dimensional object extends in an overhead view image is determined based on the positional relationship between the camera and the three-dimensional object, as shown by the direction of the arrows extending from the front camera 201 and the left-side camera 202. Therefore, the direction in which the shape of the three-dimensional object extends in the overhead view images can be said to be an important determination condition for determining the presence of a three-dimensional object based on the detected images. In this embodiment, in view of this characteristic, a processing condition to be used when detecting a feature is decided based on the positional relationship between the camera, namely, the viewpoint position, and the three-dimensional object. As a result, it can be said that the extraction accuracy of the three-dimensional object is improved, and that blend processing can be performed that enables an overlapping portion of a plurality of camera images to be seen more easily.
In this case, the front left area 300 is an area in which the images obtained by the front camera 201 and the left-side camera 202 overlap (in the following description, a region photographed in common in such a manner by a plurality of cameras is referred to as an “overlapping area”). Similarly, the front right area 302, the rear left area 305, and the rear right area 307 can also be said to be an overlapping area photographed in common by a plurality of cameras. In
Considering the point that, as described above, a three-dimensional object is represented in an overhead view image extending in a direction that is based on the viewpoint direction, when the ground is flat with no protrusions (three-dimensional objects), it can be said that the images of the area in common are basically identical, and hence can be superimposed on each other.
The white lines 208a and 208b in the overhead view images of
Specifically, the three-dimensional objects (front leg image 205a and left-side leg image 205b) shown in the front three-dimensional object image 401a and the left-side three-dimensional object image 401b, respectively, can be kept as a difference, and the objects 208a and 208b on the road can be removed as portions in common.
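The separation of portions in common from portions that differ can be illustrated with a minimal sketch. It assumes that the two overhead view images are already aligned and are given as small grids of brightness values. The function name difference_mask and the threshold value are hypothetical and are used only for illustration.

```python
def difference_mask(front, left, threshold):
    """Return a 0/1 grid marking pixels whose brightness differs between
    two aligned overhead view images by more than `threshold`."""
    return [
        [1 if abs(f - l) > threshold else 0 for f, l in zip(front_row, left_row)]
        for front_row, left_row in zip(front, left)
    ]


# Toy 3x4 brightness grids (hypothetical data).
# Column 0 is a flat white line that both cameras see identically.
# The remaining bright pixels are a leg that extends in a different
# direction in each overhead view image.
front = [
    [255, 0, 200, 0],
    [255, 0, 200, 0],
    [255, 0, 0, 0],
]
left = [
    [255, 0, 0, 0],
    [255, 0, 0, 200],
    [255, 0, 0, 200],
]

mask = difference_mask(front, left, threshold=50)
for row in mask:
    print(row)
```

In this toy input the white line column is removed as a portion in common, and only the pixels belonging to the differently extending leg images remain in the mask.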
In order to accurately extract the three-dimensional object, the feature quantity detection condition specifying unit 111 is configured to specify a suitable detection condition. More specifically, due to the above-mentioned characteristic, the direction in which a contour of the three-dimensional object extends is based on the direction of the viewpoint position as seen from the overlapping area, and hence it can be said that a detection condition for efficiently increasing extraction accuracy is a detection condition that specifies a suitable contour scanning direction. Therefore, the feature quantity detection condition specifying unit 111 is configured to specify the feature quantity detection condition based on a geometric relationship between the viewpoint position and the region in common. Specifically, the feature quantity detection condition specifying unit 111 is configured to specify the contour scanning direction to be used in feature quantity detection based on the viewpoint position and the direction of the viewpoint position as seen from the region in common.
In general, contour extraction processing is performed by scanning a change amount of elements forming the image, such as brightness, the values of red, green, and blue (RGB), or the values of cyan, magenta, and yellow (CMY), in a predetermined direction (usually, the horizontal pixel direction) of the image. When the scanning direction and the detected object are in an orthogonal state, a high detection accuracy is often obtained. In view of this, the feature quantity detection condition specifying unit 111 is configured to set the detection condition in order to scan in an orthogonal manner to the extension direction of the contour of the three-dimensional object.
Further, the feature quantity detection condition specifying unit 111 is configured to specify a contour scanning direction 402a for the front three-dimensional object image 401a and a contour scanning direction 402b for the left-side three-dimensional object image 401b.
The feature quantity detecting unit 112 is configured to detect contours 403a and 403b by scanning the front three-dimensional object image 401a based on the specified contour scanning direction 402a, and the left-side three-dimensional object image 401b based on the specified contour scanning direction 402b. An outline of the detection processing of the feature quantity has been described above.
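The effect of the scanning direction on contour detection can be illustrated with a minimal sketch. It assumes a grid of brightness values and measures the change amount between each pixel and its neighbor along a given step. The function name scan_edges and the sample image are hypothetical.

```python
def scan_edges(image, step):
    """Sum the absolute brightness changes between each pixel and the
    neighbor one `step` = (dx, dy) away. Larger sums mean that the scan
    crosses more (or stronger) contours."""
    dx, dy = step
    height, width = len(image), len(image[0])
    total = 0
    for y in range(height):
        for x in range(width):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height:
                total += abs(image[ny][nx] - image[y][x])
    return total


# A bright stripe that extends vertically (hypothetical data).
image = [[0, 0, 255, 0, 0] for _ in range(4)]

horizontal = scan_edges(image, (1, 0))  # scan orthogonal to the stripe
vertical = scan_edges(image, (0, 1))    # scan parallel to the stripe
print(horizontal, vertical)  # 2040 0
```

The scan that is orthogonal to the extension direction of the stripe crosses both of its edges in every row, whereas the parallel scan sees no brightness change at all.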
The viewpoint position specifying information 121C is information for specifying the viewpoint position, namely, the position of the camera. The feature quantity detection condition 121D is a condition to be used in order to detect the feature quantity. For example, a condition indicating that an image rotation angle is to be applied as θ (θ is a difference in the angle between the extension direction of the straight line orthogonal to a line segment from the viewpoint position to the representative point of the region and the direction for scanning the change amount of brightness in contour extraction processing) is stored in advance in the feature quantity detection condition 121D.
Strictly speaking, the direction on the image when a three-dimensional object is photographed so as to extend in the front image 400a depends on the position in the area. However, the calculation processing load may be reduced by assuming this direction to be the same and setting the contour scanning direction to be the same.
In this processing, the feature quantity detection condition specifying unit 111 is configured to rotate the front three-dimensional object image 401a by the image rotation angle θ, which is shown in the feature quantity detection condition 121D, about the weighted center 500, which is the representative point of the front three-dimensional object image 401a, as the center of rotation. However, the feature quantity detection condition specifying unit 111 is not limited to this. The feature quantity detection condition specifying unit 111 may also be configured to rotate the processing image itself by an angle decided based on the positional relationship between the camera position and the photographed area, and then detect edges in common.
As described above, the image rotation angle θ is the difference in the angle between the extension direction of the straight line orthogonal to a line segment from the viewpoint position to the representative point of the region and the direction for scanning the change amount in brightness in the contour extraction processing. As a result, the direction (horizontal or vertical) for scanning the change amount in brightness and the contour scanning direction 501 can be made parallel, which allows a high accuracy to be obtained for the contour extraction. The rotation amount may be set to an optimum angle for each camera or for each photography direction. For example, in the case of a rear camera, when scanning in the vertical direction is suitable (e.g., positioning when parking a vehicle in a garage etc.), the image rotation angle θ may be determined so that the contour scanning direction is the vertical direction.
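One possible way to compute the image rotation angle θ from this definition is sketched below. The sketch assumes two-dimensional coordinates on the overhead view image, expresses angles in degrees, and treats lines as undirected, so that the result is folded into the range from -90 to 90 degrees. The function name and the sample coordinates are hypothetical.

```python
import math


def image_rotation_angle(viewpoint, representative_point, scan_angle_deg=0.0):
    """Angle between the line orthogonal to the segment from the viewpoint
    to the representative point and the brightness scanning direction.
    Lines are undirected, so the result is folded into (-90, 90]."""
    dx = representative_point[0] - viewpoint[0]
    dy = representative_point[1] - viewpoint[1]
    segment_deg = math.degrees(math.atan2(dy, dx))
    orthogonal_deg = segment_deg + 90.0
    theta = (orthogonal_deg - scan_angle_deg) % 180.0
    if theta > 90.0:
        theta -= 180.0
    return theta


# Object straight ahead along the y axis: a horizontal scan is already
# orthogonal to the extension direction, so almost no rotation is needed.
print(image_rotation_angle((0.0, 0.0), (0.0, 1.0)))

# Object on the diagonal: the image would be rotated by about 45 degrees
# in magnitude before a horizontal scan.
print(image_rotation_angle((0.0, 0.0), (1.0, 1.0)))
```

The sign convention of the rotation depends on the image coordinate system in use, which is not fixed by this sketch.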
Specifically, in the case of performing the setting based on a front three-dimensional object image 401a obtained by converting an overlapping area photographed by the front camera 201 and converted into an overhead view image, in
The direction in which the three-dimensional object extends is, basically, the extension direction of the line segment connecting the camera and the position of the three-dimensional object in contact with the ground. In the method illustrated in
The method of setting the contour scanning direction is not limited to the examples illustrated in
First, based on a positional relationship between the camera 1 and the overlapping area, the feature quantity detection condition specifying unit 111 decides a processing condition C1 of the overhead view image obtained based on the camera 1 (Step S001). Specifically, the feature quantity detection condition specifying unit 111 refers to the feature quantity detection condition storing unit 121, and reads the feature quantity detection condition 121D that matches the combination of the region specifying information 121A corresponding to the overlapping area and the viewpoint position specifying information 121C corresponding to the mounted position of the camera 1.
Then, based on a positional relationship between the camera 2 and the overlapping area, the feature quantity detection condition specifying unit 111 decides a processing condition C2 of the overhead view image obtained based on the camera 2 (Step S002). Specifically, the feature quantity detection condition specifying unit 111 refers to the feature quantity detection condition storing unit 121, and reads the feature quantity detection condition 121D that matches the combination of the region specifying information 121A corresponding to the overlapping area and the viewpoint position specifying information 121C corresponding to the mounted position of the camera 2.
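The reading of the processing conditions C1 and C2 in Steps S001 and S002 can be pictured as a table lookup keyed by the combination of the region and the viewpoint position. The following is a minimal sketch under that assumption. The key strings, the field name rotation_deg, and the stored angles are all hypothetical and do not reflect actual stored values.

```python
# Hypothetical contents of the feature quantity detection condition
# storing unit 121: (region, viewpoint position) -> detection condition.
DETECTION_CONDITIONS = {
    ("front_left_area", "front_camera"): {"rotation_deg": -45.0},
    ("front_left_area", "left_side_camera"): {"rotation_deg": 45.0},
}


def read_detection_condition(region, viewpoint):
    """Return the condition stored for the given region and viewpoint."""
    key = (region, viewpoint)
    if key not in DETECTION_CONDITIONS:
        raise KeyError("no detection condition stored for %r" % (key,))
    return DETECTION_CONDITIONS[key]


# Step S001 and Step S002 for one overlapping area.
c1 = read_detection_condition("front_left_area", "front_camera")
c2 = read_detection_condition("front_left_area", "left_side_camera")
print(c1, c2)
```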
Then, the feature quantity detecting unit 112 uses the processing condition C1 to detect a three-dimensional object present in the overlapping area of the overhead view image obtained based on the camera 1 (Step S003). Note that, the detected three-dimensional object has an image feature quantity Q1. Specifically, the feature quantity detecting unit 112 specifies the contour of the three-dimensional object by applying the processing condition C1 on the overhead view image obtained based on the camera 1, and scanning in the contour scanning direction under a state satisfying the processing condition C1. During this processing, the feature quantity detecting unit 112 extracts the image feature quantity by performing, for example, contour extraction using a portion having many edges and the like, a Laplacian filter, or a Sobel filter, binarization processing, or various types of pattern recognition processing using color information, histogram information, and the like. Further, the feature quantity detecting unit 112 specifies the image feature quantity Q1, which may be a position of pixels from which an edge or a contour was successfully extracted, a brightness level of the edge, and the like.
Then, the feature quantity detecting unit 112 uses the processing condition C2 to detect a three-dimensional object present in the overlapping area of the overhead view image obtained based on the camera 2 (Step S004). Note that, the detected three-dimensional object has an image feature quantity Q2. Specifically, the feature quantity detecting unit 112 specifies the contour of the three-dimensional object by applying the processing condition C2 on the overhead view image obtained based on the camera 2, and scanning in the contour scanning direction under a state satisfying the processing condition C2. During this processing, the feature quantity detecting unit 112 extracts the image feature quantity by performing, for example, contour extraction using a portion having many edges and the like, a Laplacian filter, or a Sobel filter, binarization processing, or various types of pattern recognition processing using color information, histogram information, and the like. Further, the feature quantity detecting unit 112 specifies the image feature quantity Q2, which may be a position of pixels from which an edge or a contour was successfully extracted, a brightness level of the edge, and the like.
For either of the image feature quantities Q1 and Q2, an image feature quantity obtained by a scale-invariant feature transform (SIFT), a histogram of oriented gradients (HOG), or the like, may be utilized. Further, a selection may be made regarding whether feature information that was successfully extracted by combining a HOG feature quantity and a feature quantity of the shape of the pedestrian is information on a person, such as a pedestrian, or on an inanimate object. Thus, information that is more useful can be presented to a driver by switching contrast enhancement processing or the output content, such as a danger level indication, based on whether the object is a pedestrian or an inanimate object.
Next, the blending ratio specifying unit 113 determines whether or not the image feature quantities Q1 and Q2 have a correlation equal to or stronger than a predetermined level (Step S005). Specifically, the blending ratio specifying unit 113 determines whether or not the pixel positions of the detected object match or are gathered in a given range, and whether or not a feature quantity difference is within a predetermined range. This processing may be performed by determining a correlation of a spatial distance relationship or a semantic distance relationship by performing hitherto-existing statistical processing or clustering processing.
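The determination in Step S005 can be sketched in a simplified form. The sketch assumes that each image feature quantity is given as a non-empty list of (x, y, strength) entries for the pixels from which an edge was extracted, and it substitutes a plain comparison of centroids and mean strengths for the statistical processing or clustering processing mentioned above. The function names and both thresholds are hypothetical.

```python
import math


def centroid(points):
    """Mean (x, y) position of a non-empty list of (x, y, strength)."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def is_strongly_correlated(q1, q2, max_distance, max_strength_diff):
    """True when the detected pixels are gathered in a given range and
    the difference in mean feature strength is within a given range."""
    c1, c2 = centroid(q1), centroid(q2)
    distance = math.hypot(c1[0] - c2[0], c1[1] - c2[1])
    mean1 = sum(p[2] for p in q1) / len(q1)
    mean2 = sum(p[2] for p in q2) / len(q2)
    return distance <= max_distance and abs(mean1 - mean2) <= max_strength_diff


q1 = [(10, 10, 100), (12, 10, 100)]
q2_near = [(11, 10, 105), (11, 12, 95)]
q2_far = [(30, 30, 100), (32, 30, 100)]

print(is_strongly_correlated(q1, q2_near, max_distance=2.0, max_strength_diff=10.0))
print(is_strongly_correlated(q1, q2_far, max_distance=2.0, max_strength_diff=10.0))
```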
When the correlation is equal to or stronger than the predetermined level (“Yes” in Step S005), the blending ratio specifying unit 113 decides that a three-dimensional object is not present in the overlapping area, and hence the overhead view images of the overlapping area are to be combined at the predetermined blending ratios by using the overhead view image obtained based on the camera 1 and the overhead view image obtained based on the camera 2. The blending ratio specifying unit 113 then causes the overhead view image combining unit 114 to combine the overhead view images of the overlapping area (Step S006). In this case, when combining so that the blending ratio for any one of the overhead view images is “0”, this essentially enables the overhead view image obtained based on any one of the camera 1 and the camera 2 to be selected and used. However, when a three-dimensional object is present near a joint of the overhead view images, the image of the three-dimensional object may disappear. As a result, rather than producing the overhead view image by selectively utilizing any one of the overhead view images, it is preferred that the overhead view images be blended and combined based on a predetermined “non-zero” blending ratio.
Further, the overhead view image combining unit 114 weights, based on the blending ratios, information (e.g., brightness information or RGB information) on pixels at positions corresponding to the overhead view image obtained based on the camera 1 and the overhead view image obtained based on the camera 2, and combines the pixel information into one overhead view image. When the combined overhead view image has been produced, the overhead view image combining unit 114 finishes the blending ratio decision processing. The combined overhead view image is then output by transmitting the image to the encoding unit 107 and the display 108.
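The weighting of pixel information based on the blending ratios can be sketched as follows. The sketch assumes single-channel brightness grids of equal size and blending ratios that sum to 1. The function name blend_images is hypothetical.

```python
def blend_images(image1, image2, p1, p2):
    """Weight pixels at corresponding positions by the blending ratios
    p1 and p2 and combine them into one image."""
    return [
        [p1 * a + p2 * b for a, b in zip(row1, row2)]
        for row1, row2 in zip(image1, image2)
    ]


# Hypothetical brightness values for one row of the overlapping area.
camera1_view = [[100, 200]]
camera2_view = [[200, 0]]

print(blend_images(camera1_view, camera2_view, 0.5, 0.5))  # [[150.0, 100.0]]
print(blend_images(camera1_view, camera2_view, 1.0, 0.0))  # [[100.0, 200.0]]
```

The second call shows the case in which one blending ratio is 0, which is equivalent to selecting the overhead view image of one camera.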
When the correlation is not equal to or stronger than the predetermined level (“No” in Step S005), the blending ratio specifying unit 113 specifies the positions in the overlapping area in which the three-dimensional object included in the overhead view image obtained based on the camera 1 and the three-dimensional object included in the overhead view image obtained based on the camera 2 are present, and determines whether or not those three-dimensional objects are at positions that are in common by a predetermined level or more (Step S007). In other words, the blending ratio specifying unit 113 determines whether or not, in the region in common, there is a region in which the feature quantity of the image obtained from each camera overlaps by a predetermined degree or more.
When the three-dimensional objects are in a position in common (“Yes” in Step S007), the blending ratio specifying unit 113 decides the blending ratios based on the image feature quantities Q1 and Q2 (Step S008). Specifically, the blending ratio specifying unit 113, first, performs a predetermined operation on the image feature quantity Q1 obtained based on the camera 1, and the result of the operation is represented by F(Q1). Similarly, a result obtained by performing a predetermined operation on the image feature quantity Q2 obtained based on the camera 2 is represented by F(Q2). Further, based on Expression (1), the blending ratio specifying unit 113 specifies a combining weighting ratio that is based on the image feature quantity Q1 obtained based on the camera 1.
Combining weighting P1=F(Q1)/(F(Q1)+F(Q2)) Expression (1)
Similarly, based on Expression (2), the blending ratio specifying unit 113 specifies a combining weighting ratio that is based on the image feature quantity Q2 obtained based on the camera 2.
Combining weighting P2=F(Q2)/(F(Q1)+F(Q2)) Expression (2)
The above-mentioned predetermined operator F may be an operator for extracting and counting, in the overlapping area, the number of pixels of an image having a feature quantity that is equal to or more than a predetermined threshold. In this case, the size of each of the images of the three-dimensional object in the overlapping area of the overhead view image obtained based on the camera 1 and the overhead view image obtained based on the camera 2 may be used as an element for varying the blending ratio.
Further, the predetermined operator F may also be an operator for calculating a sum, an average, a weighted average, a weighted center, a center value, and the like, of the image feature quantity of the pixels in the overlapping area of the overhead view image obtained based on the camera 1 and the overhead view image obtained based on the camera 2. In this case, not only the size of the image of the three-dimensional objects in the overlapping area, but the magnitude of the value of the feature quantity may also be used as an element for varying the blending ratio.
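As a non-limiting illustration, Expressions (1) and (2) may be sketched as follows with two of the operators F mentioned above (a pixel count above a threshold, and an average). The function names, the sample feature values, and the equal-weight fallback for the case F(Q1)+F(Q2)=0 are assumptions made for this sketch and are not taken from the embodiment.

```python
def count_above(features, threshold):
    """Operator F as a count of pixels whose feature quantity is >= threshold."""
    return sum(1 for q in features if q >= threshold)


def mean_feature(features):
    """Operator F as the average feature quantity over the overlapping area."""
    return sum(features) / len(features) if features else 0.0


def combining_weights(f1, f2):
    """Expressions (1) and (2): P1 = F(Q1)/(F(Q1)+F(Q2)), P2 = F(Q2)/(F(Q1)+F(Q2))."""
    total = f1 + f2
    if total == 0:
        # Assumption: the case F(Q1)+F(Q2)=0 is not described; use equal weights.
        return 0.5, 0.5
    return f1 / total, f2 / total


# Hypothetical feature quantities of the pixels in the overlapping area.
q1 = [0.9, 0.8, 0.7, 0.1]  # overhead view image based on the camera 1
q2 = [0.6, 0.2, 0.1, 0.0]  # overhead view image based on the camera 2

p1, p2 = combining_weights(count_above(q1, 0.5), count_above(q2, 0.5))
print(p1, p2)
```

With the pixel-count operator, only the size of the three-dimensional object image influences the weights; replacing `count_above` with `mean_feature` lets the magnitude of the feature quantity contribute as well.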
The blending ratio may also be decided for each pixel. In this case, the feature quantity per se of the relevant pixel in the overhead view image obtained based on the camera 1 may be used as F(Q1), and the feature quantity per se of the corresponding pixel in the overhead view image obtained based on the camera 2 may be used as F(Q2). The blending ratio may also be decided by comparing F(Q1) and F(Q2) for each pixel, and setting the ratios so that the image having the larger value has the larger blending ratio.
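As a non-limiting illustration, a per-pixel decision of the blending ratio may be sketched as follows, applying Expression (1) with the feature quantity of each pixel used directly as F(Q1) and F(Q2). The function name, the nested-list image layout, and the value 0.5 used where both feature quantities are zero are assumptions made for this sketch.

```python
def per_pixel_ratios(feat1, feat2):
    """Return the per-pixel blending ratio P1 of the camera 1 image (P2 = 1 - P1).

    feat1 and feat2 are same-sized 2-D lists of non-negative feature quantities.
    """
    ratios = []
    for row1, row2 in zip(feat1, feat2):
        out_row = []
        for a, b in zip(row1, row2):
            total = a + b
            # Assumption: where neither image has a feature, blend equally.
            out_row.append(a / total if total > 0 else 0.5)
        ratios.append(out_row)
    return ratios


# Hypothetical 2x2 feature maps of the overlapping area.
feat_cam1 = [[3.0, 0.0], [1.0, 0.0]]
feat_cam2 = [[1.0, 2.0], [1.0, 0.0]]
print(per_pixel_ratios(feat_cam1, feat_cam2))
```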
Further, for example, even when the blending ratio of the overhead view image obtained based on the camera 1 changes continuously with the “ratio of the feature quantity”, the gradient of the change in the blending ratio may be set to be larger for the portion in which the “ratio of the feature quantity” is closer to 0.5. Calculating the blending ratio in this manner enables the contrast of an image that stands out more (an image in which there is a high likelihood of a three-dimensional object being present) to be enhanced, while also allowing the blending ratio to be switched gently when the “ratio of the feature quantity” changes. As a result, there is an effect that an image in which there is a comparatively high likelihood of a three-dimensional object being present can be recognized by the user more easily.
In addition, for example, even when the blending ratio of the overhead view image obtained based on the camera 1 changes continuously with the ratio of the feature quantity, when the ratio of the feature quantity has increased to a predetermined level or more or has decreased to a predetermined level or less, the blending ratio of the overhead view image having the larger feature quantity may be set to 1, and the blending ratio of the other overhead view image may be set to 0. Calculating the blending ratio in this manner enables the contrast of an image that stands out more (an image in which there is a high likelihood of a three-dimensional object being present) to be further enhanced, while also allowing the blending ratio to be switched gently when the ratio of the feature quantity changes. As a result, there is an effect that an image in which there is a comparatively high likelihood of a three-dimensional object being present can be recognized by the user still more easily.
Still further, when the ratio of the feature quantity changes, the blending ratio may be set to be switched in steps. In this case, the switch in the blending ratio becomes gentler as the number of switching steps increases. Thus, even a case in which the change in the blending ratio with respect to the change in the ratio of the feature quantity is not continuous, such as when the blending ratio is switched in steps based on a change in the ratio of the feature quantity, may be an embodiment of the present invention.
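As a non-limiting illustration, the three ways of mapping the ratio of the feature quantity r (from 0 to 1) to the blending ratio of the camera 1 image described above may be sketched as follows. The specific curve shapes (a smoothstep curve, limits of 0.2 and 0.8, and five steps) and the function names are assumptions chosen for this sketch; the embodiment does not fix them.

```python
def steep_center(r):
    """Continuous curve whose gradient is largest near r = 0.5 (smoothstep)."""
    return r * r * (3.0 - 2.0 * r)


def saturating(r, low=0.2, high=0.8):
    """Blending ratio fixed to 0 at or below `low` and to 1 at or above `high`,
    changing linearly in between."""
    if r <= low:
        return 0.0
    if r >= high:
        return 1.0
    return (r - low) / (high - low)


def stepped(r, steps=5):
    """Blending ratio switched in `steps` discrete levels from 0 to 1.
    More steps give a gentler switch."""
    level = min(int(r * steps), steps - 1)
    return level / (steps - 1)


for r in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(r, steep_center(r), saturating(r), stepped(r))
```

In each case the blending ratio of the camera 2 image would be one minus the returned value.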
Note that, regarding the operator F, a case has been described in which the value of the operation result increases for images in which there is a high likelihood that a three-dimensional object is present. However, the opposite may also be performed, that is, the operator F may be an operator for which the value of the operation result decreases for images in which there is a high likelihood that a three-dimensional object is present.
Thus, even for an image of a three-dimensional object portion, multiple values may be used as the blending ratio. As a result, even the three-dimensional object portion may be combined more naturally based on the likelihood that a three-dimensional object is present.
Further, because the blending ratio may be calculated for a whole overlapping area or for pixel units, and the blending ratio may be used in combining processing of the whole overlapping area or of pixel units, the occurrence of unnatural image joints, such as a boundary line, in the overlapping area can be avoided. As a result, a more natural combined image can be produced.
In addition, the blending ratio may be decided based on another method. In this other method, the distances from a pixel position in the front left area 300 to each of the front camera 201, which is the “camera 1”, and the left-side camera 202, which is the “camera 2”, are respectively represented by d1 and d2, and a fixed blending ratio is set based on the ratio between the distance d1 and the distance d2. In other words, the blending ratio of the image from the front camera 201, which is the “camera 1”, may be set larger for a pixel position that is closer to the front camera 201 (i.e., d1<d2). For example, the blending ratio of the image from the front camera 201 may be decided based on the expression “P1=d2/(d1+d2)”, and the blending ratio of the image from the left-side camera 202 may be decided based on the expression “P2=d1/(d1+d2)”.
However, in this case, because there is a high likelihood of increased image blur and distortion at pixel positions that are too close to the camera, it is preferred that the blending ratio for pixel positions that are too close by a predetermined amount or more be corrected so as to increase the weighting of the overhead view image photographed by the camera that is farther away. In other words, when an approach limit threshold is represented by dth (d1 minimum value≦dth≦d1 maximum value), for positions in which d1<d2 and d1<dth, the blending ratio P1 of the overhead view image based on the closer front camera 201 may be corrected so as to be lower. For example, in place of the blending ratios P1 and P2 set as described above, the blending ratio of the image from the front camera 201 may be decided based on the expression “P1=d1/(d1+d2)”. Then, the blending ratio of the image from the left-side camera 202 may be decided based on the expression “P2=d2/(d1+d2)”. As a result, an overhead view image having reduced image blur and distortion, which occur at positions too close to the camera, may be displayed.
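As a non-limiting illustration, the distance-based blending ratio and its correction by the approach limit threshold dth may be sketched as follows. The function name and the sample distances are assumptions made for this sketch. Only the correction for positions too close to the camera 1 is shown, because that is the case described above; a corresponding correction for the camera 2 is not assumed.

```python
def distance_blend(d1, d2, dth):
    """Return (P1, P2) for a pixel at distance d1 from the camera 1 and d2 from
    the camera 2, where dth is the approach limit threshold for the camera 1."""
    total = d1 + d2
    if total <= 0:
        raise ValueError("d1 + d2 must be positive")
    if d1 < d2 and d1 < dth:
        # Too close to the camera 1: P1 = d1/(d1+d2), P2 = d2/(d1+d2),
        # which lowers the weighting of the closer camera.
        return d1 / total, d2 / total
    # Ordinary case: P1 = d2/(d1+d2), P2 = d1/(d1+d2), favoring the closer camera.
    return d2 / total, d1 / total


print(distance_blend(1.0, 3.0, dth=0.5))  # d1 is not below dth
print(distance_blend(1.0, 3.0, dth=2.0))  # d1 is below dth, so P1 is lowered
```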
Next, the overhead view image combining unit 114 performs overhead view image combining including representations to be emphasized, such as highlighting the presence of a three-dimensional object, by using the blending ratios (Step S009). Specifically, the overhead view image combining unit 114 weights, based on the decided blending ratios, information (e.g., brightness information or RGB information) on the pixels at positions corresponding to the overhead view image obtained based on the camera 1 and the overhead view image obtained based on the camera 2, and combines the pixel information into one overhead view image. When the combined overhead view image has been produced, the overhead view image combining unit 114 finishes the blending ratio decision processing. The combined overhead view image is then output by transmitting the image to the display 108.
In this example, the pedestrian leg 1103 and the pedestrian leg 1104 each have a feature quantity in a position 1108 in common. Therefore, the blending ratio “p: (1−p)” between the overhead view image 1101 obtained by the camera 1 and the overhead view image 1102 obtained by the camera 2 is calculated, and based on the calculated blending ratio, the overhead view image combining unit 114 produces a combined overhead view image 1105. As a result, a pedestrian leg 1106 photographed by the camera 1 and a pedestrian leg 1107 photographed by the camera 2 are combined in accordance with their respective blending ratios, and included in combined overhead view image 1105.
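As a non-limiting illustration, the weighting of pixel information in Step S009 with a blending ratio “p: (1−p)” may be sketched as follows for RGB information. The function name, the 8-bit integer RGB representation, and the rounding to the nearest integer are assumptions made for this sketch.

```python
def blend_pixel(rgb1, rgb2, p):
    """Combine corresponding pixels of the two overhead view images.

    rgb1 is the pixel from the camera 1 image (weight p), and rgb2 is the pixel
    from the camera 2 image (weight 1 - p). Channels are 0..255 integers.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return tuple(round(p * a + (1.0 - p) * b) for a, b in zip(rgb1, rgb2))


# Hypothetical pixel values at the same position in the overlapping area.
print(blend_pixel((200, 100, 0), (0, 100, 200), 0.75))
```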
Returning to the description of the processing flow, when it is determined that a three-dimensional object is not present in a position in common (“No” in Step S007), the blending ratio specifying unit 113 decides that the image having the larger feature quantity among the image feature quantities Q1 and Q2 is to be employed for the overhead view image (Step S010).
Next, the overhead view image combining unit 114 performs overhead view image combining by using the employed overhead view image (Step S010). Specifically, the overhead view image combining unit 114 produces a combined overhead view image by employing, of the overhead view image obtained based on the camera 1 and the overhead view image obtained based on the camera 2, the image having the larger feature quantity in the overlapping area. When the overhead view image has been produced, the overhead view image combining unit 114 finishes the blending ratio decision processing. The combined overhead view image is then output by transmitting the image to the display 108. Note that, in order to prevent an image near a joint from disappearing due to an erroneous detection, the combined overhead view image may instead be produced by performing blend processing in which the blending ratio of the camera image from which a feature can be extracted is prioritized.
In this example, the overhead view image combining unit 114 produces a combined overhead view image 1205 by employing the overhead view image 1201 photographed by the camera 1. As a result, the pedestrian leg 1203 photographed by the camera 1 is included in the combined overhead view image 1205.
The processing content of the blending ratio decision processing has been described above. Based on the blending ratio decision processing, a combined overhead view image including a region in common can be produced by applying a feature quantity detection condition on a plurality of pieces of image information, each of the plurality of pieces of image information partially having an image obtained by photographing a region in common from a different viewpoint position, to detect a feature quantity of the region in common, and using the feature quantity of the region in common of each image to specify a weighting for blending an image included in the region in common. In other words, in an overlapping area photographed by a plurality of cameras, a flat pattern drawn on a road and a three-dimensional object can be differentiated by extracting image feature quantities of camera images photographed from different directions, and determining a correlation among the extracted image feature quantities. When a three-dimensional object is present, whether or not the three-dimensional object is present in the overlapping area or is present outside of the overlapping area can be determined by determining a positional overlap of the feature quantities. Further, the blending ratio when overhead view images are combined may be varied in accordance with each of those states, thereby allowing a good overhead view image to be obtained.
The first embodiment has been described above with reference to the drawings. According to the first embodiment, it can be said that the image display device 100 is capable of producing an overhead view image of the entire surroundings of a vehicle by utilizing images photographed by a plurality of cameras to detect obstacles and pedestrians, and capable of, based on the detection results, producing the combined overhead view image from the camera images so that the obstacles and pedestrians are easily seen in the image. In other words, the image display system 1 includes a plurality of image pickup devices each configured to obtain image information partially including an image obtained by photographing a region in common from a different viewpoint position, and an image display device.
The image display device includes a feature quantity detection condition specifying unit configured to specify a feature quantity detection condition to be used as a condition for detecting a predetermined feature quantity relating to image information, a feature quantity detecting unit configured to detect the feature quantity of a region in common by applying the feature quantity detection condition on a plurality of pieces of image information, a blending ratio specifying unit configured to specify a weighting for blending images including the region in common by using the feature quantity of the region in common of each image, and an overhead view image combining unit configured to combine the overhead view images including the region in common by using a blending ratio.
The present invention is not limited to the embodiment described above. The present invention includes various modified examples. For example, the embodiment described above is described in detail in order to facilitate an understanding of the present invention. However, the present invention does not need to include all of the configurations described above. Further, a part of the configurations of a given embodiment may be replaced with the configurations of another embodiment. In addition, the configurations of another embodiment may be added to the configurations of a given embodiment. Still further, other configurations may be added to, deleted from, or replace a part of the configurations of each embodiment.
The image display system 1 according to the present embodiment includes the image display device 100, the camera group 101, and the display 108. However, one or both of the camera group 101 and the display 108 may be configured so as to not be directly managed by the image display system 1. For example, the present invention may be applied in a case in which an overhead view image of a region to be monitored is produced by combining images acquired and transmitted by a plurality of monitoring cameras mounted on positions that are not limited to vehicles (e.g., an exhibit in an art gallery).
In the first embodiment described above, combining is performed by comparing the feature quantities of a plurality of images obtained by photographing a region in common with each other to decide a blending ratio. However, the present invention is not limited to this. For example, in consideration of hysteresis over time, the blending ratio may be gradually changed over time so as to avoid large changes in the blending ratio compared with the previous and subsequent time points.
At a time point t1, the blending ratio for a pedestrian leg 1301 photographed by the camera 1 and the blending ratio for a pedestrian leg 1302 photographed by the camera 2 are decided based on the image feature quantity (e.g., the shown surface area of the legs), and the images are combined by using, for example, P1=0.9 for the image of the pedestrian photographed by the camera 1 and P2=0.1 for the image of the pedestrian photographed by the camera 2. Setting the blending ratios in this manner enables the image shown as having a larger surface area of the pedestrian leg, namely, the leg 1301 photographed by the camera 1, to be crisply displayed.
At a time point t2, the image feature quantity (surface area) of a pedestrian leg 1303 photographed by the camera 1 and the image feature quantity (surface area) of a pedestrian leg 1304 photographed by the camera 2 are about the same, and hence combining is performed by using blending ratios that are about the same, namely, P1=P2=0.5, or P1=0.6 and P2=0.4, for example.
At a time point t3, the image feature quantity of a pedestrian leg 1306 photographed by the camera 2 is slightly more than the image feature quantity of a pedestrian leg 1305 photographed by the camera 1, and hence combining is performed by using a blending ratio of P1=0.3 for the camera 1 image and a blending ratio of P2=0.7 for the camera 2 image.
At a time point t4, the image feature quantity of a pedestrian leg 1308 photographed by the camera 2 is substantially more than the image feature quantity of a pedestrian leg 1307 photographed by the camera 1, and hence combining is performed by using a blending ratio of P1=0.1 for the camera 1 image and a blending ratio of P2=0.9 for the camera 2 image. As a result, the leg 1308 photographed by the camera 2, which is shown as having the larger leg surface area, is crisply displayed.
Thus, based on the invention according to the first embodiment, when the same object is photographed by a plurality of cameras, an image that has a higher contrast for the image shown as having a larger surface area is produced by setting the blending ratios based on a relative ratio of the image feature quantities. In addition, in the processing for deciding the blending ratios, the blending ratio specifying unit 113 may be configured to decide the blending ratios by applying Expression (3).
Blending ratio p1(t)=p1(t−1)+k(p1_calc(t)−p1(t−1)) Expression (3)
In other words, a blending ratio p1(t) of the camera 1 at a time point t can be set by adding, to the blending ratio p1(t−1) at the preceding time point (t−1), k times (k is a number from 0 to 1) the difference between p1_calc(t) and p1(t−1). In Expression (3), the value of p1_calc(t) is the before-correction blending ratio at the time point t calculated based on the feature quantity. More specifically, a blend weighting may be specified for each predetermined period, and the weighting may be performed so that the change amount between the blend weighting of a given predetermined period and the blend weighting of the period before it, the period after it, or both, is a predetermined value or less.
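As a non-limiting illustration, Expression (3) may be sketched as follows. The function names, the value k=0.5, and the sample sequence of before-correction blending ratios are assumptions made for this sketch.

```python
def smooth_ratio(prev_ratio, calc_ratio, k):
    """Expression (3): p1(t) = p1(t-1) + k * (p1_calc(t) - p1(t-1))."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must be between 0 and 1")
    return prev_ratio + k * (calc_ratio - prev_ratio)


def smooth_sequence(initial_ratio, calc_ratios, k):
    """Apply Expression (3) over successive time points."""
    ratios = []
    current = initial_ratio
    for calc in calc_ratios:
        current = smooth_ratio(current, calc, k)
        ratios.append(current)
    return ratios


# Hypothetical before-correction ratios p1_calc(t) after a start at p1 = 0.9.
print(smooth_sequence(0.9, [0.5, 0.3, 0.1], k=0.5))
```

A smaller k makes the blending ratio follow the calculated value more slowly, whereas k=1 applies the calculated value without smoothing and k=0 holds the previous blending ratio.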
The blending ratio may also be decided by predicting the brightness at a future time point, and setting so that the blending ratio is a smooth continuum until the predicted brightness.
Note that, in the first embodiment, at the time point t2, when the image feature quantities are about the same between the images, the blending ratios should be set to be the same, namely, P1=P2=0.5. However, in such a case, there is a possibility that the brightness of the image obtained by combining the two images increases, causing the combined image to be less visible. Therefore, in consideration of hysteresis, the display processing may be performed by prioritizing an image whose blending ratio one time point before was larger. Specifically, in the example illustrated in
In addition, in the case of image information on moving images photographed over a predetermined period, the present invention may also be employed for a method of calculating blending ratios by using motion vectors. In other words, motion vector information on an optical flow is utilized in order to detect image feature quantities, and the blending ratios of the overlapping area are calculated based on the detected image feature quantities to combine the images. The blending ratios are calculated based on the ratio of the sum of the motion vectors by utilizing the motion vectors of a plurality of frames as the feature quantities. Specifically, a sum ΣCam1 of the motion vectors in the image from the camera 1 and a sum ΣCam2 of the motion vectors 1404 in the image from the camera 2 are calculated. The blending ratio P1 of the camera 1 and the blending ratio P2 of the camera 2 are calculated based on Expressions (4) and (5) from the calculated ΣCam1 and ΣCam2.
P1=ΣCam1/(ΣCam1+ΣCam2) Expression (4)
P2=ΣCam2/(ΣCam1+ΣCam2) Expression (5)
In other words, a larger blending ratio is set for a camera image having greater movement. A combined image 1405 including a moving object is produced based on those blending ratios. Based on this method, images with larger movements in the overlapping area can be produced that are crisper and have better contrast.
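As a non-limiting illustration, Expressions (4) and (5) may be sketched as follows. The text does not specify how the motion vectors are summed, so this sketch assumes that the sum is taken over the magnitudes of the vectors; the function names, the sample vectors, and the equal-weight fallback for the case in which no motion is detected are likewise assumptions.

```python
import math


def motion_sum(vectors):
    """Sum of the magnitudes of (dx, dy) motion vectors (assumed definition)."""
    return sum(math.hypot(dx, dy) for dx, dy in vectors)


def motion_blend(vectors_cam1, vectors_cam2):
    """Expressions (4) and (5): P1 = S1/(S1+S2), P2 = S2/(S1+S2)."""
    s1 = motion_sum(vectors_cam1)
    s2 = motion_sum(vectors_cam2)
    total = s1 + s2
    if total == 0:
        # Assumption: with no motion in either image, blend equally.
        return 0.5, 0.5
    return s1 / total, s2 / total


# Hypothetical motion vectors detected in the overlapping area.
cam1_vectors = [(3.0, 4.0), (6.0, 8.0)]
cam2_vectors = [(0.0, 5.0)]
print(motion_blend(cam1_vectors, cam2_vectors))
```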
In
For example, for the first region 300A, the weighted center position is close to the front camera 201 side, and hence the blending ratio of the image from the front camera 201 is P1=0.9, and the blending ratio of the image from the left-side camera 202 is P2=0.1. On the other hand, for the adjacent second region 300B, because the weighted center position is a little further away from the front camera 201, P1=0.8 and P2=0.2. Similarly, the blending ratios for the third to sixth regions are set based on the distance from the front camera 201 and the distance from the left-side camera 202. For the seventh region 300G, because the weighted center position is close to the left-side camera 202, P1=0.1 and P2=0.9. Thus, the blending ratios are set by prioritizing the image from the front camera 201 as the weighted center position is closer to the front camera 201, and prioritizing the image from the left-side camera 202 as the weighted center position is closer to the left-side camera 202. As a result, because for each divided region the images are blended by emphasizing the image from the closer camera, images can be produced that are easier to see. In addition, in each divided region, the blending ratio may be adjusted based on the feature quantity of each camera image.
A part or all of each of the configurations, functions, processing units, processing means, and the like described above may be realized by software for causing a processor to interpret and execute a program for realizing each of those functions. Information on the programs, tables, files, and the like for realizing each function may be stored in a storage device, such as a memory, a hard disk, and a solid-state drive (SSD), or a storage medium, such as an integrated chip (IC) card, a secure digital (SD) card, and a digital versatile disc (DVD).
Further, the control lines and information lines considered to be necessary for the description are illustrated. It is not necessarily the case that all the control lines and information lines necessary for a product are illustrated. In actual practice, almost all the configurations may be considered as being connected to each other.
Further, a part or all of each of the above-mentioned configurations, functions, processing units, and the like may be realized by hardware by, for example, designing those as an integrated circuit. In addition, the technical elements of the above-mentioned embodiments may be applied independently, or may be applied by dividing those elements into a plurality of parts, such as a program portion and a hardware portion.
The present invention has been described above mainly by way of embodiments.
| Number | Date | Country | Kind |
|---|---|---|---|
| 2014-066268 | Mar 2014 | JP | national |

| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/JP2015/050892 | 1/15/2015 | WO | 00 |