The present invention relates to the field of imaging. More specifically, the present invention relates to improved autofocusing.
A variety of techniques for generating depth maps and autofocusing on objects have been implemented in the past. One method conventionally used in autofocusing devices, such as video cameras, is referred to as a hill-climbing method. The method performs focusing by extracting a high-frequency component from a video signal obtained by an image sensing device such as a CCD and driving a taking lens such that the mountain-like characteristic curve of this high-frequency component reaches its maximum. In another method of autofocusing, the detected intensity of blur width (the width of an edge portion of the object) of a video signal is extracted by a differentiation circuit.
A wide range of optical distance finding apparatus and processes are known. Such apparatus and processes may be characterized as cameras which record distance information, often referred to as depth maps, of three-dimensional spatial scenes. Some conventional two-dimensional range finding cameras record the brightness of objects illuminated by incident or reflected light. The range finding cameras record images and analyze the brightness of the two-dimensional image to determine the distance of the objects from the camera. These cameras and methods have significant drawbacks, as they require controlled lighting conditions and high light intensity discrimination.
Another method involves measuring the error in focus, the focal gradient, and employs the focal gradient to estimate the depth. Such a method is disclosed in the paper entitled “A New Sense for Depth of Field” by Alex P. Pentland published in the Proceedings of the International Joint Conference on Artificial Intelligence, August, 1985 and revised and republished without substantive change in July 1987 in IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume PAMI-9, No. 4. Pentland discusses a method of depth-map recovery which uses a single image of a scene containing edges that are step discontinuities in the focused image. This method requires knowledge of the location of these edges and cannot be used if there are no perfect step edges in the scene.
Other methods of determining distance are based on computing the Fourier transforms of two or more recorded images and then computing the ratio of these Fourier transforms. Computing the two-dimensional Fourier transforms of recorded images is computationally very expensive and requires complex and costly hardware.
Other methods have been implemented by comparing multiple images to determine a depth. One method includes using an image that is in-focus and an image that is out-of-focus where the in-focus value is zero, hence the mathematics are very simple. Another method utilizes two separate images, with different focuses, where the distance between the images is the blur of the first image minus the blur of the second image. Computations are performed to determine the depth, although there are significant drawbacks with the past methods implemented in these computations.
A method of and an apparatus for determining a depth map utilizing a movable lens and an image sensor are described herein. The depth information is acquired by moving the lens a short distance and acquiring multiple images with different blur quantities. An improved method of simulating gaussian blur and an approximation equation that relates a known gaussian blur quantity to a known pillbox quantity are used in conjunction with a non-linear blur difference equation. The blur quantity difference is able to be calculated and then used to determine the depth map. Many applications are possible using the method and system described herein, such as autofocusing, surveillance, robot/computer vision, autonomous vehicle navigation, multi-dimensional imaging and data compression.
In one aspect, an apparatus for determining depth information comprises a lens for acquiring an image signal, an image sensor for receiving the image signal, wherein one or more of the lens and the image sensor are movable in relation to each other and a processing module coupled to the image sensor for determining depth information by using a first blur quantity and a second blur quantity to determine a non-linear blur difference. The lens, the image sensor and the processing module are contained within an imaging device. The imaging device is selected from the group consisting of a camera, a video camera, a camcorder, a digital camera, a cell phone and a PDA. The lens is movable and acquires the image signal at a first position and then at a second position. The processing module simulates gaussian blur. The processing module simulates gaussian blur using iterations of a convolution kernel. The processing module relates a gaussian blur quantity to a pillbox blur quantity.
In another aspect, a method of determining depth information comprises receiving a first image signal with a first blur quantity at an image sensor after the first image signal passes through a lens at a first position, receiving a second image signal with a second blur quantity at the image sensor after the second image signal passes through the lens at a second position, computing a non-linear blur difference using the first blur quantity and the second blur quantity and generating a depth map from the non-linear blur difference. The method is performed within an imaging device. The imaging device is selected from the group consisting of a camera, a video camera, a camcorder, a digital camera, a cell phone and a PDA. The method further comprises simulating gaussian blur. Simulating gaussian blur uses iterations of a convolution kernel. The method further comprises relating a gaussian blur quantity to a pillbox blur quantity.
In another aspect, a method of calculating depth information comprises simulating a gaussian blur quantity using a convolution kernel, relating the gaussian blur quantity to a pillbox blur quantity, computing a non-linear blur difference and generating a depth map from the non-linear blur difference. The method is performed within an imaging device. The imaging device is selected from the group consisting of a camera, a video camera, a camcorder, a digital camera, a cell phone and a PDA. Simulating gaussian blur uses iterations of a convolution kernel.
In yet another aspect, a capture and display device comprises a lens for receiving an image signal, an image sensor for receiving the received image signal, wherein one or more of the lens and the image sensor are movable in relation to each other and a program coupled to the image sensor for determining depth information by using a first blur quantity and a second blur quantity to determine a non-linear blur difference. The lens, the image sensor and the program are contained within an imaging device. The imaging device is selected from the group consisting of a camera, a video camera, a camcorder, a digital camera, a cell phone and a PDA. The lens is movable and acquires the image signal at a first position and then at a second position. The program simulates gaussian blur. The program simulates gaussian blur using iterations of a convolution kernel. The program relates a gaussian blur quantity to a pillbox blur quantity.
In another aspect, an apparatus for determining depth information comprises a lens for receiving an image signal, an image sensor for receiving the received image signal, wherein one or more of the lens and the image sensor are movable in relation to each other and a processing module coupled to the image sensor for determining depth information by using a first blur quantity and a second blur quantity to determine a non-linear blur difference, wherein a gaussian blur is simulated. The lens, the image sensor and the processing module are contained within an imaging device. The imaging device is selected from the group consisting of a camera, a video camera, a camcorder, a digital camera, a cell phone and a PDA. The image sensor is movable and acquires the image signal at a first position and then at a second position. The processing module simulates the gaussian blur using iterations of a convolution kernel. The processing module relates the gaussian blur quantity to a pillbox blur quantity.
A method of and an apparatus for determining a depth map are described herein. The depth information is acquired by moving a lens a short distance and acquiring multiple pictures with different blur quantities or functions. A more accurate gaussian blur implementation, together with a non-linear version of the blur difference, allows a better determination of the depth map. By telescoping between the two blurred functions using the gaussian blur implementation, a depth map is determined. In the ideal optical scenario, the two functions are pillbox blur functions.
The accuracy of determining the gaussian blur is able to be improved by using the following method based on differential equations.
In one dimension:
f: R→R
L: R×R+→R
L(x;0)=f(x)
g: R×R+\{0}→R
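where f is the one dimensional signal and L is defined by L(x;0)=f(x) and, for t>0, by convolution with g, the gaussian blur kernel. Then:
$$L(x;t) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi t}}\, e^{-\xi^2/(2t)}\, f(x-\xi)\, d\xi, \quad t>0$$
and L satisfies the diffusion equation:
$$\frac{\partial L}{\partial t} = \frac{1}{2}\,\frac{\partial^2 L}{\partial x^2}$$
Using Euler's method, this continuous equation can be discretized and rewritten in the following form, where $L_i^n$ denotes the value of L at sample i after n steps of size Δt:
$$L_i^{n+1} = L_i^n + \frac{\Delta t}{2}\left(L_{i+1}^n - 2L_i^n + L_{i-1}^n\right)$$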
For example, if Δt=½, it is easy to realize the implementation above using the following convolution kernel:
[¼ ½ ¼]
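The link between the Euler step and the kernel can be checked with a short sketch. It assumes the diffusion equation has the standard scale-space form ∂L/∂t = ½ ∂²L/∂x² with unit sample spacing, and it uses a wrap-around boundary; the function names `euler_kernel` and `euler_step` are illustrative and not taken from the description.

```python
from fractions import Fraction

def euler_kernel(dt):
    """Weights applied to (L[i-1], L[i], L[i+1]) by one explicit Euler step
    of dL/dt = 1/2 * d2L/dx2 on a grid with unit spacing (assumed form)."""
    dt = Fraction(dt)
    return [dt / 2, 1 - dt, dt / 2]

def euler_step(signal, dt):
    """One explicit Euler step on a 1D signal.
    The wrap-around (periodic) boundary is an assumption of this sketch."""
    n = len(signal)
    half_dt = Fraction(dt) / 2
    return [
        signal[i] + half_dt * (signal[(i + 1) % n] - 2 * signal[i] + signal[i - 1])
        for i in range(n)
    ]

kernel_half = euler_kernel(Fraction(1, 2))      # weights for dt = 1/2
kernel_quarter = euler_kernel(Fraction(1, 4))   # weights for dt = 1/4

# Applying one step to a unit impulse reproduces the kernel itself.
impulse = [Fraction(0)] * 5
impulse[2] = Fraction(1)
stepped = euler_step(impulse, Fraction(1, 2))
print(kernel_half, kernel_quarter, stepped)
```

Exact fractions are used so that the weights can be compared with the kernel entries without rounding.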
Although the example where Δt=½ is used above and below for clarity, it should be understood that values for Δt less than ½ such as ¼ are able to be used as well in one dimension. For instance, if Δt=¼, then the kernel obtained is [⅛ ¾ ⅛]. By reducing the Δt value from ½ to ¼, the number of iterations needed to have the same effect increases. The equations below utilize the assumption that Δt=½ for one dimension. The Central Limit Theorem shows that the repeated iteration of the kernel quickly yields a gaussian function:
[¼ ½ ¼]*[¼ ½ ¼]* . . . *[¼ ½ ¼]
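As a rough numerical illustration of this convergence, the sketch below convolves the kernel with itself and compares the result with a sampled gaussian. It relies on two facts about the kernel rather than on anything stated in the description: its variance is ½, and variances add under convolution, so n iterations give a variance of n/2. The helper names and the choice of 20 iterations are arbitrary.

```python
import math

def convolve_full(a, b):
    """Plain full 1D convolution of two lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def iterated_kernel(iterations):
    """Convolve [1/4 1/2 1/4] with itself the given number of times."""
    kernel = [0.25, 0.5, 0.25]
    result = [1.0]
    for _ in range(iterations):
        result = convolve_full(result, kernel)
    return result

def variance(weights):
    """Variance of a symmetric discrete kernel about its center sample."""
    center = (len(weights) - 1) / 2
    return sum(w * (k - center) ** 2 for k, w in enumerate(weights))

n_iter = 20
weights = iterated_kernel(n_iter)
measured_variance = variance(weights)   # expected: n_iter / 2
sigma_sq = n_iter / 2

# Largest pointwise gap between the iterated kernel and a sampled gaussian
# of the same variance.
center = (len(weights) - 1) // 2
max_deviation = max(
    abs(w - math.exp(-((k - center) ** 2) / (2 * sigma_sq)) / math.sqrt(2 * math.pi * sigma_sq))
    for k, w in enumerate(weights)
)
print(measured_variance, max_deviation)
```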
Equation (8) is the old, less accurate version of determining the blur.
Equation (9) is an improved version. Using the number of iterations of [¼ ½ ¼] that need to be applied to telescope between $\sigma_{start}^2$ and $\sigma_{finish}^2$, $\sigma_?^2$ is able to be calculated accurately.
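One way to picture the telescoping idea is the following sketch, which is an illustration under stated assumptions and not a reproduction of equation (9). A synthetic periodic step edge is blurred by two different numbers of kernel passes, and the less blurred signal is then blurred further until it best matches the more blurred one. The squared-error matching criterion, the wrap-around boundary, and the names `blur_once`, `blur` and `iterations_to_match` are all choices made for this example. Because each pass of [¼ ½ ¼] adds ½ to the gaussian variance, the iteration count n corresponds to a variance difference of n/2.

```python
def blur_once(signal):
    """One circular pass of the kernel [1/4 1/2 1/4]."""
    n = len(signal)
    return [
        0.25 * signal[i - 1] + 0.5 * signal[i] + 0.25 * signal[(i + 1) % n]
        for i in range(n)
    ]

def blur(signal, iterations):
    """Apply the kernel the given number of times."""
    for _ in range(iterations):
        signal = blur_once(signal)
    return signal

def iterations_to_match(sharper, blurrier, max_iterations=100):
    """Number of kernel passes on `sharper` that minimizes the squared
    error against `blurrier` (hypothetical matching criterion)."""
    best_n, best_err = 0, float("inf")
    current = list(sharper)
    for n in range(max_iterations + 1):
        err = sum((c - b) ** 2 for c, b in zip(current, blurrier))
        if err < best_err:
            best_n, best_err = n, err
        current = blur_once(current)
    return best_n

scene = [0.0] * 32 + [1.0] * 32   # synthetic periodic step edge
image_a = blur(scene, 6)          # variance 6/2 = 3
image_b = blur(scene, 16)         # variance 16/2 = 8
n_found = iterations_to_match(image_a, image_b)
variance_difference = n_found / 2
print(n_found, variance_difference)
```

On real images the match is never exact, so the minimum of the error curve would be a best estimate and not a zero.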
Using a numerical analysis approach similar to that shown for the 1D case, we obtain the following convolution kernel for the 2D case for Δt=¼. It should be understood that values for Δt less than ¼ such as ⅛ are able to be used as well in two dimensions. By reducing the Δt value from ¼ to ⅛, the number of iterations needed to have the same effect increases.
Fast implementation of the convolution kernel is possible using vector shifts, multiplication and addition.
In one dimension for example:
h=[¼ ½ ¼]
data=[a1 a2 a3 a4 a5 a6 a7 a8 a9 a10]
result=data*h, such that:
result=¼*[a2 a3 a4 a5 a6 a7 a8 a9 a10 a1]+½*[a1 a2 a3 a4 a5 a6 a7 a8 a9 a10]+¼*[a10 a1 a2 a3 a4 a5 a6 a7 a8 a9]
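The shift-based form above can be sketched with list rotations and checked against a direct circular convolution. The function names are illustrative; the wrap-around at both ends mirrors the rotated vectors shown in the example.

```python
def convolve_by_shifts(data):
    """Circular convolution with [1/4 1/2 1/4] using two rotated copies."""
    left = data[1:] + data[:1]      # [a2 a3 ... a10 a1]
    right = data[-1:] + data[:-1]   # [a10 a1 ... a9]
    return [0.25 * l + 0.5 * c + 0.25 * r for l, c, r in zip(left, data, right)]

def convolve_direct(data):
    """Reference: index-based circular convolution with the same kernel."""
    n = len(data)
    return [
        0.25 * data[(i - 1) % n] + 0.5 * data[i] + 0.25 * data[(i + 1) % n]
        for i in range(n)
    ]

data = [float(v) for v in range(1, 11)]   # stands in for a1 ... a10
result = convolve_by_shifts(data)
print(result)
```

On hardware with vector instructions, the two rotations, two scalings and two additions each operate on the whole vector at once, which is where the speed advantage comes from.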
Furthermore, the one dimensional version is easily generalized to two dimensions. The following relationship exists for the one dimensional and two dimensional kernels:
Thus, a unique relationship exists for the one dimensional and two dimensional cases.
One Dimension:
From
Assuming r1 is larger than r2, the following positive valued quantity is able to be computed:
Equation (12) is a non-linear version of the blur difference. Re-arranging equation (12) results in the following equation:
Then, using the quadratic formula, the depth d0 is able to be determined:
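$$a x^2 + b x + c = 0$$
$$a = -\left(-r_1^2 + r_2^2\right)\left(2 f_{num}\right)^2 + \left(D_2 - f_1\right)^2 - \left(D_1 - f_1\right)^2 \qquad (14)$$
$$b = 2 f_1 \left\{\left(D_2 - f_1\right) D_2 - \left(D_1 - f_1\right) D_1\right\}$$
$$c = \left(f_1 D_1\right)^2 - \left(f_1 D_2\right)^2$$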
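The general idea of recovering depth from a squared blur difference by solving a quadratic can be illustrated with the sketch below. It assumes an ideal thin lens with focal length `f`, f-number `f_num`, and lens-to-sensor distances `D1` and `D2`, with r1 measured at D1 and r2 at D2. The coefficients in the code are derived directly from that assumed model and are not copied from equation (14), so their grouping and signs may not match it term by term. All numeric values are made up for the example.

```python
import math

def blur_radius(d0, f, f_num, D):
    """Signed pillbox blur radius for an object at depth d0 under a
    thin-lens model (assumption), with the sensor at distance D."""
    return ((D - f) * d0 - f * D) / (2.0 * f_num * d0)

def depth_candidates(delta_r_sq, f, f_num, D1, D2):
    """Real roots of a*d0^2 + b*d0 + c = 0, where delta_r_sq = r1^2 - r2^2.
    Coefficients follow from squaring blur_radius at D1 and D2 and subtracting."""
    a = (D1 - f) ** 2 - (D2 - f) ** 2 - (2.0 * f_num) ** 2 * delta_r_sq
    b = 2.0 * f * ((D2 - f) * D2 - (D1 - f) * D1)
    c = (f * D1) ** 2 - (f * D2) ** 2
    if a == 0.0:
        return [-c / b]
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return []
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])

# Hypothetical setup: 50 mm lens at f/2, object 2 m away, sensor moved 0.2 mm.
f, f_num = 0.05, 2.0
D1, D2 = 0.0520, 0.0518
true_depth = 2.0

r1 = blur_radius(true_depth, f, f_num, D1)
r2 = blur_radius(true_depth, f, f_num, D2)
candidates = depth_candidates(r1 ** 2 - r2 ** 2, f, f_num, D1, D2)

# A depth smaller than the focal length is not physically meaningful here.
plausible = [d for d in candidates if d > f]
depth_estimate = max(plausible)
print(candidates, depth_estimate)
```

The quadratic generally has two roots, so a selection rule is needed; here the root that lies beyond the focal length is kept.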
From the second image 502:
The overbar denotes quantities in pixel based/pixel reference units.
Therefore, using equation (18) the blur difference Δ(r2) is able to be determined.
Hence, a first image with a first blur quantity and a second image with a second blur quantity are able to be used to determine a variance delta Δ
By determining the depth d0, a device is able to utilize the depth information for applications such as autofocusing.
The pillbox (r) and gaussian (σ) blur kernels have the following approximate relationship with respect to filtered energy: r ≈ 2σ.
Therefore, taking a first image with pillbox blur equal to the value r1 and convolving it with the approximate telescoping solution of [3×3 kernel] for 1 to n iterations yields a pillbox blur equal to the value r2.
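A back-of-the-envelope sketch of how the approximation r ≈ 2σ ties pillbox radii to an iteration count follows. It uses the one dimensional kernel [¼ ½ ¼], whose variance of ½ per pass is a property of that kernel; the variance added per pass by the 3×3 kernel is not assumed here, so `variance_per_pass` is left as a parameter. Radii are in pixel units, and the function names and sample values are illustrative only.

```python
def pillbox_to_gaussian_sigma(r):
    """Approximate gaussian sigma for a pillbox radius r, using r ~ 2*sigma."""
    return r / 2.0

def estimated_iterations(r_start, r_finish, variance_per_pass=0.5):
    """Estimated kernel passes needed to go from pillbox radius r_start to
    r_finish. variance_per_pass=0.5 corresponds to the 1D kernel [1/4 1/2 1/4]."""
    sigma_start_sq = pillbox_to_gaussian_sigma(r_start) ** 2
    sigma_finish_sq = pillbox_to_gaussian_sigma(r_finish) ** 2
    return (sigma_finish_sq - sigma_start_sq) / variance_per_pass

# Hypothetical radii: 2 pixels in the first image, 4 pixels in the second.
passes = estimated_iterations(2.0, 4.0)
print(passes)
```

Read in the other direction, a measured iteration count gives an estimate of the difference of squared radii: r_finish² − r_start² ≈ 4 × variance_per_pass × passes.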
Observations show that performance is better when the difference in blur between the first image and the second image is small. Furthermore, it is helpful for the blur quantities themselves to be small.
There are a number of devices that are able to utilize the method of receiving multiple blurred images to generate a depth map. Such a device obtains a signal of an image from a scene. The signal passes through a lens and is received by an image sensor. The lens is then moved a short distance and the signal passes through again and is received by the image sensor. With different distances between the lens and the sensor, the signals of the images arrive at the sensor with differing blur quantities. An improved method of simulating gaussian blur and an approximation equation that relates a known gaussian blur quantity to a known pillbox quantity are used in conjunction with a non-linear blur difference equation. The blur quantity difference is able to be calculated and then used to determine the depth map. With the depth information, tasks like image segmentation and object detection are better performed. Many applications are possible with the method and system described herein, including, but not limited to, autofocusing, surveillance, robot/computer vision, autonomous vehicle navigation, multi-dimensional imaging and data compression. For a user of the device which implements the method described herein, the functionality is similar to that of other related technologies. For example, a person taking a picture with a camera which implements the method of receiving multiple blurred images uses the camera as a generic autofocusing camera. The camera generates a depth map and then automatically adjusts the lens until it establishes the proper focus for the picture, so that the user is able to take a clear picture. However, as described above, the method and system described herein have significant advantages over other autofocusing devices.
In operation, the method and system for receiving multiple blurred images to determine a depth map improve a device's ability to perform a number of functions such as autofocusing. As described above, when a user is utilizing a device which implements the method and system described herein, the device functions as a typical device would from the user's perspective. The improvements of being able to compute a depth map using multiple blurred images received at a sensor by moving a lens a short distance enable more accurate and improved autofocusing. An improved method of simulating gaussian blur assists in determining the depth for autofocusing. Using non-linear blur difference instead of linear provides better accuracy as well. The approximate depth map is generated by implementing a pillbox blur based imaging system using an algorithm based on a gaussian blur approximation.
In other embodiments, the image sensor is moved in addition to or instead of the lens to acquire pictures with different blur quantities.
The present invention has been described in terms of specific embodiments incorporating details to facilitate the understanding of principles of construction and operation of the invention. Such reference herein to specific embodiments and details thereof is not intended to limit the scope of the claims appended hereto. It will be readily apparent to one skilled in the art that various other modifications may be made in the embodiment chosen for illustration without departing from the spirit and scope of the invention as defined by the claims.
Number | Date | Country |
---|---|---|
20070297784 A1 | Dec 2007 | US |