A challenge exists to deliver quality and value to consumers, for example, by providing mobile devices, such as cell phones and personal digital assistants, that are cost effective. Additionally, businesses may desire to provide new features to such mobile devices. Further, businesses may desire to enhance the performance of one or more components of such mobile devices.
The following detailed description references the drawings, wherein:
i shows an example of kernels for computing second-order partial derivatives.
Image registration is a technology for transforming different sets of data into one coordinate system. The data may be multiple photographs, multiple video frames, data from different sensors, from different times, or from different viewpoints. Image registration allows this data to be compared or integrated and enables many applications such as video stabilization, tracking, multi-image fusion for high dynamic range, and still image stabilization.
One type of image registration is feature based. In this type of image registration, one of the images is referred to as the reference and the second image is referred to as the target. Feature based image registration determines correspondence between image features such as points, lines, and contours. Once the correspondence between a number of points in the reference and target images is known, a transformation is then determined to map the target image to the reference image.
Many feature based image registration methods are too computationally expensive for real time performance in many mobile device applications. For example, some feature based image registration methods extract features at multiple scales and generate feature descriptors that are invariant to orientation, scale, and intensity. This invariance is needed for matching images obtained from different viewpoints in three-dimensional (3D) computer vision tasks. This invariance comes at a steep computational cost, making these methods impractical for applications that demand real time performance in mobile devices.
A need therefore exists for a feature based image registration method that is accurate and fast enough to enable real time performance in mobile devices. For many applications involving video and bursts of still images, the changes of viewpoint from frame to frame occur relatively slowly. This fact is utilized by the present invention to create such a feature based image registration method.
A block diagram of an example of a feature based image registration method 10 is shown in
Method 10 also includes a feature matching component 20 that operates on both first image 12 and second image 14. As will be discussed in more detail below, feature matching component 20 selects pairs of key points, one from each of first image 12 and second image 14, based on a measure of closeness of their feature descriptors. Method 10 further includes a geometric transform estimation module 22. As will be discussed in more detail below, geometric transform estimation module 22 utilizes a list of matching pairs of key points selected by feature matching component 20 and the positions of such key points to map reference image 12 into target image 14.
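As an illustration only, the selection of key point pairs by closeness of feature descriptors can be sketched as a nearest-neighbor search. The descriptor format (a list of numbers), the squared Euclidean distance, the function name `match_features`, and the optional `max_distance` cutoff are all assumptions made for this sketch and are not specified by the description above.

```python
def match_features(desc_ref, desc_tgt, max_distance=None):
    """Pair each reference descriptor with its closest target descriptor.

    desc_ref, desc_tgt: lists of equal-length numeric descriptors (assumed format).
    Returns a list of (reference_index, target_index) pairs.
    """
    matches = []
    for i, d_ref in enumerate(desc_ref):
        best_j, best_dist = None, None
        for j, d_tgt in enumerate(desc_tgt):
            # Squared Euclidean distance is an assumed closeness measure.
            dist = sum((a - b) ** 2 for a, b in zip(d_ref, d_tgt))
            if best_dist is None or dist < best_dist:
                best_j, best_dist = j, dist
        # Optionally reject pairs whose best distance is still too large.
        if best_j is not None and (max_distance is None or best_dist <= max_distance):
            matches.append((i, best_j))
    return matches


if __name__ == "__main__":
    reference = [[0, 0], [10, 10]]
    target = [[9, 11], [1, 0], [50, 50]]
    print(match_features(reference, target))
```

Because the viewpoint is assumed to change slowly from frame to frame, a practical variant might restrict the inner loop to target key points near the reference key point's position; that refinement is omitted here.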
A block diagram of an example of a feature extraction method 24 (for feature extraction component 16) is shown in
Feature extraction method 24 includes the element or component 28 of generating a blurred input image which involves convolving input image 26 with a two-dimensional box filter to create a box filtered image. The dimensions and size of this box filter can vary. For example, an N×N box filter, where N=8, may be used for video image applications. As another example, an N×N box filter, where N can vary between 8 and 32, may be used for still image applications.
An arbitrary size box filter of N×N can be computed efficiently with only four (4) operations (two (2) adds and two (2) subtracts) per pixel in input image 26. This is done by maintaining a one-dimensional (1D) array 30 which stores the sum of N consecutive image rows, for example the first eight (8) rows, where N=8, as generally illustrated in
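The running-sum scheme described above can be sketched as follows. This is a minimal illustration rather than the implementation of method 24: the function name `box_filter_sums`, the list-of-lists image format, and the choice to return unnormalized window sums only for fully covered window positions are assumptions.

```python
def box_filter_sums(image, n):
    """Return the N x N window sums for every fully covered window position."""
    height, width = len(image), len(image[0])
    if n > height or n > width:
        raise ValueError("box size exceeds image size")
    # 1D array holding, per column, the sum of the current N consecutive rows.
    col_sums = [sum(image[r][x] for r in range(n)) for x in range(width)]
    out = []
    for top in range(height - n + 1):
        if top > 0:
            # Slide the band of rows down by one: one add and one subtract per column.
            for x in range(width):
                col_sums[x] += image[top + n - 1][x] - image[top - 1][x]
        # Slide the window right along the band: one add and one subtract per pixel.
        window = sum(col_sums[:n])
        row_out = [window]
        for left in range(1, width - n + 1):
            window += col_sums[left + n - 1] - col_sums[left - 1]
            row_out.append(window)
        out.append(row_out)
    return out


if __name__ == "__main__":
    img = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    print(box_filter_sums(img, 2))
```

Apart from initializing the column sums and the first window of each row, the per-pixel cost is two additions and two subtractions regardless of N. Dividing each sum by N×N would give the box filtered average.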
Referring again to
and the determinant of H is: $\det H = f_{xx} f_{yy} - f_{xy}^{2}$. This means that the determinant of the Hessian matrix (detH) is only computed for 1/16th of the blurred input image which increases the speed of method 24. Examples of the kernels $f_{xx}$, $f_{yy}$, and $f_{xy}$ used in computing the second-order partial derivatives are shown in
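A minimal sketch of evaluating the determinant of the Hessian on a coarse grid is given below. The kernels of the referenced figure are not reproduced in this text, so the sketch substitutes standard central-difference kernels for the second-order partial derivatives; that substitution, the grid spacing of every fourth row and column (which yields 1/16th of the pixels), and the name `det_hessian_coarse` are assumptions.

```python
def det_hessian_coarse(blurred, step=4):
    """Compute det H = fxx * fyy - fxy * fxy at coarse grid points.

    blurred: 2D list of numbers (the box filtered image).
    step: assumed grid spacing; step=4 visits 1/16th of the pixels.
    Returns a dict mapping (row, col) to the determinant value.
    """
    height, width = len(blurred), len(blurred[0])
    result = {}
    for y in range(1, height - 1, step):
        for x in range(1, width - 1, step):
            # Assumed central-difference kernels, not the kernels of the figure.
            fxx = blurred[y][x - 1] - 2 * blurred[y][x] + blurred[y][x + 1]
            fyy = blurred[y - 1][x] - 2 * blurred[y][x] + blurred[y + 1][x]
            fxy = (blurred[y + 1][x + 1] - blurred[y + 1][x - 1]
                   - blurred[y - 1][x + 1] + blurred[y - 1][x - 1]) / 4.0
            result[(y, x)] = fxx * fyy - fxy * fxy
    return result


if __name__ == "__main__":
    bowl = [[x * x + y * y for x in range(10)] for y in range(10)]
    print(det_hessian_coarse(bowl))
```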
Referring again to
The pre-determined image dependent threshold can be calculated as follows. The Laplacian of the first input image 26 is computed in coarse grid 38. The Laplacian is computed with the kernel:
This computation is performed only for every 1/16th row and every 1/16th column. The initial threshold is given by: ThI=2 sdev(lapi), where sdev is the standard deviation of lapi. Using ThI on detH results in an initial number of feature points. If this is larger than the target, ThI is reduced until the target is reached. This is efficiently done using a histogram of the values of detH. If numI represents the initial number of feature points and numT represents the targeted number, then for the next input image 26 the Laplacian is not computed and the initial threshold is computed as: ThI(k+1)=(0.9 numI/numT)ThI(k), where ThI(k+1) is the initial threshold for the next input image 26 and ThI(k) is the initial threshold for the previous input image 26.
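The two threshold formulas above can be written out directly. In this sketch the function names are hypothetical, and the use of the population standard deviation for sdev is an assumption, since the text does not say whether the population or sample form is intended. The histogram-based adjustment is not shown.

```python
import statistics


def initial_threshold(laplacian_values):
    """ThI = 2 * sdev(lap) for the first input image.

    laplacian_values: Laplacian responses sampled on the coarse grid.
    Population standard deviation is an assumption of this sketch.
    """
    return 2.0 * statistics.pstdev(laplacian_values)


def next_threshold(prev_threshold, num_initial, num_target):
    """ThI(k+1) = (0.9 * numI / numT) * ThI(k) for the next input image."""
    return (0.9 * num_initial / num_target) * prev_threshold


if __name__ == "__main__":
    th0 = initial_threshold([1.0, 1.0, 3.0, 3.0])
    th1 = next_threshold(th0, num_initial=200, num_target=100)
    print(th0, th1)
```

Reusing the previous threshold avoids recomputing the Laplacian for every image, which is consistent with the assumption that consecutive frames change slowly.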
Method 24 further includes the element or component 44 of determining the high resolution feature points in the blurred input image. This is accomplished by applying a fine grid 46 shown in
Referring again to
Referring again
Referring again
Once one or more matching pairs of feature points are determined by feature matching module 20, feature based image registration method 10 proceeds to geometric transform estimation module 22. Module 22 utilizes the matching pairs of feature points and their positions to estimate a global affine transformation that maps first or reference image 12 into second or target image 14. Robustness against outliers is obtained by using either random sample consensus (RANSAC) or M-Estimation. Other approaches (e.g., a robust mean or utilization of the median of the motion vectors defined by matching pairs of feature points) can be used if correction of only translation is required, rather than translation, rotation, scaling and shear. These approaches also tend to be computationally less expensive and faster.
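Two of the estimation approaches named above can be sketched as follows: a median of the motion vectors for translation-only correction, and a basic RANSAC loop for a global affine transformation. This is an illustrative sketch under stated assumptions, not the implementation of module 22. The function names, the pair format `((x_ref, y_ref), (x_tgt, y_tgt))`, the iteration count, the inlier tolerance, and the fixed random seed are all hypothetical, and the usual final least-squares refit on the inliers is omitted for brevity.

```python
import random
import statistics


def median_translation(pairs):
    """Translation-only estimate: median of the motion vectors."""
    dx = statistics.median([t[0] - r[0] for r, t in pairs])
    dy = statistics.median([t[1] - r[1] for r, t in pairs])
    return dx, dy


def _det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def affine_from_three(pairs):
    """Exact affine model [[a, b, tx], [c, d, ty]] from three point pairs."""
    a = [[r[0], r[1], 1.0] for r, _ in pairs]
    d = _det3(a)
    if abs(d) < 1e-9:
        return None  # the three reference points are collinear
    rows = []
    for k in (0, 1):  # solve once for x', once for y' (Cramer's rule)
        rhs = [t[k] for _, t in pairs]
        coeffs = []
        for col in range(3):
            m = [row[:] for row in a]
            for i in range(3):
                m[i][col] = rhs[i]
            coeffs.append(_det3(m) / d)
        rows.append(coeffs)
    return rows


def apply_affine(model, point):
    x, y = point
    return (model[0][0] * x + model[0][1] * y + model[0][2],
            model[1][0] * x + model[1][1] * y + model[1][2])


def ransac_affine(pairs, iterations=200, tolerance=2.0, seed=0):
    """Keep the three-pair model that agrees with the most matching pairs."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = affine_from_three(rng.sample(pairs, 3))
        if model is None:
            continue
        inliers = []
        for r, t in pairs:
            px, py = apply_affine(model, r)
            if (px - t[0]) ** 2 + (py - t[1]) ** 2 <= tolerance ** 2:
                inliers.append((r, t))
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

The median requires only a sort of the motion vectors, whereas the RANSAC loop fits and scores a model at every iteration, which reflects the lower cost of the translation-only approaches noted above.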
Although several examples have been described and illustrated in detail, it is to be clearly understood that the same are intended by way of illustration and example only. These examples are not intended to be exhaustive or to limit the invention to the precise form or to the exemplary embodiments disclosed. Modifications and variations may well be apparent to those of ordinary skill in the art. The spirit and scope of the present invention are to be limited only by the terms of the following claims.
Additionally, reference to an element in the singular is not intended to mean one and only one, unless explicitly so stated, but rather means one or more. Moreover, no element or component is intended to be dedicated to the public regardless of whether the element or component is explicitly recited in the following claims.
Number | Date | Country |
---|---|---|
20130028519 A1 | Jan 2013 | US |