Certain embodiments of the invention relate to video communication and processing. More specifically, certain embodiments of the invention relate to a method and system for motion vector estimation using a pivotal pixel search.
In many video processing applications, in which moving objects may be displayed in a sequence of image frames, it may be useful to have knowledge of the motion which occurs from frame to frame. Examples of such applications include frame rate conversion, deinterlacing, noise reduction, and cross-chroma reduction. In a typical method for frame rate conversion, for example one that enables doubling of the frame rate of a video sequence, each image frame may be repeated twice. By taking knowledge of the frame-to-frame motion into account, one can perform adaptive processing that adapts to and compensates for the motion in the scene.
There have been many methods proposed for modeling the motion in a scene. One such method is a translational block-based model. In this model, the original frame is broken into small blocks, and the motion between frames is modeled in terms of translational shifts of these blocks. Each block is assigned a two-dimensional (horizontal/vertical) motion vector (MV) that describes the translational shift of that block.
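By way of a non-limiting illustration, the translational block-based model may be sketched as follows. The function name `predict_from_blocks`, the `(dy, dx)` vector convention and the clamping at the frame border are assumptions made for this sketch only and are not taken from the application.

```python
def predict_from_blocks(prev, motion, block):
    """Build a frame by copying each block from `prev`, displaced by its MV.

    `motion[i][j]` is the assumed (dy, dx) shift of the block in block-row i,
    block-column j, measured from the preceding frame to the predicted frame.
    """
    rows, cols = len(prev), len(prev[0])
    out = [[0] * cols for _ in range(rows)]
    for by in range(0, rows, block):
        for bx in range(0, cols, block):
            dy, dx = motion[by // block][bx // block]
            for y in range(by, min(by + block, rows)):
                for x in range(bx, min(bx + block, cols)):
                    # Clamp the source coordinate so reads stay inside the frame.
                    sy = min(max(y - dy, 0), rows - 1)
                    sx = min(max(x - dx, 0), cols - 1)
                    out[y][x] = prev[sy][sx]
    return out

prev = [[0, 0, 0, 0],
        [0, 9, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
# One MV per 2x2 block; every block is shifted one pixel to the right.
motion = [[(0, 1), (0, 1)], [(0, 1), (0, 1)]]
pred = predict_from_blocks(prev, motion, 2)
```

In this example the single bright pixel in the second line moves from the second to the third pixel position, which is the one-pixel horizontal shift encoded by each block's motion vector.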
Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of skill in the art, through comparison of such systems with some aspects of the present invention as set forth in the remainder of the present application with reference to the drawings.
A method and system for motion vector estimation using a pivotal pixel search is provided, substantially as shown in and/or described in connection with at least one of the figures, as set forth more completely in the claims.
These and other advantages, aspects and novel features of the present invention, as well as details of an illustrated embodiment thereof, will be more fully understood from the following description and drawings.
Certain embodiments of the invention relate to a method and system for motion vector estimation using a pivotal pixel search. Various embodiments of the invention comprise a method and system in which an interpolated image frame may be generated by selecting a neighborhood of picture element (pixel) block locations within the interpolated image frame. For each selected pixel block, a plurality of candidate motion vectors may be selected. Within each selected pixel block, each of the plurality of candidate motion vectors may be evaluated based on one or more selection criteria. One of the candidate motion vectors may be selected based on the one or more selection criteria. Based on the selected motion vector, corresponding pixel block locations may be determined within a preceding image frame and a current image frame, wherein the interpolated image frame may be temporally located between the preceding image frame and the current image frame. Based on the determined pixel block locations within the preceding and current image frames a corresponding pixel block may be generated within the interpolated image frame.
In various embodiments of the invention, motion vectors may be computed utilizing various methods and/or techniques. While one or more exemplary methods for motion vector computation may be described, implied and/or suggested below, for the purposes of this application various embodiments of the invention are not limited to any specific method for motion vector computation.
In an exemplary embodiment of the invention, the candidate motion vector 112a may represent a motion vector computed based on the preceding image processing block 104g and the current image processing block 104c within the current image frame 102b. A correlation error value, σ112a, may be computed based on a correlation computation between image processing blocks 104g and 104c. The candidate motion vector 112b may represent a motion vector computed based on the preceding image processing block 104h and the current image processing block 104b. A correlation error value, σ112b, may be computed based on a correlation computation between image processing blocks 104h and 104b. The candidate motion vector 112c may represent a motion vector computed based on the preceding image processing block 104a and the current image processing block 104d. A correlation error value, σ112c, may be computed based on a correlation computation between image processing blocks 104a and 104d. The candidate motion vector 112d may represent a motion vector computed based on the preceding image processing block 104i and the current image processing block 104e. A correlation error value, σ112d, may be computed based on a correlation computation between image processing blocks 104i and 104e. The candidate motion vector 112e may represent a motion vector computed based on the preceding image processing block 104j and the current image processing block 104f. A correlation error value, σ112e, may be computed based on a correlation computation between image processing blocks 104j and 104f.
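The application does not fix a particular correlation computation. As one possible sketch, a correlation error value between a preceding and a current image processing block may be computed as a sum of absolute differences (SAD). The choice of SAD and the helper names `block_at` and `correlation_error` are assumptions made for illustration.

```python
def block_at(frame, top, left, size):
    """Return the size x size block whose top-left pixel is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]

def correlation_error(block_a, block_b):
    """Sum of absolute differences; 0 means the two blocks are identical."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

preceding = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
# The current frame is the preceding frame shifted one pixel to the right.
current = [[0, 1, 2, 3], [0, 5, 6, 7], [0, 9, 10, 11], [0, 13, 14, 15]]

ref = block_at(preceding, 0, 0, 2)
err_shifted = correlation_error(ref, block_at(current, 0, 1, 2))  # along the true motion
err_static = correlation_error(ref, block_at(current, 0, 0, 2))   # assuming no motion
```

Under this metric, a candidate motion vector that follows the true motion yields a smaller error value than one that does not.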
An interpolated image processing block 106 may be selected within an interpolated image frame 102c. The interpolated image frame 102c may be temporally located between the preceding image frame 102a and the current image frame 102b. The candidate motion vectors 112a, 112b, 112c, 112d and 112e may comprise a portion of selected motion vectors that intersect in the vicinity of a pixel location 108 within the interpolated image processing block 106. The pixel location 108 may be referred to as a pivot pixel. The candidate motion vectors 112a, 112b, 112c, 112d and 112e may be evaluated based on one or more selection criteria for the selected interpolated image processing block 106. In an exemplary embodiment of the invention, the correlation error values σ112a, σ112b, σ112c, σ112d and σ112e may be compared. A minimum correlation error value may be determined. A motion vector may be selected from the plurality of candidate motion vectors based on the minimum correlation error value. In the exemplary diagram shown in
Based on the selected motion vector 112c, the pixel values within the interpolated image processing block 106 may be computed based on the pixel values within the preceding and current image processing blocks, which correspond to the selected motion vector 112c. In the exemplary diagram shown in
In some conventional image processing systems, a candidate motion vector may be selected and a location for the interpolated image processing block determined based on the selected motion vector. One potential shortcoming with this approach is that when a set of interpolated image processing blocks is generated for the interpolated image frame, some of the interpolated image processing blocks may overlap. For example, some of the pixel locations within the interpolated image frame that are located within a given P×Q pixel neighborhood of a given interpolated image processing block may also be located within at least one other P×Q pixel neighborhood of at least one other interpolated image processing block. Furthermore, some of the pixel locations within the interpolated image frame may not be located within any of the interpolated image processing blocks. These pixel locations may be referred to as “holes” in the interpolated image frame.
In some conventional image processing systems, the location of an image processing block within the interpolated image frame may be determined based on the location of the preceding image processing block within the preceding image frame, or based on the location of the current image processing block within the current image frame. One potential shortcoming with this approach is that a video sequence comprising the input video frames and interpolated image frames may present a pattern of misalignment in motion from one frame to the next.
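The overlap and hole conditions described above may be illustrated with a simplified one-dimensional sketch. Here each block of a single image line is placed halfway along its own motion vector, and the number of blocks covering each pixel location is counted. The function name `coverage_after_projection`, the one-dimensional simplification and the halfway placement are assumptions for illustration only.

```python
def coverage_after_projection(width, block, motions):
    """Count how many projected blocks cover each pixel of one image line.

    `motions[i]` is the assumed horizontal displacement of block i between
    the preceding and current frames; each block is placed at the halfway
    point of its vector, as for a temporally centered interpolated frame.
    """
    cover = [0] * width
    for i, dx in enumerate(motions):
        start = i * block + dx // 2
        for x in range(start, start + block):
            if 0 <= x < width:
                cover[x] += 1
    return cover

cover = coverage_after_projection(8, 2, [0, 0, 4, 0])
holes = [x for x, n in enumerate(cover) if n == 0]     # covered by no block
overlaps = [x for x, n in enumerate(cover) if n > 1]   # covered by several blocks
```

In this example the third block moves onto the location of the fourth, so two pixel locations are covered twice and two are left uncovered. Selecting the block location first and then choosing a motion vector for it, as in the pivot pixel approach, avoids this by construction, since every location in the interpolated frame belongs to exactly one selected block.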
The image interpolation system 202 may comprise a delay block 212, a pixel generation block 214 and an image frame generation block 216. The delay block 212 may receive input video 200 and output a time delayed version of the input video. In an exemplary embodiment of the invention, the delay block 212 may insert a one image frame time delay between the received input video 200 and the output. The delay block 212 may receive one or more current image frames and output a one image frame time delayed version of the input current image frames. The time delayed version of the input current image frames may be referred to as preceding image frames.
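The behavior of a one image frame delay may be sketched in software as a generator that pairs each incoming frame with the frame received immediately before it. The name `delayed_pairs` is hypothetical, and this sketch is not intended to describe the circuitry of the delay block 212.

```python
def delayed_pairs(frames):
    """Yield (preceding, current) pairs using a one-frame delay."""
    preceding = None
    for current in frames:
        if preceding is not None:
            yield preceding, current
        # The frame just received becomes the preceding frame for the next one.
        preceding = current

pairs = list(delayed_pairs(["f0", "f1", "f2"]))
```

The first frame produces no pair, since no preceding frame is yet available for it.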
The pixel generation block 214 may comprise suitable logic, circuitry and/or code that may enable reception of one or more current image frames, one or more preceding image frames and a plurality of candidate motion vectors 220 as input. Based on these inputs, the pixel generation block 214 may enable generation of interpolated image processing blocks. In various embodiments of the invention, the pixel generation block 214 may select a location of the interpolated image processing block within an interpolated image frame. In an exemplary embodiment of the invention, the pixel generation block 214 may select a candidate motion vector based on the selected location for the interpolated image processing block. In various embodiments of the invention, this method may be referred to as a pivot pixel motion vector search method.
The pixel generation block 214 may comprise suitable logic, circuitry and/or code that may enable selection of a preceding image processing block within the preceding image frame and a current image processing block within the current image frame based on the selected motion vector. The pixel generation block 214 may generate pixel values within the interpolated image processing block based on the corresponding pixel values within the selected preceding and current image frames.
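One possible way to generate the pixel values of an interpolated image processing block from a selected motion vector is sketched below. The sketch assumes that the interpolated frame lies at the temporal midpoint, that the blocks in the preceding and current frames lie half a vector before and after the interpolated block, and that the two blocks are combined by simple averaging. None of these choices is mandated by the application, and the function name `interpolate_block` is hypothetical.

```python
def interpolate_block(prev, cur, top, left, size, mv):
    """Average the preceding and current blocks that lie along `mv`.

    (top, left) is the top-left pixel of the interpolated block and `mv` is
    the assumed (dy, dx) displacement from the preceding to the current frame.
    The caller is assumed to keep both source blocks inside the frames.
    """
    dy, dx = mv
    py, px = top - dy // 2, left - dx // 2   # block position in preceding frame
    cy, cx = py + dy, px + dx                # block position in current frame
    return [[(prev[py + r][px + c] + cur[cy + r][cx + c]) // 2
             for c in range(size)]
            for r in range(size)]

prev = [[10, 20, 0, 0, 0, 0],
        [10, 20, 0, 0, 0, 0]]
cur = [[0, 0, 0, 0, 12, 22],
       [0, 0, 0, 0, 12, 22]]
# The object moves four pixels to the right between the two frames.
block = interpolate_block(prev, cur, 0, 2, 2, (0, 4))
```

The interpolated block lands midway between the two object positions and takes the average of the object's pixel values in the two frames.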
The image frame generation block 216 may comprise suitable logic, circuitry and/or code that may enable generation of interpolated image frames based on received interpolated image processing blocks. In an exemplary embodiment of the invention, the image frame generation block 216 may receive interpolated image processing blocks generated by the pixel generation block 214. The image frame generation block 216 may determine whether a sequence of received interpolated image processing blocks is contained within the same interpolated image frame. The image frame generation block 216 may determine the location of each received interpolated image processing block within an interpolated image frame. Upon assembling the group of interpolated image processing blocks associated with a given interpolated image frame, the image frame generation block 216 may output a completed interpolated image frame.
The input video 200 may comprise a sequence of image frames. Each image frame may be represented as an M×N pixel block, where M represents the number of lines in the image frame and N represents the number of pixels within each line. The M×N pixel block, which is utilized in level 0 of the motion vector computation hierarchy, may represent a full pixel image frame.
The sub-sample 2×2 block 302 may comprise suitable logic, circuitry and/or code that may utilize a subsampling ratio of 2×2. The sub-sample 2×2 block 302 may receive an M×N pixel block and generate a level 1 subsampled image frame comprising a (½M)×(½N) pixel block. The level 1 subsampled image frame, which may be utilized in level 1 of the motion vector computation hierarchy, may represent a half pixel image frame.
The sub-sample 2×2 block 304 may comprise suitable logic, circuitry and/or code that may utilize a subsampling ratio of 2×2, which when combined with the sub-sample 2×2 block 302 may create an effective subsampling ratio of 4×4. The sub-sample 2×2 block 304 may receive a (½M)×(½N) pixel block and generate a level 2 subsampled image frame comprising a (¼M)×(¼N) pixel block. The level 2 subsampled image frame, which may be utilized in level 2 of the motion vector computation hierarchy, may represent a quarter pixel image frame.
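The three-level hierarchy described above may be sketched by applying a 2×2 subsampling step twice. The application does not specify the subsampling filter; averaging each 2×2 group of pixels is an assumption made here, as is the function name `subsample_2x2`.

```python
def subsample_2x2(frame):
    """Halve both dimensions by averaging each 2x2 group of pixels."""
    rows, cols = len(frame) // 2, len(frame[0]) // 2
    return [[(frame[2 * y][2 * x] + frame[2 * y][2 * x + 1]
              + frame[2 * y + 1][2 * x] + frame[2 * y + 1][2 * x + 1]) // 4
             for x in range(cols)]
            for y in range(rows)]

# Level 0: a 4x4 "full pixel" frame (M = N = 4).
level0 = [[y * 4 + x for x in range(4)] for y in range(4)]
level1 = subsample_2x2(level0)   # (M/2) x (N/2), the "half pixel" frame
level2 = subsample_2x2(level1)   # (M/4) x (N/4), the "quarter pixel" frame
```

Applying the same 2×2 step twice yields the effective 4×4 subsampling ratio noted above.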
The quarter pixel block 306 may comprise suitable logic, circuitry and/or code that may enable computation of motion vectors based on a current quarter pixel image frame and a preceding quarter pixel image frame. In various embodiments of the invention, the motion vectors computed by the quarter pixel block 306 may utilize quarter pixel resolution. A pixel neighborhood, comprising a pixel block (where the pixel block is smaller than the image frame size), at a selected location within the preceding quarter pixel image frame may be selected as a level 2 preceding image processing block. A plurality of motion vectors may be computed by computing a correlation value between the level 2 preceding image processing block and each pixel block within a specified level 2 pixel motion vector search area within a current quarter pixel image frame. The pixel locations within the specified level 2 current pixel motion vector search area may correspond to the set of pixel locations within the preceding quarter pixel image frame from which the level 2 preceding image processing block is selected. The quarter pixel block 306 may utilize an interpolation filter to enable the computation of level 2 motion vectors at subpixel accuracy. The quarter pixel block 306 may enable the generation of interpolated pixel locations within each pixel block in the current and preceding quarter pixel image frames. This increases the number of pixel locations within each of the post-interpolation quarter pixel image frames and thereby enables the computation of level 2 motion vectors at subpixel accuracy. A maximum correlation value may indicate a location of a level 2 current image processing block within the current quarter pixel image frame, which corresponds to the level 2 preceding image processing block. 
In an exemplary embodiment of the invention, a level 2 motion vector may be computed based on the location of the level 2 preceding image processing block and the corresponding level 2 current image processing block.
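The search over a motion vector search area may be sketched as an exhaustive block matching loop. The text above selects the block location with the maximum correlation value; this sketch uses the equivalent formulation of minimizing a sum of absolute differences, which is an assumed metric. The function names and the `radius` parameter are likewise hypothetical.

```python
def sad(prev, cur, py, px, cy, cx, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(prev[py + r][px + c] - cur[cy + r][cx + c])
               for r in range(size) for c in range(size))

def search_motion_vector(prev, cur, top, left, size, radius):
    """Test every displacement within +/- radius and keep the best match."""
    rows, cols = len(cur), len(cur[0])
    best_mv, best_err = None, None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            cy, cx = top + dy, left + dx
            # Skip candidate blocks that fall partly outside the current frame.
            if cy < 0 or cx < 0 or cy + size > rows or cx + size > cols:
                continue
            err = sad(prev, cur, top, left, cy, cx, size)
            if best_err is None or err < best_err:
                best_mv, best_err = (dy, dx), err
    return best_mv, best_err

def frame_with_pattern(n, top, left):
    """Return an n x n frame of zeros with a 2x2 pattern at (top, left)."""
    frame = [[0] * n for _ in range(n)]
    for r, row in enumerate([[5, 6], [7, 8]]):
        for c, value in enumerate(row):
            frame[top + r][left + c] = value
    return frame

prev = frame_with_pattern(6, 1, 1)
cur = frame_with_pattern(6, 2, 3)   # pattern moved 1 line down, 2 pixels right
mv, err = search_motion_vector(prev, cur, 1, 1, 2, 2)
```

The motion vector is the displacement between the location of the preceding image processing block and the location of the best-matching current image processing block.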
The half pixel block 308 may comprise suitable logic, circuitry and/or code that may enable computation of level 1 motion vectors based on a current half pixel image frame, a preceding half pixel image frame and one or more computed level 2 motion vectors. In various embodiments of the invention, the level 1 motion vectors computed by the half pixel block 308 may utilize half pixel resolution. In an exemplary embodiment of the invention, a pixel neighborhood, comprising a pixel block at a selected location within the preceding half pixel image frame, may be selected as a level 1 preceding image processing block. The center location for the selected level 1 preceding image processing block may be determined based on a level 2 motion vector, which was computed as described above. In addition, a level 1 current pixel motion vector search area may be selected within the current half pixel image frame. The center location for the selected level 1 current pixel motion vector search area may be determined based on the computed level 2 motion vector.
A plurality of level 1 motion vectors may be computed by computing a correlation value between the level 1 preceding image processing block and each pixel block within the level 1 current pixel motion vector search area. The half pixel block 308 may utilize an interpolation filter to enable the computation of level 1 motion vectors at subpixel accuracy. The half pixel block 308 may enable the generation of interpolated pixel locations in the level 1 preceding image processing block and in the level 1 current pixel motion vector search area. This increases the number of pixel locations within the post-interpolation level 1 preceding image processing block and the post-interpolation level 1 current pixel motion vector search area and thereby enables the computation of level 1 motion vectors at subpixel accuracy.
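The application does not identify the interpolation filter. As an assumed example, a bilinear filter may insert one interpolated pixel location between each pair of neighboring pixels, so that a search over the interpolated grid can resolve displacements of half a pixel. The function name `upsample_2x` is hypothetical.

```python
def upsample_2x(block):
    """Insert bilinear samples between neighboring pixels.

    An R x C block becomes a (2R-1) x (2C-1) block in which even positions
    hold the original pixels and odd positions hold interpolated values.
    """
    rows, cols = len(block), len(block[0])
    out = [[0.0] * (2 * cols - 1) for _ in range(2 * rows - 1)]
    for y in range(2 * rows - 1):
        for x in range(2 * cols - 1):
            y0, x0 = y // 2, x // 2
            # At odd positions the second neighbor is the next original pixel.
            y1 = min(y0 + y % 2, rows - 1)
            x1 = min(x0 + x % 2, cols - 1)
            out[y][x] = (block[y0][x0] + block[y0][x1]
                         + block[y1][x0] + block[y1][x1]) / 4.0
    return out

fine = upsample_2x([[0, 4], [8, 12]])
```

A displacement of one position on the interpolated grid corresponds to half a pixel on the original grid, which is how the increased number of pixel locations translates into subpixel accuracy.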
The full pixel block 310 may comprise suitable logic, circuitry and/or code that may enable computation of level 0 motion vectors based on a current full pixel image frame, a preceding full pixel image frame and one or more computed level 1 motion vectors. In various embodiments of the invention, the level 0 motion vectors computed by the full pixel block 310 may utilize full pixel resolution. In an exemplary embodiment of the invention, a pixel neighborhood, comprising a pixel block at a selected location within the preceding full pixel image frame, may be selected as a level 0 preceding image processing block. The center location for the selected level 0 preceding image processing block may be determined based on a level 1 motion vector, which was computed as described above. In addition, a level 0 current pixel motion vector search area may be selected within the current full pixel image frame. The center location for the selected level 0 current pixel motion vector search area may be determined based on the computed level 1 motion vector.
A plurality of level 0 motion vectors may be computed by computing a correlation value between the level 0 preceding image processing block and each pixel block within the level 0 current pixel motion vector search area. The full pixel block 310 may utilize an interpolation filter to enable the computation of level 0 motion vectors at subpixel accuracy. The full pixel block 310 may enable the generation of interpolated pixel locations in the level 0 preceding image processing block and in the level 0 current pixel motion vector search area. This increases the number of pixel locations within the post-interpolation level 0 preceding image processing block and the post-interpolation level 0 current pixel motion vector search area and thereby enables the computation of level 0 motion vectors at subpixel accuracy.
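The use of a coarser-level motion vector to position a finer-level search may be sketched as follows. The sketch assumes a 2×2 subsampling ratio between adjacent levels, so that a coarse vector is doubled before it is used as the center of a small search at the next level. The doubling, the sum of absolute differences metric and the names `sad` and `refine` are assumptions for illustration.

```python
def sad(prev, cur, py, px, cy, cx, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(prev[py + r][px + c] - cur[cy + r][cx + c])
               for r in range(size) for c in range(size))

def refine(prev, cur, top, left, size, coarse_mv, radius=1):
    """Search +/- radius around the coarse vector scaled to this level."""
    center_dy, center_dx = 2 * coarse_mv[0], 2 * coarse_mv[1]
    best = None
    for dy in range(center_dy - radius, center_dy + radius + 1):
        for dx in range(center_dx - radius, center_dx + radius + 1):
            cy, cx = top + dy, left + dx
            # Only evaluate candidate blocks that lie fully inside the frame.
            if 0 <= cy <= len(cur) - size and 0 <= cx <= len(cur[0]) - size:
                err = sad(prev, cur, top, left, cy, cx, size)
                if best is None or err < best[1]:
                    best = ((dy, dx), err)
    return best

def frame_with_pattern(n, top, left):
    """Return an n x n frame of zeros with a 2x2 pattern at (top, left)."""
    frame = [[0] * n for _ in range(n)]
    for r, row in enumerate([[5, 6], [7, 8]]):
        for c, value in enumerate(row):
            frame[top + r][left + c] = value
    return frame

prev = frame_with_pattern(8, 1, 1)
cur = frame_with_pattern(8, 1, 4)   # true motion: 3 pixels to the right
# A coarse vector of (0, 1) at the subsampled level scales to (0, 2) here.
mv, err = refine(prev, cur, 1, 1, 2, (0, 1))
```

Only nine displacements are evaluated at the finer level, rather than every displacement in a large search area, because the coarse vector already places the search close to the true motion.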
The full pixel block 310 may output a set of computed level 0 motion vectors 220. In various embodiments of the invention, the set of computed level 0 motion vectors 220 may be utilized to enable generation of an interpolated image frame, which may be temporally located between the preceding image frame and the current image frame. The computed level 0 motion vectors 220 may enable computation of the interpolated image frame based on the full pixel resolution level.
Various embodiments of the invention may be practiced with differing numbers of levels in the motion vector computing hierarchy. For example, an exemplary embodiment of the invention may utilize more or fewer than three (3) levels in the motion vector computing hierarchy. Various embodiments of the invention may be practiced with differing subsampling ratios and/or interpolation ratios. The subsampling ratios may be determined independently from the interpolation ratios and vice versa. Subsampling ratios may be selected independently for each level in the motion vector computing hierarchy. Interpolation ratios may be selected independently for each level in the motion vector computing hierarchy. Various embodiments of the invention may be practiced with preceding and current image frames of varying sizes, with motion vector search areas of varying pixel neighborhood sizes and/or with preceding and current image processing blocks of varying pixel neighborhood sizes. For example, an exemplary embodiment of the invention may utilize 3×3, 5×5 or 9×9 pixel neighborhood sizes for preceding and current image processing blocks. Various embodiments of the invention may be practiced with the roles of the preceding and current images reversed such that motion vectors may be found in both the forward and backward temporal directions.
Aspects of a method and system for motion vector estimation using a pivotal pixel search may comprise an image interpolation system 202 that enables selection of an interpolated picture element neighborhood 106 within an interpolated image frame 102c. The image interpolation system 202 may enable selection of one of a plurality of computed candidate motion vectors 112a, 112b, 112c, 112d and 112e based on the location of the interpolated picture element neighborhood 106 within the interpolated image frame 102c. The location of the interpolated picture element neighborhood 106 may be coincident with the center of the interpolated picture element neighborhood 106. The image interpolation system 202 may enable generation of picture element values within the selected interpolated picture element neighborhood 106 based on at least the selected one of the plurality of computed candidate motion vectors 112c.
The interpolated image frame 102c may be temporally located between a preceding image frame 102a and a current image frame 102b. A full pixel block 310 may enable generation of the plurality of computed candidate motion vectors 112a, 112b, 112c, 112d and 112e based on the preceding image frame 102a and the current image frame 102b. The image interpolation system 202 may enable selection of one or both of: a preceding picture element neighborhood 104a within the preceding image frame 102a and a current picture element neighborhood 104d within the current image frame 102b. The full pixel block 310 may enable computation of the selected one 112c of the plurality of computed motion vectors 112a, 112b, 112c, 112d and 112e based on the selected preceding picture element neighborhood 104a and the selected current picture element neighborhood 104d. The image interpolation system 202 may enable generation of the picture element values within the selected interpolated picture element neighborhood 106 based on the selected preceding picture element neighborhood 104a and/or the selected current picture element neighborhood 104d.
The plurality of computed candidate motion vectors 112a, 112b, 112c, 112d and 112e may intersect in the vicinity of a selected pixel location 108 within the interpolated image frame 102c. The selected pixel location 108 may correspond to a pivot pixel location. The image interpolation system 202 may enable computation of a correlation error value for each of the plurality of computed candidate motion vectors. The image interpolation system 202 may enable determination of a minimum value among the computed plurality of correlation error values. The selected one 112c of the plurality of computed candidate motion vectors may correspond to the determined minimum correlation error value.
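The selection of a candidate motion vector at a pivot pixel may be sketched as follows. For each candidate, the sketch evaluates the preceding and current blocks that lie half a vector before and after the pivot pixel, computes a sum of absolute differences as the correlation error value, and keeps the candidate with the minimum error. Centering the vectors exactly on the pivot, the midpoint timing, the error metric and the name `pivot_search` are assumptions for illustration and are not taken from the application.

```python
def pivot_search(prev, cur, pivot, size, candidates):
    """Pick the candidate (dy, dx) with the least error through `pivot`."""
    pivot_y, pivot_x = pivot
    half = size // 2
    rows, cols = len(prev), len(prev[0])
    best_mv, best_err = None, None
    for dy, dx in candidates:
        # Blocks half a vector before and after the pivot pixel.
        pt, pl = pivot_y - dy // 2 - half, pivot_x - dx // 2 - half
        ct, cl = pt + dy, pl + dx
        if min(pt, pl, ct, cl) < 0 or max(pt, ct) + size > rows \
                or max(pl, cl) + size > cols:
            continue  # candidate leaves the frame; skip it
        err = sum(abs(prev[pt + r][pl + c] - cur[ct + r][cl + c])
                  for r in range(size) for c in range(size))
        if best_err is None or err < best_err:
            best_mv, best_err = (dy, dx), err
    return best_mv, best_err

# Textured test frames: the current frame is the preceding frame panned
# four pixels to the right.
prev = [[x * x + 10 * y for x in range(10)] for y in range(6)]
cur = [[(x - 4) ** 2 + 10 * y for x in range(10)] for y in range(6)]

best_mv, best_err = pivot_search(prev, cur, (3, 5), 2,
                                 [(0, 0), (0, 4), (2, 2)])
```

Because the search starts from a fixed location in the interpolated frame, each interpolated block receives exactly one selected motion vector.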
In various embodiments of the invention, the plurality of candidate motion vectors 112a, 112b, 112c, 112d and 112e may be generated by the half pixel block 308, for example. The selected motion vector 112c may be utilized by the full pixel block 310 to generate one or more subsequent motion vectors. At least one of the subsequent motion vectors may be utilized by the image interpolation system 202 to enable generation of the picture element values within the selected interpolated picture element neighborhood 106.
Another embodiment of the invention may provide a machine-readable storage having stored thereon, a computer program having at least one code section executable by a machine, thereby causing the machine to perform steps as described herein for motion vector estimation using a pivotal pixel search.
Accordingly, the present invention may be realized in hardware, software, or a combination of hardware and software. The present invention may be realized in a centralized fashion in at least one computer system, or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system or other apparatus adapted for carrying out the methods described herein is suited. A typical combination of hardware and software may be a general-purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein.
The present invention may also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which when loaded in a computer system is able to carry out these methods. Computer program in the present context means any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code or notation; b) reproduction in a different material form.
While the present invention has been described with reference to certain embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted without departing from the scope of the present invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the present invention without departing from its scope. Therefore, it is intended that the present invention not be limited to the particular embodiment disclosed, but that the present invention will include all embodiments falling within the scope of the appended claims.