The present invention relates to the art of detecting motion vectors on the basis of a series of frames in a video signal.
Display devices of the hold type, typified by liquid crystal display (LCD) devices, have the particular problem that moving objects in a moving picture appear blurred to the viewer because the same displayed image is held for a fixed interval (one frame interval, for example) during which it is continuously displayed. The specific cause of the apparent blur is that while the viewer's gaze moves to track the moving object, the object does not move during the intervals in which it is held, creating a difference between the actual position of the object and the viewer's gaze. A known means of alleviating this type of motion blur is frame interpolation, which increases the number of frames displayed per unit time by inserting interpolated frames into the frame sequence. Another technique is to generate high-resolution frames from a plurality of low-resolution frames and then generate the interpolated frames from the high-resolution frames to provide a higher-definition picture.
In these frame interpolation techniques it is necessary to estimate the pixel correspondence between the frames, that is, to estimate the motion of objects between frames. The block matching method, in which each frame is divided into a plurality of blocks and the motion of each block is estimated, is widely used as a method of estimating the motion of objects between frames. The block matching method generally divides one of two temporally consecutive frames into blocks, takes each of these blocks in turn as the block of interest, and searches for a reference block in the other frame that is most highly correlated with the block of interest. The difference in position between the most highly correlated reference block and the block of interest is detected as a motion vector. The most highly correlated reference block can be found by, for example, calculating the absolute values of the brightness differences between pixels in the block of interest and a reference block, taking the sum of the calculated absolute values, and finding the reference block with the smallest such sum.
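The block matching search described above can be sketched as follows. This is an illustrative Python sketch under simplifying assumptions not stated in the source (grayscale frames held as NumPy arrays, an exhaustive search over a small range); the function and parameter names are chosen for illustration only.

```python
import numpy as np

def block_matching(frame_b, frame_a, block_size=8, search=7):
    """Estimate one motion vector per block of frame_b (the frame of
    interest) against frame_a (the reference frame) by exhaustive
    search, minimizing the sum of absolute differences (SAD)."""
    h, w = frame_b.shape
    vectors = {}
    for by in range(0, h - block_size + 1, block_size):
        for bx in range(0, w - block_size + 1, block_size):
            block = frame_b[by:by + block_size, bx:bx + block_size].astype(np.int32)
            best_sad, best_mv = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    ry, rx = by + dy, bx + dx
                    if ry < 0 or rx < 0 or ry + block_size > h or rx + block_size > w:
                        continue  # reference block must lie inside the frame
                    ref = frame_a[ry:ry + block_size, rx:rx + block_size].astype(np.int32)
                    sad = int(np.abs(block - ref).sum())
                    if best_sad is None or sad < best_sad:
                        best_sad, best_mv = sad, (dx, dy)
            vectors[(bx, by)] = best_mv
    return vectors
```

Restricting the search range, as noted later in the text, keeps the triple loop tractable; searching the whole reference frame would make the inner loop span every position.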
A problem with the conventional block matching method is that since each block has a size of, say, 8×8 pixels or 16×16 pixels, image defects occur at the block boundaries in the interpolated frames generated using the motion vectors found by the block matching method, and the picture quality is reduced. This problem could be solved if it were possible to detect motion vectors accurately on a pixel basis (with a precision of one pixel). The problem is that it is difficult to improve the accuracy of motion vector estimation on a pixel basis. The motion vector detected for each block can be used as the motion vector of each pixel in the block, for example, but then all pixels in the block show the same motion, so the motion vectors of the individual pixels have not been detected accurately. It is also known that reducing the size of the blocks used to detect motion vectors on a pixel basis does not improve the accuracy of motion vector estimation. A further problem is that reducing the block size greatly increases the amount of computation.
Techniques for generating motion vectors on a pixel basis from block motion vectors are disclosed in Japanese Patent No. 4419062 (Patent Reference 1), Japanese Patent No. 4374048 (Patent Reference 2), and Japanese Patent Application Publication No. H11-177940 (Patent Reference 3). The methods disclosed in Patent References 1 and 3 take, as candidates, the motion vector of the block including the pixel of interest (the block of interest) in one of two temporally distinct frames and the motion vectors of blocks adjacent to the block of interest, and find the difference in pixel value between the pixel of interest and the pixels at positions in the other frame shifted from the position of the pixel of interest according to the candidate motion vectors. From among the candidate motion vectors, the motion vector with the smallest difference is selected as the motion vector of the pixel of interest (as its pixel motion vector). The method disclosed in Patent Reference 2 seeks further improvement in detection accuracy by, when pixel motion vectors have already been determined, adding the most often used pixel motion vector as an additional candidate motion vector.
As described above, the methods in Patent References 1 to 3 select the motion vector of the pixel of interest from among candidate block motion vectors. However, if the image contains periodic spatial patterns (repetitive patterns such as stripe patterns with high spatial frequencies) or noise, the selection of accurate motion vectors is impaired and the estimation accuracy is lowered.
In view of the above, an object of the present invention is to provide a motion vector detection device, motion vector detection method, frame interpolation device, and frame interpolation method that can restrict the lowering of pixel motion vector estimation accuracy due to the effects of periodic spatial patterns and noise appearing in the image.
A motion vector detection device according to a first aspect of the invention detects motion in a series of frames constituting a moving image. The motion vector detection device includes: a motion estimator for dividing a frame of interest in the series of frames into a plurality of blocks, and for, taking a frame temporally differing from the frame of interest in the series of frames as a reference frame, estimating motion of each of the blocks between the frame of interest and the reference frame, thereby detecting block motion vectors; and a motion vector densifier for, based on the plurality of blocks, generating a plurality of sub-blocks on a plurality of layers including a first layer to an N-th layer (N being an integer equal to or greater than 2) and generating a motion vector for each one of the sub-blocks, based on the block motion vectors. The motion vector densifier includes: a first motion vector generator for taking each block in the plurality of blocks as a parent block, generating a plurality of sub-blocks on the first layer from the parent block, and generating a motion vector for each of the plurality of sub-blocks on the first layer, based on the block motion vectors; a second motion vector generator for generating, in the plurality of layers from the first to the N-th layer, a plurality of sub-blocks on each layer from the second to the N-th layer based on parent sub-blocks, the parent sub-blocks being the sub-blocks on a higher layer which is at one level higher than each layer, and for generating a motion vector for each of the plurality of sub-blocks on each of the layers from the second to the N-th layer, based on the motion vectors of the sub-blocks on the higher layer; and a motion vector corrector for, on at least one layer to be corrected among the first to the N-th layers, taking each of the plurality of sub-blocks on the layer to be corrected as a sub-block to be corrected, and correcting the motion vector of the sub-block to be 
corrected so as to minimize a sum of distances between the motion vector of the sub-block to be corrected and motion vectors belonging to a set including the motion vector of the sub-block to be corrected and motion vectors of neighboring sub-blocks located in an area surrounding the sub-block to be corrected. The second motion vector generator uses the motion vectors as corrected by the motion vector corrector to generate the motion vector of each of the plurality of sub-blocks in the layer following the layer to be corrected.
A frame interpolation device according to a second aspect of the invention includes the motion vector detection device according to the first aspect and an interpolator for generating an interpolated frame on a basis of the sub-block motion vectors detected by the motion vector detection device.
A motion vector detection method according to a third aspect of the invention detects motion in a series of frames constituting a moving image. The motion vector detection method includes: a motion estimation step of dividing a frame of interest in the series of frames into a plurality of blocks, taking a frame temporally differing from the frame of interest in the series of frames as a reference frame, and estimating motion of each of the blocks between the frame of interest and the reference frame, thereby detecting block motion vectors; and a motion vector densifying step of generating a plurality of sub-blocks on a plurality of layers including a first layer to an N-th layer (N being an integer equal to or greater than 2) and generating a motion vector for each one of the sub-blocks, based on the block motion vectors. The motion vector densifying step includes: a first motion vector generation step of taking each block in the plurality of blocks as a parent block, generating a plurality of sub-blocks on the first layer from the parent block, and generating a motion vector for each of the plurality of sub-blocks on the first layer, based on the block motion vectors; a second motion vector generation step of generating, in the plurality of layers from the first to the N-th layer, a plurality of sub-blocks on each layer from the second to the N-th layer based on parent sub-blocks, the parent sub-blocks being the sub-blocks on a higher layer which is at one level higher than each layer, and for generating a motion vector for each of the plurality of sub-blocks on each of the layers from the second to the N-th layer, based on the motion vectors of the sub-blocks on the higher layer; and a correction step of, on at least one layer to be corrected among the first to the N-th layers, taking each of the plurality of sub-blocks on the layer to be corrected as a sub-block to be corrected, and correcting the motion vector of the sub-block to be corrected so as to minimize 
a sum of distances between the motion vector of the sub-block to be corrected and motion vectors belonging to a set including the motion vector of the sub-block to be corrected and motion vectors of neighboring sub-blocks located in an area surrounding the sub-block to be corrected. The second motion vector generation step uses the corrected motion vectors to generate the motion vector of each of the plurality of sub-blocks in the layer following the layer to be corrected.
A frame interpolation method according to a fourth aspect of the invention includes the motion estimation step and the motion vector densifying step of the motion vector detection method according to the third aspect, and a step of generating an interpolated frame on a basis of the sub-block motion vectors detected in the motion vector densifying step.
According to the present invention, the lowering of pixel motion vector estimation accuracy due to the effects of periodic spatial patterns and noise appearing in the image can be restricted.
Embodiments of the invention will now be described with reference to the attached drawings.
As schematically shown in
As the method of detecting motion vectors MV0(1), MV0(2), MV0(3), . . . , (motion vectors MV0), the known block matching method may be used. With the block matching method, in order to evaluate the degree of correlation between a reference block RBf and the block of interest CB0, an evaluation value based on the similarity or dissimilarity between these two blocks is determined. Various methods of calculating the evaluation value have been proposed. In one method that can be used, the absolute values of the block-to-block differences in the brightness values of individual pixels are calculated and summed to obtain a SAD (Sum of Absolute Differences), which is used as the evaluation value. The smaller the SAD, the greater the similarity between the compared blocks (in other words, the smaller their dissimilarity).
Ideally, the range searched to find the reference block RBf covers the entire reference frame Fa, but since it requires a huge amount of computation to calculate the evaluation value for all locations, it is preferable to search in a restricted range centered on the position corresponding to the position of the block of interest CB0 in the frame.
This embodiment uses the block matching method as a preferred but non-limiting method of detecting motion vectors; that is, it is possible to use an appropriate method other than the block matching method. For example, instead of the block matching method, the motion estimator 120 may use a known gradient method (e.g., the Lucas-Kanade method) to generate block motion vectors MV0 at high speed.
The motion vector densifier 130 hierarchically subdivides each of the blocks MB(1), MB(2), MB(3), . . . , thereby generating first to N-th layers of sub-blocks (N being an integer equal to or greater than 2). The motion vector densifier 130 also has the function of generating a motion vector for each sub-block on each layer.
In the example in
Depending on the size and reduction ratio of a sub-block, in some cases the size (the number of horizontal pixels and the number of vertical pixels) does not take an integer value. In such cases, the digits after the decimal point may be rounded down or rounded up. In some cases, sub-blocks generated by subdivision of different parent blocks (or sub-blocks) may overlap in the same frame. Such cases can be dealt with by selecting one of the parent blocks (or sub-blocks) and selecting the sub-blocks generated from the selected parent.
The basic operations of the hierarchical processing sections 1331 to 133N are all the same. The process in the hierarchical processing section 133k will now be described in detail, using the blocks MB(1), MB(2), . . . processed in the first hierarchical processing section 1331 as 0-th layer sub-blocks SB0(1), SB0(2), . . . .
In the motion vector generator 134k, the candidate vector extractor 142k takes sub-blocks SBk(1), SBk(2), SBk(3), . . . one by one in turn as the sub-block of interest CBk, and extracts at least one candidate vector CVk for the sub-block of interest CBk from the set of motion vectors of the sub-blocks SBk−1(1), SBk−1(2), SBk−1(3), . . . on the higher layer which is at one level higher than the k-th layer. The extracted candidate vector CVk is sent to the evaluator 143k.
After that, the candidate vector extractor 142k selects a group of sub-blocks in an area surrounding the parent sub-block SBk−1(i) on the (k−1)-th layer (step S14), and places the motion vectors of the sub-blocks in this group in the candidate vector set Vk(j) (step S15).
Next, the candidate vector extractor 142k determines whether or not the sub-block number j has reached the total number Nk of sub-blocks belonging to the k-th layer (step S16). If the sub-block number j has not reached the total number Nk (No in step S16), the sub-block number j is incremented by 1 (step S17) and the process returns to step S11. When the sub-block number j reaches the total number Nk (Yes in step S16), the candidate vector extraction process ends.
Not all of the sub-blocks SBk−1(a) to SBk−1(h) neighboring the parent sub-block SBk−1(i) need be selected in step S14. Furthermore, this embodiment is also workable in cases in which sub-blocks surrounding but not adjacent to sub-block SBk−1(i) are selected, or cases in which a sub-block is selected from another frame temporally adjacent to the frame Fb to which the parent sub-block SBk−1(i) belongs (e.g., a sub-block at a position corresponding to the position of sub-block SBk−1(i) in the other frame).
In step S14, sub-blocks may also be selected from an area other than the area adjacent in eight directions to the parent sub-block SBk−1(i). For example, as shown in
Furthermore, the reduction ratio α is not limited to 1/2.
After the candidate vector is selected as described above, the evaluator 143k extracts reference sub-blocks RB with coordinates (Xr+CVx, Yr+CVy) at positions shifted from the position (Xr, Yr) in the reference frame Fa corresponding to the position pos=(Xc, Yc) of the sub-block of interest CBk by the candidate vectors CVk. Here, CVx and CVy are the horizontal pixel direction component (X component) and vertical pixel direction component (Y component) of the candidate vectors CVk, and the size of the reference sub-block RB is identical to the size of the sub-block of interest CBk. For example, as shown in
In addition, the evaluator 143k calculates the similarity or dissimilarity of each pair of sub-blocks consisting of an extracted reference sub-block RB and the sub-block of interest CBk, and based on the calculation result, it determines the evaluation value Ed of the candidate vector. For example, the sum of absolute differences (SAD) between the pair of blocks may be calculated as the evaluation value Ed. In the example in
On the basis of the evaluation values, the motion vector determiner 144k now selects the most likely motion vector from the candidate vector set Vk(j) as the motion vector MVk of the sub-block of interest CBk (=SBk(j)). The motion vector MVk is output to the next stage via the output unit 145k.
The motion vector determiner 144k can select the motion vector by using the following expression (1).
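Expression (1) is not reproduced in this text; from the definitions given in the next paragraph, it can be reconstructed as the following (a reconstruction, with the candidate vector set written Vk as in the surrounding text):

```latex
\mathrm{SAD}(v_i) = \sum_{pos \in B} \bigl| f_a(pos + v_i) - f_b(pos) \bigr|,
\qquad
v_t = \operatorname*{arg\,min}_{v_i \in V_k} \mathrm{SAD}(v_i)
\tag{1}
```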
Here, vi is a candidate vector belonging to the candidate vector set Vk; fa(x) is the value of a pixel in the reference frame Fa indicated by a position vector x; fb(x) is the value of a pixel in the frame of interest Fb indicated by a position vector x; B is a set of position vectors indicating positions in the sub-block of interest; pos is a position vector belonging to set B. SAD(vi) is a function that outputs the sum of the absolute differences between a pair of sub-blocks, namely a reference sub-block and the sub-block of interest; arg min (SAD(vi)) gives the vi (=vt) that minimizes SAD(vi).
In this way, the motion vector MVk(=vt) most likely to represent the true motion can be selected on the basis of the SAD. Alternatively, the evaluation value Ed may be calculated by using a definition differing from the SAD definition.
Next the motion vector corrector 137k in
The motion vector corrector 137k has a filtering function that takes each of the sub-blocks SBk(1), . . . , SBk(Nk) on the k-th layer in turn as the sub-block of interest and corrects its motion vector on the basis of the motion vectors of the neighboring sub-blocks located in the area surrounding the sub-block of interest. When an erroneous motion vector MVk is output from the motion vector generator 134k, this filtering function can prevent the erroneous motion vector MVk from being transmitted to the hierarchical processing section 133k+1 in the next stage, or to the output unit 138.
When the motion vector of the sub-block of interest clearly differs from the motion vectors of the sub-blocks in its surrounding area, use of a smoothing filter could be considered in order to eliminate the anomalous motion vector and smooth the distribution of sub-block motion vectors. However, the use of a smoothing filter might produce a motion vector representing non-existent motion.
If the motion vector of the sub-block of interest is erroneously detected as (9, 9) and the motion vectors of the eight sub-blocks neighboring the sub-block of interest are all (0, 0), for example, a simple smoothing filter (an averaging filter which takes the arithmetic average of multiple motion vectors) with an application range (filter window) of 3 sub-blocks×3 sub-blocks would output the vector (1, 1) for the sub-block of interest. This output differs from the more likely value (0, 0), and represents non-existent motion. In frame interpolation and super-resolution, it is preferable to avoid output of vectors not present in the surrounding area.
The motion vector corrector 137k in this embodiment therefore has a filtering function that sets the motion vector of the sub-block of interest (sub-block to be corrected) and the motion vectors of the sub-blocks in the application range (filter window), including sub-blocks surrounding the sub-block of interest, as correction candidate vectors vc, selects a correction candidate vector vc with a minimum sum of distances from the motion vectors of the surrounding sub-blocks and the motion vector of the sub-block of interest, and replaces the motion vector of the sub-block of interest with the selected correction candidate vector. Various mathematical concepts of the distance between two motion vectors are known, such as Euclidean distance, Manhattan distance, Chebyshev distance, etc.
This embodiment employs Manhattan distance as the distance between the motion vectors of the surrounding sub-blocks and the motion vector of the sub-block of interest. With Manhattan distance, the following expression (2) can be used to generate a new motion vector vn of the sub-block of interest.
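Expression (2) is not reproduced in this text; from the definitions given in the next paragraph, it can be reconstructed as the following (a reconstruction using the notation of the surrounding text):

```latex
\mathrm{dif}(v_c) = \sum_{v_i \in V_f} \bigl( |x_c - x_i| + |y_c - y_i| \bigr),
\qquad
v_n = \operatorname*{arg\,min}_{v_c \in V_f} \mathrm{dif}(v_c)
\tag{2}
```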
In the above, vc is a correction candidate vector; Vf is a set consisting of the motion vectors of the sub-blocks in the filter window; xc, yc are respectively a horizontal pixel direction component (X component) and a vertical pixel direction component (Y component) of the correction candidate vector vc; xi, yi are respectively an X component and a Y component of a motion vector vi belonging to the set Vf; dif(vc) is a function that outputs the sum of the Manhattan distances between motion vectors vc and vi; arg min(dif(vc)) gives the vc that minimizes dif(vc) as the correction vector vn. Selecting the correction vector vn from the correction candidate vectors vc belonging to the set Vf in this way reliably avoids generating a motion vector representing non-existent motion as a correction vector. An optimization process may be carried out, such as weighting the motion vectors of the sub-blocks as a function of their position in the filter window. For some spatial distributions of the motion vectors of the sub-blocks within the filter window, however, the process of calculating the correction vector vn may be executed without the requirement that the correction candidate vector vc must belong to the set Vf.
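The correction described above (a selection sometimes called a vector median filter) can be sketched as follows. The 3×3 window and the example values (9, 9) and (0, 0) follow the averaging-filter discussion above; the function name is chosen for illustration only.

```python
def correct_vector(window):
    """Select, from the motion vectors in the filter window, the vector
    whose sum of Manhattan distances to all vectors in the window is
    smallest (expression (2))."""
    def dif(vc):
        return sum(abs(vc[0] - xi) + abs(vc[1] - yi) for (xi, yi) in window)
    return min(window, key=dif)

# An erroneous vector (9, 9) surrounded by eight (0, 0) vectors is
# replaced by (0, 0); a 3x3 averaging filter would instead output
# (1, 1), a vector representing non-existent motion.
window = [(0, 0)] * 8 + [(9, 9)]
```

Because the output is always a member of the window, no vector absent from the surrounding area can be produced, which is the property the text relies on for frame interpolation and super-resolution.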
After that, the motion vector corrector 137k determines whether or not the sub-block number i has reached the total number Nk of sub-blocks belonging to the k-th layer (step S25); if the sub-block number i has not reached the total number Nk (No in step S25), the sub-block number i is incremented by 1 (step S26), and the process returns to step S21. When the sub-block number i reaches the total number Nk (Yes in step S25), the motion vector correction process ends.
As described above, each hierarchical processing section 133k generates higher density motion vectors MVk based on the motion vectors MVk−1 input from the previous stage, and outputs them to the next stage. The hierarchical processing section 133N in the final stage outputs pixel motion vectors MVN as the motion vectors MV.
As described above, the motion vector densifier 130 in the first embodiment hierarchically subdivides each of the blocks MB(1), MB(2), . . . , thereby generating multiple layers of sub-blocks SB1(1), SB1(2), . . . , SB2(1), SB2(2), . . . , SB3(1), SB3(2), . . . , while generating motion vectors MV1, MV2, . . . , MVN in stages, gradually increasing the density of the motion vectors as it advances to higher layers in the hierarchy. Accordingly, it is possible to generate dense motion vectors MV that are less affected by noise and periodic spatial patterns occurring in the image.
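The staged densification summarized above can be sketched as follows. This is an illustrative Python sketch under simplifying assumptions not taken from the source: a reduction ratio of 1/2, candidate vectors drawn only from the parent block and its eight neighbors, SAD evaluation, and no separate motion vector correction step.

```python
import numpy as np

def densify_once(frame_b, frame_a, mvs, block):
    """One densification stage: split each block into four half-size
    sub-blocks; each sub-block adopts, from among the motion vectors of
    its parent block and the parent's eight neighbors, the candidate
    with the smallest SAD against the reference frame."""
    h, w = frame_b.shape
    sub = block // 2
    new_mvs = {}
    for (bx, by) in mvs:
        # Candidate vector set: parent block and its eight neighbors.
        cands = set()
        for ny in (by - block, by, by + block):
            for nx in (bx - block, bx, bx + block):
                if (nx, ny) in mvs:
                    cands.add(mvs[(nx, ny)])
        for sy in (by, by + sub):
            for sx in (bx, bx + sub):
                tgt = frame_b[sy:sy + sub, sx:sx + sub].astype(np.int32)
                best, best_sad = None, None
                for (dx, dy) in cands:
                    ry, rx = sy + dy, sx + dx
                    if ry < 0 or rx < 0 or ry + sub > h or rx + sub > w:
                        continue  # reference sub-block must lie inside the frame
                    ref = frame_a[ry:ry + sub, rx:rx + sub].astype(np.int32)
                    sad = int(np.abs(tgt - ref).sum())
                    if best_sad is None or sad < best_sad:
                        best, best_sad = (dx, dy), sad
                new_mvs[(sx, sy)] = best if best is not None else (0, 0)
    return new_mvs
```

Applying such a stage repeatedly halves the sub-block size each time, so after enough stages the motion vectors reach one-pixel granularity; because each sub-block can only choose among vectors inherited from the layer above, an isolated erroneous block vector tends to be outvoted by the surrounding candidates.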
The motion vectors MV1, MV2, . . . , MVN determined on the multiple layers are corrected by the motion vector correctors 1371 to 137N, so in each stage, it is possible to prevent erroneous motion vectors from being transferred to the next stage. Accordingly, motion vectors (pixel motion vectors) MV with high estimation accuracy can be generated from the block motion vectors MV0.
The motion vector densifier 130 as shown in
Next, a second embodiment of the invention will be described.
The motion vector detection device 20 has input units 200a, 200b, and 200c to which three temporally consecutive frames Fa, Fb, and Fc among a series of frames forming a moving image are input, respectively. The motion vector detection device 20 also has a motion estimator 220 for detecting block motion vectors MV0 from the input frames Fa, Fb, and Fc, a motion vector densifier 230 for generating pixel motion vectors MV (with one-pixel precision) based on the block motion vectors MV0, and an output unit 250 for output of the motion vectors MV. The function of the motion vector densifier 230 is identical to the function of the motion vector densifier 130 in the first embodiment.
The motion estimator 220 divides the frame of interest Fb into multiple blocks (of, for example, 8×8 pixels or 16×16 pixels) MB(1), MB(2), MB(3), . . . , as shown in
As the method of detecting the motion vector Mvf or Mvb, the known block matching method can be used as in the first embodiment. With the block matching method, in order to evaluate the degree of correlation between the pair of reference blocks RBf and RBb and the block of interest CB0, an evaluation value based on their similarity or dissimilarity is determined. In this embodiment, a value obtained by adding the similarity between the reference block RBf and the block of interest CB0 to the similarity between the reference block RBb and the block of interest CB0 can be used as the evaluation value, or a value obtained by adding the dissimilarity between the reference block RBf and the block of interest CB0 to the dissimilarity between the reference block RBb and the block of interest CB0 can be used as the evaluation value. To reduce the amount of computation, the reference blocks RBf and RBb are preferably searched for in a restricted range centered on the position corresponding to the position of the block of interest CB0 in the frame.
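The bidirectional evaluation described above can be sketched as follows. This illustrative Python sketch assumes equal frame spacing and linear motion at constant velocity, places the backward reference block in Fa and the forward reference block in Fc at point-symmetric offsets, and omits bounds checks; the sign convention and names are assumptions, not taken from the source.

```python
import numpy as np

def bidirectional_sad(fa, fb, fc, bx, by, dx, dy, bs=8):
    """Evaluation value for candidate motion (dx, dy) of the block of
    interest in frame Fb: the SAD against the backward reference block
    in Fa is added to the SAD against the forward reference block in
    Fc, the two reference blocks lying point-symmetric about the block
    of interest."""
    cb = fb[by:by + bs, bx:bx + bs].astype(np.int32)
    rb_b = fa[by - dy:by - dy + bs, bx - dx:bx - dx + bs].astype(np.int32)
    rb_f = fc[by + dy:by + dy + bs, bx + dx:bx + dx + bs].astype(np.int32)
    return int(np.abs(cb - rb_b).sum() + np.abs(cb - rb_f).sum())
```

A candidate that matches the true motion drives both SAD terms toward zero, whereas a block that happens to match in only one direction is penalized by the other term, which is why the three-frame evaluation is more robust than a two-frame one.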
Frames Fa, Fb, and Fc need not be spaced at equal intervals on the temporal axis. If the spacing is unequal, the reference blocks RBf and RBb are not point-symmetric with respect to the block of interest CB0. It is desirable to define the positions of the reference blocks RBf and RBb on the assumption that the block of interest CB0 moves in a straight line at a constant velocity. However, if frames Fa, Fb, and Fc straddle the timing of a great change in motion, the motion estimation accuracy is very likely to be lowered, so the time intervals ta-tb and tb-tc are preferably short and the difference between them is preferably small.
As described above, the motion vector detection device 20 in the second embodiment uses three frames Fa, Fb, Fc to generate motion vectors MV0 with high estimation accuracy, so the motion vector densifier 230 can generate dense motion vectors MV with higher estimation accuracy than in the first embodiment.
The motion estimator 220 in this embodiment carries out motion estimation based on three frames Fa, Fb, Fc, but alternatively, the configuration may be altered to carry out motion estimation based on four frames or more.
Next, a third embodiment of the invention will be described.
The motion vector detection device 30 has input units 300a and 300b to which temporally distinct first and second frames Fa and Fb are input, respectively, from among a series of frames forming a moving image. The motion vector detection device 30 also has a motion estimator 320 that detects block motion vectors MVA0 and MVB0 from the input first and second frames Fa and Fb, a motion vector densifier 330 that generates pixel motion vectors MV (with one-pixel precision) based on the motion vectors MVA0 and MVB0, and an output unit 350 for external output of these motion vectors MV.
As schematically shown in
As the method of detecting the motion vectors MVA0, MVB0, the known block matching method may be used. For example, when a sum of absolute differences (SAD) representing the dissimilarity of a sub-block pair is used, the motion vector with the least SAD can be detected as the first motion vector MVA0, and the motion vector with the next least SAD can be detected as the second motion vector MVB0.
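Ranking the candidates by SAD as described above can be sketched as follows; an illustrative Python sketch with assumed names, reusing the exhaustive SAD search over a restricted range.

```python
import numpy as np

def two_best_vectors(frame_b, frame_a, bx, by, bs=8, search=4):
    """Return the motion vectors with the smallest and second-smallest
    SAD for the block of interest, corresponding to the first and
    second highest-ranking motion vectors MVA0 and MVB0."""
    h, w = frame_b.shape
    block = frame_b[by:by + bs, bx:bx + bs].astype(np.int32)
    scored = []
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = by + dy, bx + dx
            if ry < 0 or rx < 0 or ry + bs > h or rx + bs > w:
                continue
            ref = frame_a[ry:ry + bs, rx:rx + bs].astype(np.int32)
            scored.append((int(np.abs(block - ref).sum()), (dx, dy)))
    scored.sort(key=lambda t: t[0])
    return scored[0][1], scored[1][1]  # first- and second-ranking vectors
```

Keeping the runner-up as well as the winner lets later stages recover when a repetitive pattern makes the smallest-SAD vector the wrong one.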
Like the motion vector densifier 130 in the first embodiment, the motion vector densifier 330 subdivides each of the blocks MB(1), MB(2), . . . , thereby generating first to N-th layers of sub-blocks. On the basis of the block motion vectors MVA0 and MVB0, the motion vector densifier 330 then generates the two motion vectors ranking highest in order of reliability for each sub-block on each of the layers except the N-th layer, which is the final stage, and generates the motion vector MV with the highest reliability on the N-th (final-stage) layer. Here the reliability of a motion vector is determined from the similarity or dissimilarity between the sub-block of interest and the reference sub-block used to detect the motion vector. The higher the similarity of the sub-block pair (in other words, the lower its dissimilarity), the higher the reliability of the motion vector.
The basic operations of the hierarchical processing sections 3331 to 333N are all the same. The processing in the hierarchical processing sections 3331 to 333N will now be described in detail, using the blocks MB(1), MB(2), . . . processed in the first hierarchical processing section 3331 as 0-th layer sub-blocks SB0(1), SB0(2), . . . .
The candidate vector extractor 342k takes sub-blocks SBk(1), SBk(2), . . . one by one in turn as the sub-block of interest CBk, and extracts a candidate vector CVAk for the sub-block of interest CBk from the set of first-ranking motion vectors MVAk−1 of the sub-blocks SBk−1(1), SBk−1(2), . . . on the higher layer which is at one level higher than the current layer. At the same time, the candidate vector extractor 342k extracts a candidate vector CVBk for the sub-block of interest CBk from the set of second-ranking motion vectors MVBk−1 of the sub-blocks SBk−1(1), SBk−1(2), . . . on the higher layer which is at one level higher than the current layer. The extracted candidate vectors CVAk and CVBk are sent to the evaluator 343k. The method of extracting the candidate vectors CVAk and CVBk is the same as the extraction method used by the candidate vector extractor 142k (
After the candidate vectors CVAk, CVBk are extracted, the evaluator 343k extracts a reference sub-block from the reference frame by using candidate vector CVAk, and calculates an evaluation value Eda based on the similarity or dissimilarity between this reference sub-block and the sub-block of interest CBk. At the same time, the evaluator 343k extracts a reference sub-block from the reference frame by using candidate vector CVBk, and calculates an evaluation value Edb based on the similarity or dissimilarity between this reference sub-block and the sub-block of interest CBk. The method of calculating the evaluation values Eda, Edb is the same as the method of calculating the evaluation value Ed used by the evaluator 143k (
On the basis of the evaluation values Eda, Edb, the motion vector determiner 344k then selects, from the candidate vectors CVAk, CVBk, a first motion vector MVAk with highest reliability and a second motion vector MVBk with next highest reliability. These motion vectors MVAk, MVBk are output via output units 345Ak, 345Bk, respectively, to the next stage. In the last stage, however, the motion vector determiner 344N in the hierarchical processing section 333N selects the motion vector MV with the highest reliability from among the candidate vectors CVAN, CVBN supplied from the preceding stage.
The motion vector corrector 337k in
As set forth above, based on the pairs of two highest-ranking motion vectors MVAk−1, MVBk−1 input from the previous stage, each hierarchical processing section 333k generates motion vectors MVAk, MVBk with higher density and outputs them to the next stage. The hierarchical processing section 333N outputs motion vectors with the highest reliability as the pixel motion vectors MV.
As described above, the motion vector densifier 330 in the third embodiment hierarchically subdivides each of the blocks MB(1), MB(2), . . . , thereby generating sub-blocks SB1(1), SB1(2), . . . , SB2(1), SB2(2), . . . , SBN(1), SBN(2), . . . on multiple layers, and generates motion vectors MVA1, MVB1, MVA2, MVB2, . . . , MVAN−1, MVBN−1, MV in stages, gradually increasing the density of the motion vectors as it advances to higher layers in the hierarchy. Accordingly, it is possible to generate dense motion vectors MV that are less affected by noise and periodic spatial patterns occurring in the image.
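The staged densification described above can be illustrated by a deliberately minimal one-dimensional sketch (purely hypothetical: real layers are two-dimensional, carry two ranked vectors per sub-block, and refine each child against the reference frame rather than merely inheriting):

```python
def densify(block_vectors, num_layers):
    # Minimal 1-D illustration of stage-by-stage densification: each layer
    # splits every (sub-)block in two; each child starts from its parent's
    # vector as a candidate. The per-child refinement step that the real
    # device performs against the reference frame is omitted here.
    vectors = list(block_vectors)
    for _ in range(num_layers):
        vectors = [v for v in vectors for _ in range(2)]  # children inherit
    return vectors
```

After N layers the motion field is 2^N times denser than the block-level field it started from.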
The motion vectors MVA1, MVB1, MVA2, MVB2, . . . , MVAN−1, MVBN−1, MV determined on the multiple layers are corrected by the motion vector correctors 3371 to 337N, so in each stage, it is possible to prevent erroneous motion vectors from being transferred to the next stage. Accordingly, dense motion vectors (pixel motion vectors) MV with high estimation accuracy can be generated from the block motion vectors MVA0, MVB0.
In addition, as described above, the motion estimator 320 detects the two highest-ranking motion vectors MVA0, MVB0 for each of the blocks MB(1), MB(2), . . . , and each hierarchical processing section 333k (k=1 to N−1) in the motion vector densifier 330 also generates the two highest-ranking motion vectors MVAk, MVBk for each of the sub-blocks SBk(1), SBk(2), . . . . This enables the motion vector determiner 344k in
As shown in
The motion estimator 320 and the hierarchical processing sections 333k (k=1 to N−1) each generate the two highest-ranking motion vectors, but this is not a limitation. The motion estimator 320 and hierarchical processing sections 333k may each generate three or more motion vectors ranking highest in order of reliability.
The motion estimator 320 in this embodiment detects block motion vectors MVA0, MVB0 based on two frames Fa, Fb, but alternatively, like the motion estimator 220 in the second embodiment, it may detect motion vectors MVA0, MVB0 based on three or more frames.
Next, a fourth embodiment of the invention will be described.
The motion vector detection device 40 has input units 400a, 400b to which temporally distinct first and second frames Fa, Fb among a series of frames forming a moving image are input, respectively, and a motion estimator 420 that detects block motion vectors MVA0, MVB0 from the input first and second frames Fa, Fb. The motion estimator 420 has the same function as the motion estimator 320 in the third embodiment.
The motion vector detection device 40 also has a motion vector densifier 430A for generating pixel motion vectors MVa (with one-pixel precision) based on the motion vectors MVA0 of highest reliability, a motion vector densifier 430B for generating pixel motion vectors MVb based on the motion vectors MVB0 of next highest reliability, a motion vector selector 440 for selecting one of these candidate vectors MVa, MVb as a motion vector MV, and an output unit 450 for external output of motion vector MV.
Like the motion vector densifier 130 in the first embodiment, the motion vector densifier 430A has the function of hierarchically subdividing each of the blocks MB(1), MB(2), . . . derived from the frame of interest Fb, thereby generating first to N-th layers of multiple sub-blocks, and generating a motion vector for each sub-block on each layer based on block motion vectors MVA0. The other motion vector densifier (sub motion vector densifier) 430B, also like the motion vector densifier 130 in the first embodiment, has the function of hierarchically subdividing each of the blocks MB(1), MB(2), . . . derived from the frame of interest Fb, thereby generating first to N-th layers of multiple sub-blocks, and generating a motion vector for each sub-block on each layer based on the block motion vectors MVB0.
The motion vector selector 440 selects one of the candidate vectors MVa, MVb as the motion vector MV, and externally outputs the motion vector MV via the output unit 450. For example, the one of the candidate vectors MVa, MVb that has the higher reliability, based on the similarity or dissimilarity between the reference sub-block and the sub-block of interest, may be selected, although this is not a limitation.
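One possible realization of this selection is sketched below, assuming SAD as the reliability measure (the text leaves the measure open, and the function names are hypothetical):

```python
def sad(ref_frame, cur_frame, block, vec):
    # Sum of absolute differences between the sub-block of interest and the
    # reference sub-block pointed to by the candidate vector.
    y, x, h, w = block
    vy, vx = vec
    return sum(abs(cur_frame[y + r][x + c] - ref_frame[y + vy + r][x + vx + c])
               for r in range(h) for c in range(w))

def select_vector(ref_frame, cur_frame, block, mv_a, mv_b):
    # Pick whichever densified candidate matches better, i.e. has the
    # smaller evaluation value; ties favour the first-ranking MVa.
    return mv_a if (sad(ref_frame, cur_frame, block, mv_a)
                    <= sad(ref_frame, cur_frame, block, mv_b)) else mv_b
```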
As described above, the motion vector detection device 40 in the fourth embodiment detects the two highest-ranking motion vectors MVA0, MVB0 for each of the blocks MB(1), MB(2), . . . and generates two dense candidate vectors MVa, MVb, so it can output whichever of the candidate vectors MVa, MVb has the higher reliability as motion vector MV. As in the third embodiment, it is possible to prevent the loss of information on motion in multiple directions that may be present in each of the blocks MB(1), MB(2), . . . . Accordingly, the motion vector estimation accuracy can be further improved, as compared with the first embodiment.
The motion estimator 420 generates the two highest-ranking motion vectors MVA0, MVB0, but this is not a limitation. The motion estimator 420 may generate M motion vectors (M being an integer equal to or greater than three) ranking highest in order of reliability. In this case, it is only necessary to provide M motion vector densifiers for generating M densified candidate vectors from the M motion vectors.
Next a fifth embodiment of the invention will be described.
As shown in
After that, the candidate vector extractor 172a in the candidate vector extractor 172k detects the relative position of the sub-block of interest CBk with respect to the sub-block SBk−1(i) on the layer one level higher than the current layer (step S13A). For example, in the example in
Next, the candidate vector extractor 172k selects a group of sub-blocks in the area surrounding the parent sub-block SBk−1(i) on the (k−1)-th layer by using the relative position detected in step S13A (step S14M), and places the motion vectors of the sub-blocks in this group in the candidate vector set Vk(j) (step S15). For example, in the example in
After step S15, the candidate vector extractor 172k determines whether or not the sub-block number j has reached the total number Nk of sub-blocks belonging to the k-th layer (step S16); if the sub-block number j has not reached the total number Nk (No in step S16), the sub-block number j is incremented by 1 (step S17), and the process returns to step S11. When the sub-block number j reaches the total number Nk (Yes in step S16), the candidate vector extraction process ends.
As described above, the candidate vector extractor 172k can use the detection result from the candidate vector extractor 172a to select, from among the sub-blocks located in the area surrounding the parent sub-block SBk−1(i) of the sub-block of interest CBk, a sub-block that is spatially near the sub-block of interest CBk (step S14M). Accordingly, compared with the candidate vector extraction process (
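The relative-position shortcut can be sketched as follows, assuming a quadrant-style subdivision and a rectangular grid of parent-layer sub-blocks (the exact neighborhood shape is illustrative, not the patent's definition):

```python
def candidate_parents(parent_index, quadrant, grid_w, grid_h):
    # Given the parent's index in a grid_w x grid_h grid of layer-(k-1)
    # sub-blocks and the quadrant ('tl', 'tr', 'bl', 'br') that the
    # sub-block of interest occupies inside its parent, return the parent
    # plus the neighbors adjoining that quadrant, i.e. the spatially
    # nearest donors of candidate vectors.
    px, py = parent_index % grid_w, parent_index // grid_w
    dx = -1 if quadrant in ('tl', 'bl') else 1   # left or right neighbor
    dy = -1 if quadrant in ('tl', 'tr') else 1   # upper or lower neighbor
    picks = [(px, py), (px + dx, py), (px, py + dy), (px + dx, py + dy)]
    return [y * grid_w + x for x, y in picks
            if 0 <= x < grid_w and 0 <= y < grid_h]
```

Because only the parents on the side of the occupied quadrant are consulted, the candidate set stays small while remaining centred on the sub-block of interest.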
The structure of the motion vector densifier 160 in this embodiment is applicable to the motion vector densifiers 230, 330, 430A, and 430B in the second, third, and fourth embodiments.
Next, a sixth embodiment of the invention will be described.
As shown in
The frame buffer 11 outputs a video signal 14 representing a series of frames forming a moving image to the motion vector detection device 60 two or three frames at a time. The motion vector detection device 60 generates pixel motion vectors MV (with one-pixel precision) based on the video signal 14 read from the frame buffer 11, and outputs them to the interpolator 12.
The interpolator 12 is operable to use the data 15 of temporally consecutive frames read from the frame buffer 11 to generate interpolated frames between these frames (by either interpolation or extrapolation) based on dense motion vectors MV. An interpolated video signal 16 including the interpolated frames is externally output via the output unit 3.
The position of interpolated pixel Pi corresponds to the position of pixel Pk on frame Fk as moved by motion vector MVi = (Vxi, Vyi). The following equations hold for the X component and Y component of motion vector MVi.
Vxi = Vx·(1 − Δt2/ΔT)
Vyi = Vy·(1 − Δt2/ΔT)
In the above, ΔT = Δt1 + Δt2, so the factor (1 − Δt2/ΔT) equals Δt1/ΔT, the fraction of the frame interval elapsed at the interpolated frame. The pixel value of the interpolated pixel Pi may be the pixel value of pixel Pk on the frame Fk.
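These equations can be checked numerically; the helper below (hypothetical name) scales the detected vector (Vx, Vy) by the elapsed fraction of the frame interval:

```python
def interpolated_vector(vx, vy, dt1, dt2):
    # Scale the frame-to-frame motion vector (Vx, Vy) to the time position
    # of the interpolated frame: with dT = dt1 + dt2, the factor
    # (1 - dt2/dT) equals dt1/dT, the fraction of the interval already
    # elapsed at the interpolated frame.
    dT = dt1 + dt2
    factor = 1 - dt2 / dT
    return vx * factor, vy * factor
```

For an interpolated frame exactly midway between the two input frames (dt1 = dt2), each component is simply halved.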
The interpolation method is not limited to the linear interpolation method; other interpolation methods suitable to pixel motion may be used.
As described above, the frame interpolation device 1 in the sixth embodiment can perform frame interpolation by using the dense motion vectors MV with high estimation accuracy generated in the motion vector detection device 60, so image disturbances, such as block noise in the boundary parts of an object occurring in an interpolated frame, can be suppressed and interpolated frames of higher image quality can be generated.
In order to generate an interpolated frame Fi with higher resolution, the frame buffer 11 may be operable to convert the resolution of each of the frames included in the input video signal 13 to higher resolution. This enables the frame interpolation device 1 to output a video signal 16 of high image quality with a high frame rate and high resolution.
All or part of the functions of the motion vector detection device 60 and interpolator 12 may be realized by hardware structures, or by computer programs executed by a microprocessor.
The frame buffer 11 in
Embodiments of the invention have been described above with reference to the drawings, but these are examples illustrating the invention, and various other embodiments can also be employed. For example, in the final output in the first to fifth embodiments, all motion vectors have one-pixel precision, but this is not a limitation. The structure of each of the embodiments may be altered to generate motion vectors MV with non-integer pixel precision, such as half-pixel precision, quarter-pixel precision, or 1.5-pixel precision.
In the motion vector densifier 130 in the first embodiment, as shown in
There are no particular limitations on the method of assigning sub-block numbers j to the sub-blocks SBk(j); any assignment method may be used.
Number | Date | Country | Kind
---|---|---|---
2010-256818 | Nov 2010 | JP | national

Filing Document | Filing Date | Country | Kind | 371(c) Date
---|---|---|---|---
PCT/JP2011/073188 | 10/7/2011 | WO | 00 | 5/1/2013