This invention is directed to picture building in motion compensated video processing.
Many contemporary standards conversion and other video processing systems employ motion compensation in order to improve the quality of the output pictures. In such systems, it is a typical requirement for new output pictures to be interpolated from original input pictures. Motion compensation assigns motion vectors to the pixels of the input pictures, and these vectors are used to project the original pixels to “build” the output picture.
It is an object of the present invention to provide techniques for improving the quality of the output pictures of such systems.
Accordingly, the invention consists in one aspect in a method of motion compensated combination of two pictures of an input picture sequence to form an output picture at a temporal location between the two input pictures, comprising: projecting input pixels from the input pictures to locations on the output picture using motion vectors assigned to those input pixels; counting the number of vectors from each input picture which point to a given pixel location on the output picture; and employing this count in controlling the mix of the pixels projected by those vectors used to produce the output pixel at the given pixel location.
The inventors have thus recognized that counting the number of vector “hits” at a particular output pixel location gives important information relating to the quality of the eventual output of the motion compensation process. Using this count to control the process therefore results in significant advances in quality.
Preferably, the method comprises employing a non-linear function of the count in controlling said mix.
In one form of the invention, the method comprises, where a plurality of vectors from one of the input pictures point to the given pixel location, assigning lower weight to the respective pixels of those vectors from that input picture for construction of the pixel at the given location. In another form, the method uses an average of the respective pixels of those vectors as the contribution to the output pixel from that input picture.
In still another form, the method comprises, where a plurality of vectors point to the given pixel location, taking a median of the vectors, and using the vector closest to the median for construction of the output pixel.
In another aspect, the invention provides a method of motion compensated combination of two pictures of an input picture sequence to form an output picture at a temporal location between the two input pictures, comprising: projecting input pixels from the input pictures to locations on the output picture using motion vectors assigned to those input pixels; and mixing the respective pixels projected by the vectors onto the output picture to produce an output pixel at a given location, wherein, where a plurality of vectors from one of the input pictures project onto said given pixel location, giving increased weighting in controlling the mix to the respective pixels of vectors forming substantially conjugate pairs.
The invention will now be described by way of example with reference to the accompanying drawings, in which:
FIGS. 1 to 3 are diagrams illustrating the function of picture building in a typical motion compensated system; and
In motion compensated standards conversion, the process of picture building is typically important, the accuracy of the process greatly affecting the quality of the output images or pictures. The input pictures are typically in the form of video fields or frames, though of course, any type of input picture sequence may be employed in the embodiments described. Motion compensated picture building techniques are known to the art, and therefore the basic principles will not be discussed in detail here, though some description of the problems commonly arising follows.
In a picture building procedure, as illustrated in
In order to derive information illustrating the motion occurring between input images of an image sequence, a motion measurement process (of which the phase correlation technique is preferred) is performed on the input images. The resulting motion vectors are assigned to pixels or groups of pixels in the input image.
In the case illustrated in
The above example, however, is merely a simple case where a single vector from each frame may be mapped to the required point. In other cases, there may not be a single vector, or there may be multiple vectors pointing to the output pixel position.
In other cases there may not be a vector pointing to the output point from either side, in which case there is simply a hole in the output frame.
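The counting of vector hits at each output location can be sketched as follows. This is an illustrative sketch only: the function name `count_hits`, the dictionary representation of the vector field, and the parameter `alpha` (the fraction of each vector travelled to reach the output picture) are assumptions made for the example and are not taken from the described system.

```python
from collections import defaultdict

def count_hits(vectors, alpha, width, height):
    """Project pixels of one input picture onto the output picture.

    vectors: dict mapping an input pixel (x, y) to its motion vector (dx, dy).
    alpha:   assumed fraction of the vector travelled to reach the output
             picture (for example 0.5 for a midway output picture).
    Returns a dict mapping each output location to the list of input
    pixels whose vectors land there; the list length is the hit count.
    """
    hits = defaultdict(list)
    for (x, y), (dx, dy) in vectors.items():
        # Land on the nearest integer output location.
        tx = round(x + alpha * dx)
        ty = round(y + alpha * dy)
        if 0 <= tx < width and 0 <= ty < height:
            hits[(tx, ty)].append((x, y))
    return hits

# Two vectors land on (1, 0): a double hit. Nothing lands on (2, 0): a hole.
field = {(0, 0): (2, 0), (1, 0): (0, 0), (3, 0): (4, 0)}
hits = count_hits(field, 0.5, width=4, height=1)
```

In this sketch the same routine would be run once for each input picture, with the sign and size of `alpha` chosen to suit the direction of projection.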
A prior method of picture building, as disclosed in EP 0,648,398, handles such situations in the following manner. If there is a single vector hit from one frame at the output pixel, the resulting projection of the pixel from that frame is assigned a weighting value of 1. If there is a double hit, each vector is given a weighting of 1, giving an overall weighting for that frame or “side” (of the output position) of 2. Greater numbers of hits increase the total weighting thus. However, if there is no vector hit, the “confidence” in that frame is taken as zero; this therefore prevents the eventual mix of the output pixel taking any information from that frame or side which gave a zero hit result.
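By way of illustration, one reading of the weighting just described may be sketched as below. The function name `prior_mix` and the use of a simple weighted average are assumptions made for the example; the sketch is not taken from the cited document.

```python
def prior_mix(prev_pixels, next_pixels):
    """Mix projected pixels with a weight of 1 per vector hit.

    prev_pixels / next_pixels: values projected onto one output location
    from the previous and following frames (one entry per hit).
    A side with no hits contributes nothing; None marks a hole.
    """
    total = len(prev_pixels) + len(next_pixels)
    if total == 0:
        return None
    return (sum(prev_pixels) + sum(next_pixels)) / total

# A double hit on one side carries twice the weight of the single hit
# on the other side: (100 + 100 + 200) / 3.
value = prior_mix([100, 100], [200])
```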
The inventors have recognized that a more sophisticated treatment of picture building which measures where multiple and zero hits occur can bring significant benefit over this prior technique in the quality of the output pictures.
In embodiments, the invention provides a system which identifies the occurrence of such “non-single hits” in the picture building process. The techniques described in the following apply the resulting counts to new methods of picture building which give the previously unexpected result of greatly increasing output picture quality.
In one embodiment, if there is any number of hits, from either of the input frames, which is not equal to one, the input from that frame is simply ignored. Thus in the case illustrated in
This method may also be implemented in a “softer” version. For example, where a multiple hit occurs, the system may nevertheless include some proportion, say 10%, of the offending vectors' source pixels in constructing the output pixel. This would be of particular use in cases where there are no hits on one side, and multiple hits on the other; at least some of the pixels from those vectors which would otherwise be ignored may be used for the output pixel.
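The embodiment in which non-single hits are ignored, together with its “softer” version, might be sketched as follows. The names `side_weight` and `mix_pixel`, the parameter `soft`, the averaging of the pixels of a multiple hit, and the normalisation of the weights are all assumptions made for the purpose of the example.

```python
def side_weight(hit_count, soft=0.0):
    """Weight for one side: 1 for exactly one hit, `soft` for a
    multiple hit, 0 for no hit."""
    if hit_count == 1:
        return 1.0
    if hit_count > 1:
        return soft
    return 0.0

def mix_pixel(prev_pixels, next_pixels, soft=0.0):
    """Mix the two sides; None means a fallback is required."""
    sides = []
    for pixels in (prev_pixels, next_pixels):
        w = side_weight(len(pixels), soft)
        if w > 0:
            # Assumed: a multiple hit contributes the average of its pixels.
            sides.append((w, sum(pixels) / len(pixels)))
    total = sum(w for w, _ in sides)
    if total == 0:
        return None
    return sum(w * v for w, v in sides) / total

hard = mix_pixel([100], [200, 220])            # double hit ignored
soft = mix_pixel([100], [200, 220], soft=0.1)  # double hit admitted at 10%
```

With `soft` set to zero, a location having no hits on one side and a multiple hit on the other returns `None`; with `soft` at 0.1 the same location is built from the multiple hit alone.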
In most cases, the system will employ some sort of “fallback” mode, in order to prevent failure, or allow a “hole” to appear in the output frame where there are no hits from either side.
An example of a process performed by stages 406 and 416 will now be described briefly with reference to
Returning now to
The output from mixer 410 is passed to a first input of a further mixer 422. The second input to mixer 422 is a “fall back frame” which is provided by stage 424, which selects the input frame which is temporally closest to the output frame. Mixer 422 is controlled by controller 426 which, similar to comparison stage 420, receives the two prediction of quality signals for the respective forward and backward projected candidate frames. Controller 426 selects the greater of the two input signals which provides an overall prediction of quality for the output of mixer 410. This overall prediction of quality signal is used to control the proportions of input signals which are mixed at mixer 422 to produce the output 424.
Thus the previous frame is forward projected, and the following frame back projected to an intermediate temporal location, and the projections are mixed in dependence upon measurements of the number of hits arising on either side. Separate “predictions of quality”, dependent upon hit count, are derived for the previous and following frames, and these are compared to control the projection mix. For example, if a single hit is registered for a given pixel, the PoQ is high, whereas if a zero or multiple hit is registered, the PoQ is low.
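The two-stage mix described above may be illustrated for a single output pixel as follows. This is a sketch under stated assumptions: the binary PoQ values, the ratio used to set the projection mix, and the names `prediction_of_quality` and `build_pixel` are hypothetical; only the use of the greater PoQ to control the mix with the fallback frame follows the description.

```python
def prediction_of_quality(hit_count):
    """Assumed PoQ: high (1.0) for a single hit, low (0.0) otherwise."""
    return 1.0 if hit_count == 1 else 0.0

def build_pixel(fwd, bwd, fwd_hits, bwd_hits, fallback):
    """fwd / bwd: forward and backward projected candidate pixels.
    fallback: pixel from the temporally closest input frame."""
    q_f = prediction_of_quality(fwd_hits)
    q_b = prediction_of_quality(bwd_hits)
    # First mixer: the two PoQs are compared to set the projection mix
    # (the ratio form used here is an assumption).
    k = q_f / (q_f + q_b) if (q_f + q_b) > 0 else 0.5
    projected = k * fwd + (1.0 - k) * bwd
    # Second mixer: the greater PoQ is the overall prediction of quality
    # and controls the proportion of projected versus fallback pixel.
    overall = max(q_f, q_b)
    return overall * projected + (1.0 - overall) * fallback

both_single = build_pixel(100, 200, 1, 1, fallback=50)   # equal mix
one_bad = build_pixel(100, 200, 1, 2, fallback=50)       # forward only
none_good = build_pixel(100, 200, 0, 3, fallback=50)     # fallback only
```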
In an alternative embodiment, the median of all vectors pointing to a given pixel on the output frame is taken. A number of options are then available: the closest vector to the median is taken, and the other vectors rejected; in a case where there is simply a double hit on one side, the offending vector is rejected as an outlier, as the other two vectors are closer to the median; fractions of the various vectors are taken, according to their proximity to the median. These approaches may be effective in cases where a plurality of spurious vectors produce the multiple hits.
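The first of these options, selection of the vector closest to the median, may be sketched as below. The component-wise median and the squared-distance measure are assumptions made for the example, as is the function name `closest_to_median`; the vectors from both sides are assumed to be expressed in a common sign convention.

```python
from statistics import median

def closest_to_median(vectors):
    """Return the vector nearest the component-wise median of all
    vectors pointing to one output location."""
    mx = median(v[0] for v in vectors)
    my = median(v[1] for v in vectors)
    return min(vectors, key=lambda v: (v[0] - mx) ** 2 + (v[1] - my) ** 2)

# Two vectors broadly agree and the third is spurious; the spurious
# vector lies far from the median and is not selected.
chosen = closest_to_median([(2, 0), (2, 1), (9, -7)])
```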
In a further embodiment, the confidence assigned to the vector hits on one “side” of the output frame position is normalised. Thus if there is a double hit on one side, the contribution to the mix may be ¼ of each pixel in the double hit, and ½ of the pixel on the other side.
In a still further embodiment, where there are multiple hits, the vectors on the “multiple hit side” are compared with those on the other side. If one vector is the conjugate (or near conjugate) of one of the vectors on the other side, as in
In the embodiments described above, hit counts are generally described as integer values. In alternatives, if a phase correlation process is implemented to sub-pixel accuracy, then a more sophisticated approach is possible. The hit count becomes an accumulation over an area of non-integer hit values, rather than a simple count of vectors pointing to an integer value. Such “soft” hit counts may be processed as in any of the preceding methods in order to provide an output pixel.
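One possible way of accumulating such “soft” hit counts is sketched below. Bilinear distribution of each sub-pixel landing position over the four surrounding output locations is an assumption made for the example, as is the function name `accumulate_soft_hits`; other accumulation areas could equally be used.

```python
import math

def accumulate_soft_hits(landings, width, height):
    """landings: sub-pixel (x, y) positions at which vectors land on
    the output picture. Returns a height x width grid of non-integer
    hit counts."""
    counts = [[0.0] * width for _ in range(height)]
    for x, y in landings:
        x0, y0 = math.floor(x), math.floor(y)
        fx, fy = x - x0, y - y0
        # Assumed bilinear split of one hit over the four neighbours.
        for dx, dy, w in ((0, 0, (1 - fx) * (1 - fy)),
                          (1, 0, fx * (1 - fy)),
                          (0, 1, (1 - fx) * fy),
                          (1, 1, fx * fy)):
            xi, yi = x0 + dx, y0 + dy
            if w > 0 and 0 <= xi < width and 0 <= yi < height:
                counts[yi][xi] += w
    return counts

# One vector lands exactly on x = 1 and another a quarter pixel to its
# right, so location 1 accumulates 1.75 and location 2 accumulates 0.25.
grid = accumulate_soft_hits([(1.0, 0.0), (1.25, 0.0)], width=3, height=1)
```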
In general, certain fallback options are required where zero hits or spurious vectors occur. For example, if vectors on either side produce an inequality or disagreement, the system may take the vector from the closest frame to the output temporal position. Where the hit count is zero on both sides, “holes” occur in the output frame. In such cases, “hole filling” or copying of pixels from either frame may be implemented. In other cases, the system may use the fallback picture, as in
In the above description of certain embodiments of the invention, the example of the projection of two input pictures onto an output picture location is used. It should be noted that aspects of the invention are equally applicable to techniques in which more than two input pictures, and their respective pixels and assigned vectors, are used to create the output picture. Here, notwithstanding the methods described for weighting pixels in particular ways, the proportions of pixels used in the final mix may depend to a greater extent upon the distance of the input picture in question from the temporal location of the output picture.
It will be appreciated by those skilled in the art that the invention has been described by way of example only, and that a wide variety of alternative approaches may be adopted. In particular, the various methods described may be used in conjunction, in a variety of advantageous combinations.
Number | Date | Country | Kind |
---|---|---|---|
0221160.5 | Sep 2002 | GB | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/GB03/03961 | 9/12/2003 | WO | | 10/25/2005 |