The present invention relates to interpolation methods used in digital video signal processing, in particular in frame rate conversion (FRC) applications.
In video processing products using FRC, an input stream of video frames is received with a certain time sampling rate. The FRC process converts such input stream of “original frames” into an output stream having a different time sampling rate, including “interpolated frames” with intermediate time positions, i.e. falling between time positions of the original frames.
Every pixel of the interpolated frames is typically computed as a combination of pixels from the original frames, following a motion vector or direction of invariance determined by analyzing the input stream. More generally, the combination of pixels can make use of one or more interpolation vectors identified by analyzing the evolution of the scene in the video sequence.
Several combination modes can be provided for the interpolation and selected depending on a context determined when analyzing the input stream. In particular, a “fallback mode” for combining the original pixels is sometimes available in the FRC process, which is different from the general operation mode. The fallback mode discards most of the information about the scene (such as object motions, for instance) and performs the interpolation by simply blending pixel values of original frames at the same location as the computed pixel. This amounts to interpolating along a zero motion vector.
The fallback mode is often used on scenes or parts of scenes which are too complex for the general operation mode to interpolate correctly. In other words, it is used where the FRC engine decides that it is better to leave motion blur or judder instead of introducing unnatural artifacts that will catch the viewer's eye.
Switching between the general operation mode and the fallback mode is very noticeable because it often affects large portions of the image. Although the actual difference between the results of the two modes may be small, the switch itself can be easily perceived by the end viewer as a sudden jump in the output video sequence. If such switching happens often and/or for large portions of the image, it can be easily pointed out by the end user as a significant artifact of the video processing product using the FRC process.
There is thus a need for interpolation methods with a reduced impact of jumping artifacts.
A video interpolation method is proposed, comprising: analyzing evolution of a scene represented in a video sequence of input frames; and computing output pixels of an output frame having a time position intermediate between time positions of the input frames, by combining respective input pixels of the input frames. At least three interpolation modes are provided for computing the output pixels. The interpolation modes include:
The second mode is a general operation mode based on motion detection or another kind of scene evolution analysis performed in the analysis step, while the first mode is a fallback mode used when the results of the scene analysis are not considered reliable. The third mode uses intermediate interpolation vectors to provide smooth transitions in intermediate situations in a way which corresponds naturally to the properties of human vision. It is available to avoid jumping artifacts when switching between the general mode and the fallback mode.
In the third mode, the amplitude of the interpolation vectors is gradually changed, which is a much better solution than a simple linear blending between pixel values respectively obtained by the general operation mode and the fallback mode. Human vision acts in such a way that as soon as the fallback mode solution becomes “visible” (i.e. has an interpolation weight substantially different from zero in a linear blending method), it is perceived as a separate scene despite its relative transparency. Thus, blending pixel values alters the smooth transition effect and fails to provide the expected visual result.
The third interpolation mode is typically selected for an output pixel position in a transition phase between use of the first mode and use of the second mode or vice versa for computing output pixels at an output pixel position in successive output frames. The third mode can also be used when the reliability of the scene analysis is degraded but not to the point of switching to the fallback mode. Different ways of handling the transition phases can be implemented.
In an embodiment, a transition degree is determined for an output pixel, and the interpolation mode is selected for said output pixel based on the transition degree. In such a case, when the third interpolation mode is selected for the output pixel, the second interpolation vector may be determined as a function of the transition degree.
In particular, the transition degree can be 0 for the first interpolation mode, 1 for the second interpolation mode and a number between 0 and 1 for the third interpolation mode. For example, when the third interpolation mode is selected for the output pixel, each second interpolation vector may be determined as (vx, vy)=(vxD,vyD)+β·[(v0x, v0y)−(vxD,vyD)], where (vxD,vyD) is the default interpolation vector, (v0x, v0y) is a respective first interpolation vector determined for the output pixel when analyzing scene evolution, and β denotes the transition degree (0≦β≦1). Alternatively, the second interpolation vector can be (vx, vy)=(vxD,vyD)+Min{1,β·vxM/|v0x|,β·vyM/|v0y|}·[(v0x,v0y)−(vxD,vyD)] when the third interpolation mode is selected for the output pixel, where v0x and v0y are coordinates of said first interpolation vector along two directions of the frames, vxM and vyM are preset positive parameters, and Min{a,b,c} represents the smallest of three numbers a, b, c.
In an implementation of the interpolation method, determining a transition degree for an output pixel comprises:
In a typical, non-exclusive embodiment, the first interpolation vectors are motion vectors determined for the output pixels by analyzing motion in the step of analyzing evolution of the scene represented in the video sequence. A possibility for the default interpolation vector used in the first and third modes is to take it as a zero motion vector.
Another aspect of the invention relates to a video interpolation apparatus, comprising:
Other features and advantages of the method and apparatus disclosed herein will become apparent from the following description of non-limiting embodiments, with reference to the appended drawings.
The apparatus represented in
Only one value per pixel is considered in the following explanations (red color channel, for instance). However, it will be appreciated that the same mechanism can be applied to all channels.
The choice of the pixels to be combined, as well as the way they are combined, depends on the contents of the closest original frames, the spatial coordinates (x, y) of the interpolated pixel, the time position of the interpolated frame (τ), and possibly other factors (such as global cadence phase, general information about the scene type, etc.).
In the following, the notation I(x, y, t+τ) represents the value of a pixel at spatial coordinates (x, y) and at a time position t+τ with t integer and 0≦τ<1. It can represent a value of an original pixel from an input frame (τ=0), or a value of an interpolated pixel of an output frame otherwise (0<τ<1).
An output pixel of an interpolated frame is computed by a combination such as:
I(x,y,t+τ)=Fx,y,t,τ[I(x1,y1,t1),I(x2,y2,t2), . . . ,I(xn,yn,tn)] (1)
where (xi, yi) denote the spatial coordinates of a pixel of an input frame at time ti which contributes to the combination, and Fx,y,t,τ is an interpolation function having n pixel values as arguments. The function Fx,y,t,τ can be different for every interpolated output pixel. Both spatial and temporal coordinates of the contributing pixels (xi, yi, ti) can be different for every interpolated pixel as well.
The interpolation function Fx,y,t,τ and the space and time coordinates of the contributing pixels are referred to as an “interpolation set” {Fx,y,t,τ, x1, y1, t1, . . . , xn, yn, tn}. A respective interpolation set is determined for each pixel of the interpolated frame.
The interpolation function Fx,y,t,τ is often a simple linear combination, i.e. a weighted sum of its arguments:
I(x,y,t+τ)=p1(x,y,t,τ)·I(x1,y1,t1)+ . . . +pn(x,y,t,τ)·I(xn,yn,tn) (2)
using n weights pi(x, y, t, τ) such that p1(x, y, t, τ)+ . . . +pn(x, y, t, τ)=1.
For example, in directional interpolation, the linear blending is performed along a single direction which corresponds to a motion vector (vx, vy) detected at the current point (x, y, t+τ). The interpolated pixel value can be a weighted sum of only two pixels, the first one taken from the adjacent original frame in the past, and the second one from the adjacent original frame in the future. In terms of equation (2), n=2, t1=t, t2=t+1. So, equation (2) may be reduced to:
I(x,y,t+τ)=Fd[x,y,t,τ,vx,vy]=p1(x,y,t,τ)·I(x−vx·τ,y−vy·τ,t)+p2(x,y,t,τ)·I(x+vx·(1−τ),y+vy·(1−τ),t+1) (3)
In their simplest form, the relative weights p1(x, y, t, τ), p2(x, y, t, τ) of the two pixels are given by the temporal distance between the interpolated frame and the corresponding original frames, i.e. p1(x, y, t, τ)=1−τ and p2(x, y, t, τ)=τ. In order to take into account occlusion information, it can be useful to apply different relative weights, e.g. (p1, p2)=(1, 0) or (0, 1), in an occluded or disoccluded zone depending on object motion (see, for example, international patent application No. PCT/EP2010/050744).
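As a minimal sketch of the two-pixel directional interpolation of equation (3), the Python function below blends one pixel from frame t and one pixel from frame t+1 along a motion vector. The names `sample` and `directional_interpolation`, the list-of-rows frame layout, and the nearest-pixel rounding with border clamping are illustrative assumptions and are not part of the described method.

```python
import math

def sample(frame, x, y):
    """Return frame[y][x], rounding to the nearest pixel and clamping at borders.

    Rounding and clamping are assumptions made for this sketch only.
    """
    height, width = len(frame), len(frame[0])
    xi = min(max(int(math.floor(x + 0.5)), 0), width - 1)
    yi = min(max(int(math.floor(y + 0.5)), 0), height - 1)
    return frame[yi][xi]

def directional_interpolation(prev_frame, next_frame, x, y, tau, vx, vy, weights=None):
    """Equation (3): weighted sum of two pixels taken along (vx, vy)."""
    # Default weights follow the temporal distances: p1 = 1 - tau, p2 = tau.
    p1, p2 = weights if weights is not None else (1.0 - tau, tau)
    past = sample(prev_frame, x - vx * tau, y - vy * tau)
    future = sample(next_frame, x + vx * (1.0 - tau), y + vy * (1.0 - tau))
    return p1 * past + p2 * future

# A bright pixel moves two positions to the right between frames t and t+1.
prev_frame = [[0, 100, 0, 0, 0]]
next_frame = [[0, 0, 0, 100, 0]]
halfway = directional_interpolation(prev_frame, next_frame, x=2, y=0, tau=0.5, vx=2, vy=0)
```

With the correct motion vector, the interpolated pixel halfway between the two positions keeps the full brightness of the moving pixel. Passing `weights=(1, 0)` or `(0, 1)` reproduces the occlusion-aware weighting mentioned above.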
In certain applications, directional interpolations along a number q>1 of motion vectors (vx,1, vy,1), . . . , (vx,q, vy,q) can be blended at the same point:
I(x,y,t+τ)=F1d[x,y,t,τ,vx,1,vy,1]+ . . . +Fqd[x,y,t,τ,vx,q,vy,q] (4)
The determination of one or more interpolation vectors, and possibly of some other information, for each interpolated output pixel involves analysis of the evolution of the scene represented in the input video sequence. Such analysis makes it possible to determine an interpolation set {Fx,y,t,τ, x1, y1, t1, . . . , xn, yn, tn} for each output pixel (x, y, t+τ) of an interpolated frame.
In certain instances, the determined interpolation set {Fx,y,t,τ, x1, y1, t1, . . . , xn, yn, tn} may be simply expressed as one or more motion vectors (v0x, v0y) representing the local displacement of an object to which the pixel (x, y, t+τ) is expected to belong in the scene. Different known means can be applied to determine such motion vectors (see, among others, WO 2009/087493 or the above-mentioned application No. PCT/EP2010/050744). The present interpolation method is not dependent on the details of the specific scene evolution analysis scheme which is applied.
In the embodiment illustrated in
For example, the analyzer 10 may perform motion detection using the well-known block matching technique. Detecting the motion at (x, y, t+τ) then consists in minimizing a matching energy Ex,y,t+τ(vx, vy) over a window W which is a set of offsets d=(dx, dy). A possible form of the matching energy is the Euclidean distance
Then, we can take
(v0x,v0y)=argmin Ex,y,t+τ(vx,vy) over (vx,vy)∈Ω
as the detected motion vector at (x, y, t+τ), where the minimization is over a predefined set of candidate vectors Ω. If the confidence data α is expressed as a scalar value, it may be computed as a function of Ex,y,t+τ(v0x, v0y), for example normalized with respect to the signal energy in the local window, i.e. α being proportional to
The confidence data α may be determined by a thresholding operation based on Ex,y,t+τ(v0x, v0y) or λ, for example α=0 (low confidence) if the threshold is exceeded and α=1 (high confidence) if Ex,y,t+τ(v0x, v0y) or λ is below the threshold. The confidence data α may further take other discrete values between 0 and 1 if multiple thresholds are used, the higher values of α meaning higher confidence levels in the motion detection.
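The block matching and thresholding steps can be sketched as follows. This is only an illustration under stated assumptions: the matching energy is taken to be a sum of squared differences between frame t and frame t+1 along the candidate vector, the confidence uses a single threshold applied directly to the minimum energy, and the names `sample`, `matching_energy`, `detect_motion` and `confidence` are hypothetical.

```python
import math

def sample(frame, x, y):
    """Return frame[y][x] with nearest-pixel rounding and border clamping (assumed)."""
    height, width = len(frame), len(frame[0])
    xi = min(max(int(math.floor(x + 0.5)), 0), width - 1)
    yi = min(max(int(math.floor(y + 0.5)), 0), height - 1)
    return frame[yi][xi]

def matching_energy(prev_frame, next_frame, x, y, tau, v, window):
    """Assumed energy: sum of squared differences along candidate vector v."""
    vx, vy = v
    energy = 0.0
    for dx, dy in window:
        a = sample(prev_frame, x + dx - vx * tau, y + dy - vy * tau)
        b = sample(next_frame, x + dx + vx * (1.0 - tau), y + dy + vy * (1.0 - tau))
        energy += (a - b) ** 2
    return energy

def detect_motion(prev_frame, next_frame, x, y, tau, candidates, window):
    """Return the candidate of the set minimizing the energy, and that minimum."""
    scored = [(matching_energy(prev_frame, next_frame, x, y, tau, v, window), v)
              for v in candidates]
    best_energy, best_vector = min(scored, key=lambda item: item[0])
    return best_vector, best_energy

def confidence(energy, threshold):
    """Single-threshold confidence: 0 if the threshold is exceeded, 1 otherwise."""
    return 0.0 if energy > threshold else 1.0

# A bright pixel moves two positions to the right between frames t and t+1.
prev_frame = [[0, 0, 100, 0, 0, 0, 0]]
next_frame = [[0, 0, 0, 0, 100, 0, 0]]
window = [(-2, 0), (-1, 0), (0, 0), (1, 0), (2, 0)]
candidates = [(-2, 0), (0, 0), (2, 0)]
v0, e0 = detect_motion(prev_frame, next_frame, 3, 0, 0.5, candidates, window)
alpha = confidence(e0, threshold=1000.0)
```

Using several thresholds instead of one would yield the intermediate discrete confidence levels mentioned above.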
In the embodiment illustrated in
Based on the transition degree β, a correction may be applied to the interpolation vector (v0x, v0y) detected for the current output pixel by a smooth fallback correction module 12. After the correction, if any, an interpolation vector provided to the interpolator 13 is noted (vx, vy). From this vector, the interpolator 13 performs directional interpolation, for example according to (3), to obtain an interpolated value for the output pixel (x, y, t+τ). If more than one interpolation vector is provided, multidirectional interpolation according to (4) may be applied, or a more general interpolation scheme according to (1) or (2).
In the general operation mode illustrated in
Another interpolation mode of the FRC apparatus is the fallback mode, of which an embodiment is illustrated in
The default interpolation vector may be chosen in different ways. A possibility is to obtain it using, locally or globally, another scheme for analyzing scene evolution. In an embodiment considered more particularly in the following, it is simply taken as a zero motion vector, namely (vxD,vyD)=(0, 0). In this case, the interpolated pixel 2 combines only the values of the pixels from the temporally close input frames which have the same spatial coordinates (x, y) as the interpolated pixel itself. Those input pixels are aligned on a fallback direction line L which goes through the current output pixel 2.
In this case, the general interpolation equation (1) is modified as follows:
I(x,y,t+τ)=Fx,y,t,τ[I(x,y,t1),I(x,y,t2), . . . , I(x,y,tn)] (5)
The choice of the interpolation mode can be made independently for each interpolated pixel. However, quite often, large portions of the image or even whole frames change the interpolation mode simultaneously.
The original pixelwise choice of the interpolation sets, performed by the scene evolution analyzer 10, may also be changed by the module 12 in another interpolation mode referred to as “smooth fallback” mode. In the illustrated embodiment, the smooth fallback mode corresponds to 0<β<1. Each vector (v0x, v0y) determined for an output pixel by the scene evolution analyzer 10 can then be replaced by a smoothed interpolation vector (vx, vy), for example (vx, vy)=(vxD,vyD)+β·[(v0x, v0y)−(vxD,vyD)], which simply reduces the amplitude of the interpolation vector in proportion to β when (vxD,vyD)=(0, 0), i.e. (vx, vy)=β·(v0x, v0y).
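The replacement of a detected vector by a smoothed interpolation vector can be illustrated with a few lines of Python. The function name `smooth_vector` and the tuple representation of vectors are assumptions made for this sketch.

```python
def smooth_vector(v0, beta, v_default=(0.0, 0.0)):
    """Return (vx, vy) = vD + beta * (v0 - vD) for a transition degree in [0, 1]."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("transition degree must lie between 0 and 1")
    vx = v_default[0] + beta * (v0[0] - v_default[0])
    vy = v_default[1] + beta * (v0[1] - v_default[1])
    return (vx, vy)

# With a zero default vector the amplitude is simply scaled by beta.
quarter = smooth_vector((8.0, -4.0), 0.25)
# With a non-zero default vector the result lies between vD and v0.
midway = smooth_vector((6.0, -2.0), 0.5, v_default=(2.0, 2.0))
```

Setting beta to 1 returns the detected vector unchanged (general operation mode), and setting it to 0 returns the default vector (fallback mode).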
The smooth fallback controller 11 provides smooth fallback control data to the smooth fallback correction module 12. These data define what transition degree should be applied to the given image, portion of image, or individual pixels, and possibly certain classes of the interpolation sets. For instance, one may decide to apply the smooth fallback correction only to the interpolation sets corresponding to the objects having a film frame repetition cadence, but not to those having a video frame repetition cadence. Alternatively, such correction may be applied only to the interpolation sets which are considered doubtful for some reason. Information about such doubtful interpolation sets may be provided directly by the scene evolution analyzer 10 or may be computed as a function of properties of the interpolation set by the smooth fallback controller 11.
For a given transition degree β, the original interpolation set may be modified as follows:
In this case, the general interpolation equation (1) is modified as follows:
I(x,y,t+τ)=Fx,y,t,τ[I(x+β1·(x1−x),y+β1·(y1−y),t1), . . . ,I(x+βn·(xn−x),y+βn·(yn−y),tn)] (6)
for coefficients β1, β2, . . . βn lying between 0 and 1.
In the simple case of directional interpolation with one motion vector (v0x, v0y), relative weights given by the temporal distance between the interpolated frame and the corresponding original frames and (vxD,vyD)=(0, 0), equation (6) for the smooth fallback mode is reduced to:
I(x,y,t+τ)=(1−τ)·I(x−β1·v0x·τ,y−β1·v0y·τ,t)+τ·I(x+β2·v0x·(1−τ),y+β2·v0y·(1−τ),t+1) (7)
In other words, the originally detected motion vector (v0x, v0y) has its amplitude reduced to be replaced by (vx, vy)=β1·(v0x, v0y) for frame t and (vx, vy)=β2·(v0x, v0y) for frame t+1. The coefficients β1, β2 for the same motion vector (v0x, v0y) are typically equal.
The coefficients βi which control the transition level can be different for every contributing pixel, as defined by the smooth fallback control data. In an embodiment, these coefficients are all equal to the transition degree β determined by the smooth fallback controller 11: β1= . . . =βn=β. For β1= . . . =βn=1, the general interpolation mode is applied. For β1= . . . =βn=0, the fallback interpolation mode is applied. Intermediate βi values correspond to the smooth fallback mode.
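Equation (7) can be sketched in Python as below, assuming a zero default vector, weights given by the temporal distances, and nearest-pixel sampling with border clamping. The names `sample` and `smooth_fallback_interpolation` and the list-of-rows frame layout are hypothetical.

```python
import math

def sample(frame, x, y):
    """Return frame[y][x] with nearest-pixel rounding and border clamping (assumed)."""
    height, width = len(frame), len(frame[0])
    xi = min(max(int(math.floor(x + 0.5)), 0), width - 1)
    yi = min(max(int(math.floor(y + 0.5)), 0), height - 1)
    return frame[yi][xi]

def smooth_fallback_interpolation(prev_frame, next_frame, x, y, tau, v0, beta1, beta2=None):
    """Equation (7): directional interpolation along the vector scaled by beta1, beta2."""
    if beta2 is None:
        beta2 = beta1  # the two coefficients are typically equal
    v0x, v0y = v0
    past = sample(prev_frame, x - beta1 * v0x * tau, y - beta1 * v0y * tau)
    future = sample(next_frame, x + beta2 * v0x * (1.0 - tau), y + beta2 * v0y * (1.0 - tau))
    return (1.0 - tau) * past + tau * future

# A bright pixel moves four positions to the right between frames t and t+1.
prev_frame = [[0, 100, 0, 0, 0, 0]]
next_frame = [[0, 0, 0, 0, 0, 100]]
v0 = (4, 0)
general = smooth_fallback_interpolation(prev_frame, next_frame, 3, 0, 0.5, v0, 1.0)
fallback = smooth_fallback_interpolation(prev_frame, next_frame, 3, 0, 0.5, v0, 0.0)
intermediate = smooth_fallback_interpolation(prev_frame, next_frame, 2, 0, 0.5, v0, 0.5)
```

The same function covers the three modes: a coefficient of 1 gives the general operation mode, 0 gives the fallback mode, and any value in between gives the smooth fallback mode.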
When the interpolation mode is to be changed from general (
The duration of the transition, as well as the time profiles of the transition functions βi(t+τ), can be configured by the smooth fallback controller 11. For example, the degrees of transition βi can be the same for all the contributing pixels 6, 7, as shown in
The coefficients βi(t+τ) can also depend on the magnitude of the interpolation vectors (v0x, v0y) originally determined for the output pixel by the scene evolution analyzer 10. For instance, the vectors can be clamped gradually, starting from the largest ones. In this case, at the end all interpolation vectors converge to the fallback solution with the same absolute speed.
βi=Min{1,β·R0/∥(v0x,v0y)∥} (8)
where R0 is a preset clamping radius and ∥(v0x,v0y)∥ is some norm of the interpolation vector, e.g. ∥(v0x,v0y)∥=max{|v0x|,|v0y|} or √(v0x²+v0y²).
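The clamping coefficient of equation (8) can be computed as in the following sketch. The function name `clamp_coefficient`, the `norm` argument, and the choice of returning 1 for a zero vector (which needs no clamping) are assumptions of this example.

```python
import math

def clamp_coefficient(v0, beta, r0, norm="euclidean"):
    """Equation (8): Min{1, beta * R0 / ||v0||} for a preset clamping radius R0."""
    if norm == "max":
        magnitude = max(abs(v0[0]), abs(v0[1]))
    else:
        magnitude = math.hypot(v0[0], v0[1])
    if magnitude == 0:
        return 1.0  # assumed convention: a zero vector is left unchanged
    return min(1.0, beta * r0 / magnitude)

small = clamp_coefficient((3.0, 4.0), beta=0.5, r0=20.0)    # norm 5, inside the radius
large = clamp_coefficient((30.0, 40.0), beta=0.5, r0=20.0)  # norm 50, clamped
```

A vector whose norm exceeds β·R0 is scaled so that its norm becomes exactly β·R0, while shorter vectors are left untouched; the largest vectors are therefore the first to be reduced as β decreases.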
Reference can also be made to a rectangle 20 whose maximum dimensions along the spatial directions x and y are equal to 2·vxM and 2·vyM, respectively, as shown in
βi=Min{1,β·vxM/|v0x|,β·vyM/|v0y|} (9)
The parameters vxM, vyM may correspond to upper bounds of the coordinates of the candidate vectors of the set Ω as used by the scene evolution analyzer 10.
The interpolation can, for example, be performed as:
(vx,vy)=(vxD,vyD)+Min{1,β·vxM/|v0x|,β·vyM/|v0y|}·[(v0x,v0y)−(vxD,vyD)] (9′)
if default interpolation vectors (vxD,vyD) other than zero are available.
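The rectangle-based clamping of equations (9) and (9′) may be sketched as follows. The function name `rect_clamped_vector` is hypothetical, and skipping the ratio for a zero coordinate (where the quotient is undefined and no clamping is needed along that direction) is an assumption of this example.

```python
def rect_clamped_vector(v0, beta, vx_max, vy_max, v_default=(0.0, 0.0)):
    """Equation (9'): vD + Min{1, beta*vxM/|v0x|, beta*vyM/|v0y|} * (v0 - vD)."""
    terms = [1.0]
    if v0[0] != 0:
        terms.append(beta * vx_max / abs(v0[0]))
    if v0[1] != 0:
        terms.append(beta * vy_max / abs(v0[1]))
    coefficient = min(terms)  # this is the coefficient of equation (9)
    vx = v_default[0] + coefficient * (v0[0] - v_default[0])
    vy = v_default[1] + coefficient * (v0[1] - v_default[1])
    return (vx, vy)

clamped = rect_clamped_vector((16.0, 4.0), beta=0.5, vx_max=16.0, vy_max=8.0)
unchanged = rect_clamped_vector((2.0, 1.0), beta=0.5, vx_max=16.0, vy_max=8.0)
vertical = rect_clamped_vector((0.0, 8.0), beta=0.25, vx_max=16.0, vy_max=8.0)
```

With a zero default vector, the result always lies inside the rectangle scaled by β, and the direction of the original vector is preserved.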
The evolution in time of the transition degree β(t) for a given output pixel position can have various forms. Typically, it is a simple monotonic function which satisfies boundary conditions such as β(tfallback)=0, β(tgeneral
It is also possible to configure the evolution in time of the transition degree β(t) in a way that takes into account other parameters, such as system state history. This allows creating systems with hysteresis and/or systems with different transition reactivity and speed depending on whether the FRC engine switches from general interpolation mode to the fallback mode or vice versa.
This makes it possible to perform visually smooth FRC/fallback transitions, and to mask, to a certain extent, fallback transitions of short duration.
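One possible way to obtain a different transition speed in each direction is a per-frame rate limiter on the transition degree. This scheme is an assumption made for illustration and is not prescribed by the description; the function name `step_transition_degree` and the step sizes are hypothetical.

```python
def step_transition_degree(beta, target, step_down=0.25, step_up=0.125):
    """Move beta one output frame toward target with a direction-dependent speed."""
    if target < beta:
        # Going toward the fallback mode: faster reaction in this sketch.
        return max(target, beta - step_down)
    # Returning toward the general operation mode: slower recovery.
    return min(target, beta + step_up)

# Trace a switch to fallback followed by a return to the general mode.
beta = 1.0
toward_fallback = []
for _ in range(4):
    beta = step_transition_degree(beta, target=0.0)
    toward_fallback.append(beta)
toward_general = []
for _ in range(8):
    beta = step_transition_degree(beta, target=1.0)
    toward_general.append(beta)
```

In this sketch the transition to the fallback mode takes four output frames and the return to the general mode takes eight, so a brief drop of the target does not produce a visible jump.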
In an embodiment, the transition degree β(t) for a given pixel is obtained by analyzing the time evolution of the confidence data α determined for the pixel in question. For example, β(t) may be determined by low-pass filtering in time, e.g. as an average of α(t−q), . . . , α(t−1), α(t) for some integer q. Hence, when α=1 for a few frames (stable high confidence in the interpolation vectors), β=1 and the general operation mode is selected, while when α=0 for a few frames (lingering low confidence in the interpolation vectors), β=0 and the fallback mode is selected. In intermediate situations (fluctuating confidence), the smooth fallback mode is selected (0<β<1).
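The temporal averaging of the confidence data can be sketched with a sliding window. The class name `TransitionDegreeFilter`, the function `select_mode`, and the choice of averaging over the available samples before the window is full are assumptions of this example.

```python
from collections import deque

class TransitionDegreeFilter:
    """Average of the last q+1 confidence values alpha for one pixel position."""

    def __init__(self, q):
        self.history = deque(maxlen=q + 1)

    def update(self, alpha):
        self.history.append(float(alpha))
        # Before the window is full, average over the samples seen so far (assumed).
        return sum(self.history) / len(self.history)

def select_mode(beta):
    """Map the transition degree to one of the three interpolation modes."""
    if beta >= 1.0:
        return "general"
    if beta <= 0.0:
        return "fallback"
    return "smooth fallback"

# Confidence drops from 1 to 0 and stays low.
transition_filter = TransitionDegreeFilter(q=3)
betas = [transition_filter.update(a) for a in [1, 1, 1, 1, 0, 0, 0, 0]]
modes = [select_mode(b) for b in betas]
```

A sustained drop in confidence thus moves the pixel through the smooth fallback mode for q frames before the fallback mode is reached, instead of switching abruptly.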
The transition degree β may also be filtered spatially for more regularity over the frame.
However, if for some reason the interpolation mode is switched from the general FRC mode to the fallback mode after the first frame, this abrupt change would be immediately noticed, since the visual difference between the two modes of object tracking is significant. Direct application of the fallback mode would provide intervening frames at t+⅓ and t+⅔ as shown in
Use of a smooth fallback transition as described above is illustrated in
While a detailed description of exemplary embodiments of the invention has been given above, various alternatives, modifications, and equivalents will be apparent to those skilled in the art. Therefore the above description should not be taken as limiting the scope of the invention which is defined by the appended claims.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP2010/067849 | 11/19/2010 | WO | 00 | 9/13/2012 |
Number | Date | Country | |
---|---|---|---|
61313950 | Mar 2010 | US |