This application is the U.S. national phase of International Application No. PCT/GB2008/001276 filed 11 Apr. 2008, which designated the U.S. and claims priority to EP Application No. 07251908.5 filed 9 May 2007, the entire contents of each of which are hereby incorporated by reference.
The present invention relates to the analysis of video sequences for display on interlaced systems such as those according to BT.601 [1]. In such systems, the use of incorrect field ordering (top-field-first (TFF) or bottom-field-first (BFF)) can result in noticeable and objectionable effects in the display of both interlaced and progressive-type sequences.
According to the present invention there is provided a method of detecting field order of a video signal, comprising:
receiving successive digitally coded frames, each frame comprising data for a field of a first type and data for a field of a second type;
generating for each field of the first type:
Other aspects of the invention are set out in the claims.
Some embodiments of the invention will now be described, by way of example, with reference to the accompanying drawings, in which:
Here, a method is presented to detect field order problems in video content by the sliding-window comparison of a target field (top or bottom) with three neighbouring opposite-phase fields. For each analysed field, a number of measures are generated and used to make an “instant” classification of the frame containing the target field. These frame results are buffered and corresponding “windowed” decisions made. The “instant” and “windowed” results are used in a “combined” analysis technique to identify video field-ordering properties and the nature and location of potential field-ordering problems. Furthermore, the technique is enhanced by an “instant”-only method for identifying problem areas of frame-repeated progressive content.
A brief description of interlaced and progressive video properties is given in this section prior to the description of the analysis technique in Section 3.
A top-field-first interlaced video sequence may be represented as a sequence of alternating top and bottom fields as shown in
Temporally, the B fields should be half-way between the neighbouring T fields. Spatially, a B field will be offset vertically down by 1 line on the output display device and interlaced with the corresponding T field, as shown in
For PAL-625 system “I” standard-resolution display (720×576 active samples per frame), a visible picture would consist of TFF interlaced 288-line T and B fields with a field update rate of 50 Hz (frame-rate of 25 Hz). For NTSC-525 system “M” standard resolution display (720×480 active samples per frame), a visible picture would consist of BFF interlaced 240-line T and B fields with a field update rate of 59.94 Hz (frame-rate of 29.97 Hz).
Progressive video sequences have frames made up of lines from just one time-interval, such as generated by cine-film cameras and more recently digital video (DV) cameras. Considering fields presented for interlaced display, progressive sequences would have the temporal characteristics represented in
Frame-repeated progressive video, as illustrated in
There follows a brief consideration of how field-ordering properties of both interlaced and progressive content may cause problems for interlaced display.
The interlaced display of content with incorrect field-ordering can cause noticeable degradations. If BFF interlaced content is displayed TFF then a “juddering” effect may be visible within areas of motion. This problem is illustrated in
One might expect there not to be a problem with field-ordering for progressive content, as a pair of fields derived for interlaced playback from one progressive frame would be from the same time interval. No motion “juddering” should occur for either field-ordering in playback. However, there are a number of video editing and processing operations that can make the interlaced playback of progressive content sensitive to field ordering.
Frame-rate conversion, between cine-film (24 frames/s), PAL (25 frames/s) and NTSC (29.97 frames/s) can involve the addition and removal of fields from the original sequence. The insertion or deletion of a field within a frame causes subsequent frames to consist of fields from different time intervals, making interlaced display field-order sensitive. Field-level editing, such as cutting and fading, can also introduce such field-order sensitivity when applied to progressive content.
Another example of field-misaligned progressive content, as observed in real cartoon content, is illustrated in
Apart from the more persistent motion juddering, such field-level transitions can also cause visible problems at scene cuts. For the example shown in
There follows a description of a technique for the detection of potentially visible field-order problems within video content subjected to interlaced display. The technique operates on either a sequence of fields or frames in a pixel intensity representation and relies on a sliding comparison between fields from 3 successive frames.
The technique consists of two main processing paths handling “instant” (highlighted grey) and “windowed” parameter sets. The apparatus comprises two buffers, Dif_buf and iDFlag_buf, and ten function units Func1, Func2, etc. These could if desired be constructed as separate program-controlled processors, or if preferred the functions of two or more units could be performed by a single processor. A description of the function of each unit will now follow.
Each input video frame, video(n), may be considered to consist of an interlaced pair of top and bottom fields, T(n) and B(n). For each frame, three difference calculations are performed to indicate the match between the target field of one type, which is either the top or bottom field of the current frame, and the three second type, or opposite phase, fields of the previous, current and next frames. This process is illustrated for a bottom field target in
Using mean of absolute difference between pixel intensity values, the difference calculations may be defined as:
In (3.1) to (3.3), B(n,x,y) represents the pixel intensity value of pixel position x horizontal and y vertical in the bottom field of the n'th frame and X and Y are the total number of horizontal and vertical pixels respectively. T(n,x,y) represents the same pixel position in the corresponding top field.
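The mean-of-absolute-difference calculation described above can be sketched as follows. This is a minimal illustration rather than the definitive form of Equations 3.1 to 3.3: the function names are hypothetical, fields are represented as lists of rows of intensity values, and the assignment of DIF1, DIF2 and DIF3 to the previous, current and next frames respectively is an assumption based on the order in which the text lists them.

```python
def mean_abs_diff(field_a, field_b):
    """Mean of absolute pixel differences between two equally sized fields."""
    total = 0
    count = 0
    for row_a, row_b in zip(field_a, field_b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count


def field_differences(target, opp_prev, opp_cur, opp_next):
    """Return (DIF1, DIF2, DIF3) for a target field compared with the
    opposite-phase fields of the previous, current and next frames
    (assumed ordering)."""
    return (
        mean_abs_diff(target, opp_prev),
        mean_abs_diff(target, opp_cur),
        mean_abs_diff(target, opp_next),
    )


# Tiny 2x2 example fields with made-up intensity values.
b_n = [[10, 10], [10, 10]]      # bottom field of frame n (target)
t_prev = [[30, 30], [30, 30]]   # top field of frame n-1
t_cur = [[12, 12], [12, 12]]    # top field of frame n
t_next = [[8, 8], [8, 8]]       # top field of frame n+1
dif1, dif2, dif3 = field_differences(b_n, t_prev, t_cur, t_next)
```

In this toy example the target field matches the current and next top fields equally well and the previous top field poorly.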
It will be noted that the difference signals represented by Equations 3.1, 3.2 and 3.3 are differences between two fields which necessarily have a vertical spatial offset from one another which will therefore give rise to non-zero difference values even in the case where there is no motion. We have not found this to be a serious problem in practice, but reliability can be improved by applying one of the known techniques for alleviating this problem. Techniques to reject interlaced spatial artefacts, through methods such as “coring” and “diamond motion analysis” [2 . . . 9], would be of particular benefit in the matching of fields. In another approach, one can disregard individual difference values (i.e. before summation) that fall below a threshold, and/or evaluate the difference between a pixel and the corresponding pixel on both the line above it and the line below it on the other field, taking the smaller of the two.
The results quoted below used both these expedients, Equation 3.1 being replaced by
and similarly for Equations 3.2 and 3.3:
These compensated equations were used in the tests reported below.
If a top field is chosen as target then the equations are the same except that “T” and “B” are transposed.
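The two compensations described above (discarding small per-pixel differences, and taking the smaller of the differences to the lines above and below in the other field) might be sketched as below. This is an illustration under stated assumptions, not the compensated equations themselves: the name `compensated_diff`, the parameter `pixel_thresh`, the edge clamping and the normalisation by the full pixel count are all hypothetical. The line indexing assumes a bottom-field target, for which line y of the top field lies above and line y+1 lies below.

```python
def compensated_diff(target, other, pixel_thresh):
    """Mean absolute difference with two compensations applied per pixel:
    the smaller of the differences to the lines above and below is kept,
    and differences below pixel_thresh are disregarded."""
    height = len(target)
    width = len(target[0])
    total = 0
    for y in range(height):
        # Assumption: for a bottom-field target, line y of the other (top)
        # field is the line above and line y + 1 the line below.
        # The last line has no line below, so the index is clamped.
        y_below = min(y + 1, height - 1)
        for x in range(width):
            d_above = abs(target[y][x] - other[y][x])
            d_below = abs(target[y][x] - other[y_below][x])
            d = min(d_above, d_below)
            if d >= pixel_thresh:
                total += d
    # Assumption: normalise by the full pixel count, as in the plain case.
    return total / (width * height)


target = [[10, 10], [50, 50]]
static_other = [[10, 10], [48, 48]]      # differs only by small amounts
moving_other = [[100, 100], [100, 100]]  # differs strongly everywhere
```

With a threshold of 4, the nearly static pair yields zero, while the strongly differing pair still yields a large value.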
In this specification, references to “previous”, “next”, “following” or the like frames refer to the frames considered in capture-and-display order, even though a particular coding scheme may vary the order of transmission.
The set of 3 difference values, DIF1 . . . 3(n), are then entered into the difference buffer, where the previous K-1 sets of difference values are also stored. Func2 then performs time-averaging of the difference parameters according to equations (3.4) to (3.6).
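The buffering and time-averaging step can be sketched as follows. The class name `DifferenceBuffer` is hypothetical, and the use of an equal-weight mean over the stored sets (and over fewer than K sets while the buffer is filling) is an assumption, since other weightings would be possible.

```python
from collections import deque


class DifferenceBuffer:
    """Holds the K most recent (DIF1, DIF2, DIF3) sets and averages them."""

    def __init__(self, k):
        # A deque with maxlen discards the oldest set automatically.
        self.sets = deque(maxlen=k)

    def push(self, dif_set):
        self.sets.append(tuple(dif_set))

    def averaged(self):
        """Return (wDIF1, wDIF2, wDIF3) as plain means over the stored sets."""
        n = len(self.sets)
        return tuple(sum(s[i] for s in self.sets) / n for i in range(3))


buf = DifferenceBuffer(k=3)
for dif_set in [(9.0, 3.0, 3.0), (12.0, 3.0, 6.0), (6.0, 3.0, 3.0), (3.0, 3.0, 3.0)]:
    buf.push(dif_set)
w_dif = buf.averaged()  # averages only the three most recent sets
```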
For each frame, the averaged difference values wDIF1 . . . 3 are ranked in order of magnitude wRank1 . . . wRank3, such that wRank1 is equal to the maximum value and wRank3 the minimum. The variables wIndex1 . . . wIndex3 are set to the corresponding index numbers, so that if wDIF2 was the maximum and wDIF1 the minimum, then wIndex1 would equal 2, wIndex2 would equal 3 and wIndex3 would equal 1. Windowed decision parameters are then calculated according to equations (3.7) to (3.9)
wMotionFlag(n)=((wRank1(n)−wRank3(n)>MThresh1) AND(wRank1(n)>MThresh2)) (3.7)
wParam1(n)=100.0*(wRank2(n)−wRank3(n))/(wRank1(n)−wRank3(n)) (3.8)
wParam2(n)=100.0*wRank3(n)/wRank2(n) (3.9)
Possible settings for MThresh1 and MThresh2 are given in Table 3.15.
The motion parameter, wMotionFlag, indicates that there is sufficient motion for a meaningful field-ordering decision to be made. This parameter depends on comparisons of absolute and relative DIF values with appropriate thresholds according to equation (3.7). Parameter, wParam1, represents the difference between the 2nd and 3rd ranked DIF values as a percentage of the range of the DIF values and is intended to reflect the feature of two DIF values being nearly the same and significantly less than the remaining value. This parameter is based purely on differences and would be unaffected by an offset applied equally to all 3 difference values. Parameter wParam2 represents the ratio of the minimum to the middle DIF value as a percentage and is intended to reflect the overall significance of the difference between the two smallest DIF values.
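The ranking step and equations (3.7) to (3.9) can be sketched as below. The function name and the threshold values used in the example are hypothetical and are not taken from Table 3.15. Zeroing both parameters when the motion flag is not set is an assumption made here to avoid division by zero; the thresholds are also assumed to be non-negative.

```python
def rank_and_params(dif, m_thresh1, m_thresh2):
    """dif is (DIF1, DIF2, DIF3). Returns ranks (max first), the 1-based
    index of the DIF value at each rank, the motion flag and Param1/Param2."""
    order = sorted(range(3), key=lambda i: dif[i], reverse=True)
    rank = [dif[i] for i in order]      # rank[0] = maximum, rank[2] = minimum
    index = [i + 1 for i in order]      # 1-based DIF number for each rank
    motion = (rank[0] - rank[2] > m_thresh1) and (rank[0] > m_thresh2)
    if motion and rank[1] > 0:
        param1 = 100.0 * (rank[1] - rank[2]) / (rank[0] - rank[2])
        param2 = 100.0 * rank[2] / rank[1]
    else:
        # Assumption: parameters are zeroed when there is too little motion.
        param1 = param2 = 0.0
    return rank, index, motion, param1, param2


# The example from the text: DIF2 is the maximum and DIF1 the minimum.
rank, index, motion, param1, param2 = rank_and_params((2.0, 10.0, 6.0), 1.0, 1.0)
```

Here `index` reproduces the ordering given in the text (2, 3, 1), and `param1` is 50 because the middle value lies half-way between the minimum and the maximum.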
A decision on interlaced field-order properties for the n'th frame may be made by testing wParam1 and wParam2 against thresholds according to equations (3.10) and (3.11).
wIFlag1(n)=((wParam1(n)<iThresh1)AND(wParam2(n)>iThresh2) AND(wMotionFlag(n))) (3.10)
wIFlag2(n)=wIndex1 (3.11)
Suitable threshold values are given in Table 3.15 and the resulting flags may be interpreted according to Table 3.1.
A decision on progressive field-order properties for the n'th frame may be made by testing wParam1 and wParam2 against thresholds according to equations (3.12) and (3.13).
wPFlag1(n)=((wParam1(n)>pThresh1)AND (wParam2(n)<pThresh2)AND(wMotionFlag(n))) (3.12)
wPFlag2(n)=wIndex3 (3.13)
Decision threshold pThresh1 is set according to Table 3.2 and defines acceptance bands for the wParam1 parameter. Suitable threshold values are given in Table 3.15.
The resulting flags may be interpreted according to Table 3.3.
The thresholds set for the interlaced and progressive decisions ensure that I and P “True” decisions are mutually exclusive.
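Equations (3.10) to (3.13) can be sketched together as follows. The function name and the dictionary of thresholds are hypothetical, and the numeric values are chosen only so that the example runs; they are not the values of Table 3.15. For simplicity pThresh1 is treated as a single lower bound, which is an assumption, since the text describes it as defining acceptance bands.

```python
def field_order_flags(param1, param2, motion, index, thresholds):
    """Evaluate the interlaced and progressive decision flags for one frame.
    index is the list of 1-based DIF indices ordered from max to min."""
    i_flag1 = (param1 < thresholds["iThresh1"]
               and param2 > thresholds["iThresh2"]
               and motion)
    i_flag2 = index[0]      # index of the largest difference value
    p_flag1 = (param1 > thresholds["pThresh1"]
               and param2 < thresholds["pThresh2"]
               and motion)
    p_flag2 = index[2]      # index of the smallest difference value
    return i_flag1, i_flag2, p_flag1, p_flag2


# Hypothetical threshold values, for illustration only.
EXAMPLE_THRESHOLDS = {"iThresh1": 15.0, "iThresh2": 70.0,
                      "pThresh1": 25.0, "pThresh2": 10.0}

# Two small, nearly equal differences and one large one: interlaced-like.
flags = field_order_flags(5.0, 90.0, True, [1, 3, 2], EXAMPLE_THRESHOLDS)
```

Because the example uses iThresh1 below pThresh1, no value of Param1 can satisfy both tests, which is one way the interlaced and progressive “True” decisions can be kept mutually exclusive.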
The intermediate decision flags are then combined according to Table 3.4 to give a single “windowed” decision flag, wDFlag. Note that here, and in other Tables that serve to define the algorithm, the resultant of the operation is shown in the right-hand column.
Tables that are merely explanatory are labelled “Explanation of . . . ”.
Func3a uses the ranking process described in Section 3.3 to produce “instant” ranked parameters, iRank1 . . . 3 and iIndex1 . . . 3, from difference values, DIF1(n) . . . DIF3(n). The resulting “instant” parameter values, iParam1 and iParam2, together with the motion flag, iMotionFlag, are calculated according to equations (3.14) to (3.16).
iMotionFlag(n)=((iRank1(n)−iRank3(n)>MThresh1) AND(iRank1(n)>MThresh2)) (3.14)
iParam1(n)=100.0*(iRank2(n)−iRank3(n))/(iRank1(n)−iRank3(n)) (3.15)
iParam2(n)=100.0*iRank3(n)/iRank2(n) (3.16)
A decision on interlaced field-order properties for frame n may be made by testing iParam1 and iParam2 against thresholds according to equations (3.17) and (3.18).
iIFlag1(n)=((iParam1(n)<iThresh1)AND(iParam2(n)>iThresh2) AND(iMotionFlag(n))) (3.17)
iIFlag2(n)=iIndex1 (3.18)
The resulting flags may be interpreted according to Table 3.5.
A decision on progressive field-order properties for frame n may be made by testing iParam1 and iParam2 against thresholds according to equations (3.19) and (3.20).
iPFlag1(n)=((iParam1(n)>pThresh1)AND (iParam2(n)<pThresh2)AND(iMotionFlag(n))) (3.19)
iPFlag2(n)=iIndex3 (3.20)
Decision threshold pThresh1 is set according to Table 3.2 and defines acceptance bands for the iParam1 parameter. The resulting flags may be interpreted according to Table 3.6.
The thresholds set for the interlaced and progressive decisions ensure that I and P “True” decisions are mutually exclusive and example values are given in Table 3.15 below.
The “instant” intermediate decision flags are combined according to Table 3.7 to give a single “instant” decision flag, iDFlag.
The instant decision flag for the n'th frame, iDFlag(n), is entered into the “Instant Flag Buffer” iDFlag_buf, where it is stored with the previous K-1 instant flag values according to equations (3.21) and (3.22).
iDFlag_buf(k)=iDFlag_buf(k-1) k=K . . . 2 (3.21)
iDFlag_buf(1)=iDFlag(n) (3.22)
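The buffer update of equations (3.21) and (3.22) can be sketched as follows. The function name is hypothetical, the buffer is held in a zero-based Python list (so position k in the equations is element k-1), and the use of the value 6 to pre-fill the buffer is an assumption.

```python
def push_instant_flag(flag_buf, new_flag):
    """Shift every entry one place towards the end of the buffer, dropping
    the oldest, then store the newest flag at the front (position 1)."""
    # Work from the oldest position down so no value is overwritten early;
    # this corresponds to k = K ... 2 in the 1-based equations.
    for k in range(len(flag_buf) - 1, 0, -1):
        flag_buf[k] = flag_buf[k - 1]
    flag_buf[0] = new_flag
    return flag_buf


buf = [6, 6, 6, 6]          # K = 4, pre-filled with an assumed "unknown" code
for flag in (0, 0, 2):
    push_instant_flag(buf, flag)
```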
The K most recent instant flag values stored in the “instant flag buffer” iDFlag_buf are processed in Func5b to produce a tally flag, tDFlag. Firstly, the flag buffer is analysed to find how many times each possible iDFlag value (0 . . . 6) occurs in the instant buffer, iDFlag_buf. This is performed according to equation (3.23).
The operator f1 is defined in equation (3.24)
f1(k,i)=1 if (k=i), otherwise f1(k,i)=0 (3.24)
A set of seven tally counts is produced, as shown in Table 3.8.
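The tally calculation can be sketched as below, using the operator f1 of equation (3.24) to count the occurrences of each flag value in the buffer. The function name `tally_flags` is hypothetical, and the sketch is one plausible reading of the counting step rather than a transcription of equation (3.23).

```python
def tally_flags(flag_buf, num_states=7):
    """Count how often each iDFlag value (0 .. num_states - 1) occurs."""
    def f1(k, i):
        # Indicator operator: 1 when the two values are equal, else 0.
        return 1 if k == i else 0

    return [sum(f1(flag, i) for flag in flag_buf) for i in range(num_states)]


tally = tally_flags([2, 0, 0, 6])
```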
Then, the six instant tally counts corresponding to “known” states are analysed to find the flag value, tallyMax, whose tally value is both the maximum of the six and greater than a decision threshold, TyThresh1. This process is described by Table 3.9.
The instant tally flag, tDFlag, is then set according to Table 3.10. This flag will only return an index to a “known” field ordering condition (0≦iDFlag≦5) when the corresponding tally value is greater than the decision threshold, TyThresh1, and there are no other “known” or “error” conditions present in the buffer.
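The selection of tallyMax and tDFlag might be sketched as follows. This is an interpretation of the prose, not a reproduction of Tables 3.9 and 3.10: it assumes that flag value 6 is the “unknown” state, that ties for the maximum are resolved in favour of the lowest flag value, and that any other “known” value in the buffer forces an “unknown” result.

```python
UNKNOWN = 6  # assumed code for the "unknown" state


def tally_decision(tally, ty_thresh1):
    """Return tDFlag from seven tally counts (index 6 assumed 'unknown')."""
    known = tally[:UNKNOWN]
    # Flag value with the largest tally among the six "known" states.
    tally_max = max(range(len(known)), key=lambda i: known[i])
    # Number of buffer entries holding a different "known" state.
    others = sum(known) - known[tally_max]
    if known[tally_max] > ty_thresh1 and others == 0:
        return tally_max
    return UNKNOWN
```

For example, with a threshold of 2, a buffer holding three frames of state 0 and one unknown frame returns 0, whereas a buffer holding three frames of state 0 and one frame of state 2 returns the unknown code.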
Func5b also incorporates “instant intermittent” analysis, a procedure for detecting intermittent field misalignment in progressive content. Firstly, the progressive state that will potentially give visible errors, TestFlag, is determined according to Table 3.11, where the IDFO (Interlaced Display Field Ordering) flag indicates whether the interlaced display is TFF or BFF. The IDFO flag, which indicates the supposed or alleged temporal relationship of the fields, may be derived from the incoming signal or may be input manually by the user.
If IDFO signals TFF, then BFF misaligned progressive content might produce visible motion distortions and TestFlag is set to 0 (BFF). If IDFO signals BFF, then TestFlag is set to 2 (TFF). TestFlag is then used in a test of whether a significant number of potentially visible field misalignments have been detected in the instant flag buffer. The resulting instant output flag, iOFlag, is set according to Table 3.12. See Table 3.15 for threshold values.
For a given buffer of frames, the “instant tally flag”, tDFlag, reflects a condition where the majority of “instant” decision values have identical “known” conditions and all other decision values have the “unknown” condition. The flag, tDFlag, is combined with the windowed decision flag, wDFlag, according to Table 3.13 to produce the overall decision flag, wOFlag. This effectively copies the value of wDFlag to wOFlag if the instant tally flag is identical; otherwise an “unknown” condition is set.
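The combination of the windowed and tally decisions reduces to a simple agreement test, sketched below. The function name is hypothetical and the use of 6 as the “unknown” code is an assumption.

```python
UNKNOWN = 6  # assumed code for the "unknown" condition


def combine_decisions(w_dflag, t_dflag):
    """Return wOFlag: the windowed decision if the instant tally flag
    agrees with it, otherwise the "unknown" code."""
    return w_dflag if w_dflag == t_dflag else UNKNOWN
```

The design choice is conservative: a windowed classification is only reported when the frame-by-frame results over the same period support it.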
The output decision, wOFlag, indicates the frame properties (interlaced or progressive) and the field order properties for the buffered segment of content. Further processing is applied to indicate, through the error flag, wEFlag, whether the properties detected might cause visible errors for interlaced display. This function requires the additional input, IDFO (Interlaced Display Field Order) and is defined in Table 3.14. It should be noted that TFF misaligned progressive content output on a TFF interlaced display would not be flagged as an error and neither would BFF misaligned progressive on a BFF display.
3.11 Output Flag Parameters (wOFlag, wEFlag, iOFlag)
The output flags wEFlag and wOFlag reflect the results of the windowed analysis of field order properties. Parameters are averaged over a number of frames and refined by results from frame-by-frame “instant” results over the same period. As such, this analysis is sensitive to repetitive field-ordering conditions and may be considered to be a reliable detector of general content properties.
The output flag iOFlag is based on the analysis of frame-by-frame “instant” properties only and is sensitive to intermittent progressive field misalignments. The events handled by this flag are unlikely to be detected by the windowed analysis and the analysis therefore offers an additional warning for potentially visible errors. However, the nature of the “instant” test makes it more susceptible to misdetection, and it should be considered as a supplementary warning flag to the more reliable windowed results.
Further statistical analysis of these flags may be necessary to meet the requirements of differing applications.
The choice of decision thresholds will depend on the intended use of this field-order analysis algorithm. Table 3.15 shows a set of thresholds that are suitable for the detection of potential field-ordering problems for standard resolution television systems.
There follows a consideration of function and performance of the key aspects of the field-detection algorithm described in section 3. Firstly, consideration is given in Sections 4.1 and 4.2 to the properties of “instant” and “windowed” decision parameters for interlaced and non-frame repeated progressive video. Then in Section 4.3 the benefits of combining “instant” and “windowed” parameters are presented.
Considering
The DIF values for successive frames of a 7 second TFF interlaced sequence were calculated according to Func1 and are shown in
The noisy nature of the parameters may be handled by time-averaging and
Corresponding “instant” and “windowed” decision parameters calculated according to Func3 and Func3a are shown in
Parameter1 is intended to reflect the feature of two DIF values being nearly the same and significantly less than the remaining value, and Parameter2 is intended to reflect the overall significance of the difference between the two smallest DIF values. “Instant” and “windowed” decisions of interlaced type for the n'th frame are made by testing Param1 and Param2 against thresholds (Func4 and Func4a). Successful detection of an interlaced type requires Parameter1 to be below iThresh1 and Parameter2 to be above iThresh2. The benefits of “windowing”, particularly for Parameter1, are clear from these graphs.
Considering
The corresponding windowed parameter values, wParam1 and wParam2, for
For progressive content with significant motion between frames, a value of wParam2≈0 clearly distinguishes both correctly and incorrectly aligned content from interlaced content. For correctly aligned progressive content, the value of wParam1 would be expected to be near 100% due to the small separation of the first and second ranked difference values. However, for misaligned progressive content, this feature is not present and an expected range of wParam1>25% might be used to provide additional confidence to the discrimination.
The “windowed” decision parameters for both interlaced and non-frame-repeated progressive video are generally stable, offering reliable identification of field-ordering properties. However, events including scene-cuts, fast fades and fast pans may produce instant DIF values that make the “windowed” parameters temporarily unreliable.
The key feature of this content is the scene cut at frame 35, which causes the distinct peaks in the DIF1 and DIF3 measures. The scene cut is preceded by a period of low and reducing motion from frame 21, which is reflected by all three DIF values in
The low motion of frames 27 to 31 fails to set the “instant” motion flag in Equation 3.14 and is indicated by the zeroing of both “instant” decision parameters in
The advantage of “windowing” is clearly shown in
Table 4.3 also shows that, not only does the scene-cut distort the following “windowed” DIF values, but it also causes the “windowed” decision, wDFlag, to return a misclassification of “1=Correctly aligned progressive” for frames 35 onwards.
The field-order classification method described in Section 3 overcomes the problems of scene cuts, and more generally high motion fades and pans, by combining “instant” and “windowed” techniques. The “instant” parameters are more variable in nature, but far more robust to such conditions than the “windowed” parameters. Table 4.3 shows that the combined decision, wOFlag, rejects the misclassification from the “windowed” analysis, wDFlag.
Twenty minutes of TFF interlaced standard resolution television was analysed and the “windowed” and “combined” classification results are presented in
Progressive content with repeated frames (
Such regular-frame repeating would require the “windowed” analysis described in Section 3 to be modified, as time-averaging the DIF values shown in
Rather than looking to classify all regions of such content, the technique described in Section 3 aims to identify only potentially visible misalignments. This is achieved within Func5b by searching the “instant” flag buffer for potentially problematic patterns. This technique benefits from the high reliability of the “instant” decisions for progressive content.
Ten minutes of standard resolution progressive cartoon content with variable frame-repeating and BFF misalignment was analysed for a target TFF display.
The technique described here has been designed to identify potential field-ordering problems for the interlaced display of progressive and interlaced content. A sliding 3-frame analysis window is used to generate “instant” and “windowed” sets of field difference measures, from which corresponding “instant” and “windowed” decision parameters are derived. These parameters are designed to reflect key identifying features of the field-ordering of progressive and interlaced video sequences. A combination of “windowed” and “instant” analysis has been shown to offer a reliable classification technique able to handle both interlaced and progressive sequences. The “combined” analysis makes the technique particularly robust to misclassification at scene-cuts, high motion pans and fades or with frame-repeated progressive content. Furthermore, the identification of potential problems in frame-repeated progressive content is enhanced by the inclusion of an “instant”-only test for irregular or intermittent problems.
Number | Date | Country | Kind |
---|---|---|---|
07251908 | May 2007 | EP | regional |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/GB2008/001276 | 4/11/2008 | WO | 00 | 11/2/2009 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2008/139135 | 11/20/2008 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
5291280 | Faroudja et al. | Mar 1994 | A |
5317398 | Casavant et al. | May 1994 | A |
5398071 | Gove et al. | Mar 1995 | A |
5406333 | Martin | Apr 1995 | A |
5689301 | Christopher et al. | Nov 1997 | A |
5835163 | Liou et al. | Nov 1998 | A |
5852473 | Horne et al. | Dec 1998 | A |
5903319 | Busko et al. | May 1999 | A |
5929902 | Kwok | Jul 1999 | A |
6014182 | Swartz | Jan 2000 | A |
6055018 | Swan | Apr 2000 | A |
6058140 | Smolenski | May 2000 | A |
6148035 | Oishi et al. | Nov 2000 | A |
6154257 | Honda et al. | Nov 2000 | A |
6157412 | Westerman et al. | Dec 2000 | A |
6205177 | Girod et al. | Mar 2001 | B1 |
6236806 | Kojima et al. | May 2001 | B1 |
6563550 | Kahn et al. | May 2003 | B1 |
6633612 | Selby | Oct 2003 | B2 |
7453941 | Yamori et al. | Nov 2008 | B1 |
20010002853 | Lim | Jun 2001 | A1 |
20030098924 | Adams et al. | May 2003 | A1 |
20050018086 | Lee et al. | Jan 2005 | A1 |
20050123274 | Crinon et al. | Jun 2005 | A1 |
20060119702 | Lin | Jun 2006 | A1 |
20060139491 | Baylon | Jun 2006 | A1 |
20060215057 | Tanaka | Sep 2006 | A1 |
20070031129 | Hosokawa | Feb 2007 | A1 |
20080122973 | Iwasaki | May 2008 | A1 |
Number | Date | Country |
---|---|---|
2000-270332 | Sep 2000 | JP |
2004-7719 | Jan 2004 | JP |
Entry |
---|
International Search Report for PCT/GB2008/001276, dated Jul. 21, 2008. |
Office Action (5 pgs.) dated Apr. 7, 2011 issued in corresponding Chinese Application No. 200880015053.3 with an at least partial English-language translation thereof (6 pgs.). |
ITU-R Rec. BT.601, “Studio encoding parameters of digital television for standard 4:3 and wide screen 16:9 aspect ratios,” http://www.itu.int/rec/R-REC-BT.601/en. |
Decision to Grant a European Patent (1 pg.) dated Oct. 25, 2012 issued in corresponding European Application No. 08 736 941.9. |
Office Action (2 pgs.) dated Sep. 8, 2011 issued in corresponding Chinese Application No. 200880015053.3 with an at least partial English-language translation thereof (3 pgs.). |
Office Action (3 pgs.) dated Mar. 29, 2010 issued in corresponding European Patent Application No. 08 736 941.9. |
Applicant Response to Office Action dated Mar. 29, 2010 issued in corresponding European Patent Application No. 08 736 941.9. |
Office Action (4 pgs.) dated Jun. 23, 2014 issued in corresponding Korean Application No. 2009-7024444 with an at least partial English-language translation thereof (2 pgs.). |
Number | Date | Country |
---|---|---|
20100141835 A1 | Jun 2010 | US |