The present invention relates to methods and apparatus for the pre-processing of moving pictures before encoding. In particular, the present invention relates to methods and apparatus for determining whether a digital picture frame is an interlaced-scan picture or a non-interlaced-scan picture; identifying a repeated-field; and detecting a scene-change in a sequence of moving pictures.
Encoding methods such as the well-known MPEG-1 and MPEG-2 standards have been popularly used for efficient transmission and storage of video. An MPEG encoder compresses an input video signal picture-by-picture to produce an output signal or bitstream compliant with the relevant MPEG standard. Pre-processing techniques can be applied to the input video signal before encoding, for example, to remove noise and re-format the signal (e.g. 4:2:2 to 4:2:0 conversion, image size conversion, etc.).
The input video signal is typically in an interlaced format, for example the 525/60 or 625/50 (lines/frequency) format, with each video frame consisting of two fields (top field and bottom field). However, the source material of the video signal may be originally produced on film and converted to the video signal via a telecine process. This process converts a progressive source into an interlaced format and provides at the same time, if necessary, frame rate conversion for example using a 3:2 or 2:2 pulldown technique. In the case of 24 Hz film to 525/60 Hz video conversion, each progressive film picture is converted to two interlaced video fields and, in addition, there are 12 repeated fields according to the 3:2 pulldown patterns in every second of the converted video. Improvement in coding efficiency can be obtained if the video source from film is identified and the repeated (or redundant) fields are detected and removed before coding. Pre-processing techniques applied before encoding can also gain from the results of film picture detection.
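The field arithmetic of the 24 Hz to 525/60 conversion can be checked with a short sketch. It assumes the usual alternation of three fields and two fields per film picture and a strictly alternating field parity; the function name and tuple layout are illustrative only.

```python
def pulldown_32(num_film_frames):
    """List the video fields produced by 3:2 pulldown.

    Each entry is (film_frame_index, parity, is_repeat). Assumption: film
    pictures alternately yield 3 and 2 fields, and parity strictly alternates.
    """
    fields = []
    parity = "top"
    for i in range(num_film_frames):
        count = 3 if i % 2 == 0 else 2
        for k in range(count):
            # The third field of a 3-field picture repeats the first one.
            fields.append((i, parity, k == 2))
            parity = "bottom" if parity == "top" else "top"
    return fields


one_second = pulldown_32(24)
total_fields = len(one_second)                          # 60 fields per second
repeated_fields = sum(1 for f in one_second if f[2])    # 12 repeated fields
```

Removing the 12 flagged fields leaves 48 fields, i.e. the 24 original progressive pictures.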
The known methods of film mode detection can be widely classified into two categories: (1) film mode detection using film-frame pattern identification; and (2) film mode detection using automatic interlace/progressive frame detection.
The output of the type of method using film-frame pattern identification is a decision whether the input sequence is an interlaced video or a 3:2/2:2 pulldown film. The detection tries to identify the unique pattern of a 3:2 or 2:2 pulldown film. One of the most commonly used techniques is to detect the repeat field pattern in the 3:2 pulldown film (as described in U.S. Pat. Nos. 5,317,398 and 5,398,071). The pixel to pixel field differences between alternate fields (fields with the same parity) are measured to identify whether the 3:2 repeat field pattern exists.
Another commonly used assumption is that the field difference between two interlaced fields is significantly greater than the field difference between two non-interlaced (or progressive) fields. One method is to group the successive fields that have the least field differences as a film frame (as described in U.S. Pat. No. 5,565,998). Another method is to measure the consecutive field differences of incoming fields and monitor the pattern to decide if it is an interlaced video, 3:2 film or 2:2 film (as described in U.S. Pat. Nos. 5,365,273 and 5,689,301). In the above methods, the unique pattern is monitored for a period (typically spanning 5 to 64 fields) before a decision is made.
Film mode detection using automatic interlace/progressive frame detection, apart from deciding whether an incoming sequence is a film, also determines whether each frame is interlaced or progressive and identifies repeated fields. Because interlace/progressive detection is performed for every frame, this type of method does not suffer from the slow response in interlace/progressive encoding of the film-frame pattern identification methods described above. One of the methods used for the interlace/progressive detection, such as in U.S. Pat. No. 5,452,011, is the intra-field and inter-field difference (IIFD) comparison. The IIFD method compares the inter-field and intra-field differences to detect whether two consecutive fields are interlaced. The assumption is that the inter-field difference will be greater than the intra-field difference.
In most of the current video/film detection methods which have no automatic interlace/progressive detection, when there is a transition from interlaced video to film, the decision switching is made after a delay of a period typically spanning 5 to 64 fields. This means that the encoding of the film frames in this delay period is still done in interlace mode and redundant fields in this period are not removed before encoding. Similarly when there is a transition from film to interlaced video, the interlaced video frames in the decision switching delay period are still encoded as progressive frames.
A film sequence is often edited, and a scene change may occur in any field. Sub-titles might also be added to any field of the film, thereby changing the 3:2 repeat-field pattern of the film so that the frames are not always progressive. Interlaced video sequences also contain some progressive frames, where there is very little or no motion between the two fields of a frame. The current film detection methods which have no automatic interlace/progressive detection are unable to detect these interlaced frames within a film or the progressive frames within an interlaced video.
It is therefore an object of the present invention to address the above-mentioned problems by detecting whether a frame is interlace or progressive immediately after receiving the frame data so that the encoder can encode the frame as interlace or progressive according to the detection decision, or to at least provide a useful alternative.
For existing automatic interlace and progressive detection methods, which compare the intra-field and inter-field differences to make the detection decision, the comparison is not always accurate. The inaccuracy can be due to the inter-field difference being very small, because of little or no motion between successive frames, or to the intra-field difference being large because of very detailed texture or information within the field.
There are also inaccuracy problems in detection methods which assume that the interlace difference is significantly greater than the progressive difference. The problem which arises from this assumption is that when the previous field (fN−1) and current field (fN) have little or no motion, the interlaced field difference between fN−1 and fN might not be significantly greater than the difference between the progressive fields fN and fN+1.
The present invention is also intended to improve the accuracy of the interlace/progressive detection by basing the detection decision not only on the comparison between the interlace difference and the progressive difference, but also on the moving activity between successive frames. This checks whether an insignificant field difference between fN−1 and fN is due to little motion, so as to avoid an incorrect decision caused by the insignificant interlace difference.
The present invention provides a method of processing video data to detect field characteristics of the data, said data having a plurality of fields, including the steps of:
The present invention further provides an apparatus for processing video data to detect field characteristics of the data, said data having a plurality of fields, including:
A preferred embodiment of the present invention is described hereinafter, by way of example only, with reference to the accompanying drawings, wherein:
In the preferred embodiment of the present invention, only two field memory units 101 and 102 are required. Referring to
Preferably, for all the sub-block measurements, each field is divided into 32 equal sub-blocks.
The block diagram of the consecutive field difference unit 106 is illustrated in
PD=Min(|A−B|, |A−C|)
The PD of every pixel in field N is computed, and values of PD less than Tnoise are regarded as noise and set to zero. The consecutive field difference CFD(N−1,N), of field fN−1 and field fN, is defined as the sum of all the PDs in field fN. The lesser of the two differences is selected because this reduces inaccuracies in the calculation of the field differences arising from abnormal vertical displacement or horizontal edges. To decide whether fields fN and fN+1 are interlaced or progressive, both CFD(N−1,N) and CFD(N,N+1) must be computed.
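The CFD computation can be sketched as follows. The sketch assumes that A is a pixel of field fN and that B and C are the two vertically neighbouring pixels in field fN−1 (here taken from the same row and the next row of the previous field, clamped at the bottom edge). This geometry, the function name and the value passed for Tnoise are assumptions made for illustration and are not taken from the description.

```python
def consecutive_field_difference(prev_field, cur_field, t_noise):
    """Sum of PD = min(|A-B|, |A-C|) over all pixels A of the current field.

    Assumption: B and C are the pixels of the previous field on the lines
    directly above and below A once the two fields are interleaved.
    """
    rows = len(cur_field)
    total = 0
    for r in range(rows):
        r_below = min(r + 1, rows - 1)        # clamp at the last line
        for col in range(len(cur_field[r])):
            a = cur_field[r][col]
            b = prev_field[r][col]
            c = prev_field[r_below][col]
            pd = min(abs(a - b), abs(a - c))
            if pd < t_noise:                  # small differences count as noise
                pd = 0
            total += pd
    return total


prev_f = [[10, 10], [50, 50]]
cur_f = [[50, 50], [90, 90]]
cfd = consecutive_field_difference(prev_f, cur_f, t_noise=5)
```

In this toy input the first row of the current field matches a neighbouring line of the previous field exactly, so only the second row contributes to the CFD.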
The number of sub-block ‘moving pixels’ between fields fN−1 and fN+1 is also computed by the sub-block moving pixel counter 109 to find out if there is significant motion between fields fN−1 and fN+1. The moving-pixel(N−1,N+1) is defined as the pixel in each sub-block (preferably 32 sub-blocks per field) between field fN−1 and fN+1 with pixel-to-pixel difference greater than a threshold Tmove.
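A minimal sketch of the sub-block moving pixel count is given below. The description specifies 32 equal sub-blocks but not their arrangement, so the grid shape (blocks_y by blocks_x) is a hypothetical parameter, as is the value chosen for Tmove.

```python
def moving_pixels_per_block(field_a, field_b, t_move, blocks_y, blocks_x):
    """Count, per sub-block, the pixels whose absolute difference exceeds t_move.

    Assumption: sub-blocks form a blocks_y x blocks_x grid of equal size;
    any leftover rows or columns at the edges are ignored.
    """
    rows, cols = len(field_a), len(field_a[0])
    bh, bw = rows // blocks_y, cols // blocks_x
    counts = [0] * (blocks_y * blocks_x)
    for r in range(bh * blocks_y):
        for col in range(bw * blocks_x):
            if abs(field_a[r][col] - field_b[r][col]) > t_move:
                counts[(r // bh) * blocks_x + (col // bw)] += 1
    return counts


f_prev = [[0] * 4 for _ in range(4)]
f_next = [row[:] for row in f_prev]
f_next[0][0] = 100   # moving pixel in the top-left block
f_next[3][3] = 100   # moving pixel in the bottom-right block
f_next[3][2] = 5     # below the threshold, not counted
counts = moving_pixels_per_block(f_prev, f_next, t_move=10, blocks_y=2, blocks_x=2)
```

For the preferred embodiment the same function would be called with a grid of 32 sub-blocks, for example blocks_y=8 and blocks_x=4.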
A decision-making flow diagram is shown in
If there is a scene change between fN−2 and fN at step 401, then it may be meaningless to compare CFD(N−1,N) to CFD(N,N+1) as the scene change may occur between fN−1 and fN, causing the value of CFD(N−1,N) to be arbitrary. The decision can only be based on the information in fields fN and fN+1. Therefore when there is a scene change detected (between current field fN and second previous field fN−2), then the moving region detection (MRD) method is used at step 402. The MRD method detects any ‘jagged region’ or ‘moving region’ which is noticeable when two ‘moving’ consecutive fields are interlaced and viewed as a frame.
Referring now to
Repeat field detection is performed on a pair of fields of the same parity (odd or even). The field similarity measurement is again preferably based on 32 sub-blocks in which the absolute sum of all the pixel-to-pixel differences of each block is accumulated in the accumulator 104. The repeat-field decision unit 105 operates as follows: the difference per pixel of each sub-block, obtained from the sub-block difference (SBD), is compared to a threshold Trepeat, i.e.,
SBD/(block_width×block_height)<Trepeat for all sub-blocks
If the pixel differences are smaller than Trepeat for all 32 of the sub-blocks, then a repeat field is said to be detected and can be skipped for encoding by the field grouping decision unit 110. It should be noted that the repeated field detection is performed only when the incoming frame is detected as progressive by the interlace/progressive detection.
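The basic repeat-field test can be sketched as below. It assumes that the SBD is the sum of absolute pixel differences within a sub-block and that the 32 sub-blocks form an 8 by 4 grid; both are illustrative assumptions. The additional safeguard against false consecutive repeat fields is not included.

```python
def sbd_per_pixel(field_a, field_b, blocks_y, blocks_x):
    """Return SBD / (block_width * block_height) for every sub-block.

    Assumption: SBD is the sum of absolute pixel differences in the block.
    """
    rows, cols = len(field_a), len(field_a[0])
    bh, bw = rows // blocks_y, cols // blocks_x
    sums = [0] * (blocks_y * blocks_x)
    for r in range(bh * blocks_y):
        for col in range(bw * blocks_x):
            diff = abs(field_a[r][col] - field_b[r][col])
            sums[(r // bh) * blocks_x + (col // bw)] += diff
    return [s / (bw * bh) for s in sums]


def is_repeat_field(field_a, field_b, t_repeat=2.5, blocks_y=8, blocks_x=4):
    """True when every sub-block's difference per pixel is below t_repeat."""
    return all(d < t_repeat
               for d in sbd_per_pixel(field_a, field_b, blocks_y, blocks_x))


base = [[100] * 8 for _ in range(16)]
noisy = [[p + 1 for p in row] for row in base]   # uniform 1-level noise
changed = [row[:] for row in base]
changed[5][3] += 40                              # one strongly changed pixel
```

Requiring every sub-block to pass means a localised change, such as a subtitle confined to a few blocks, is enough to reject the field as a repeat.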
To prevent an incorrect consecutive repeat field being detected due to very little motion, the following algorithm is implemented:
where prev_decision1 is the first previous decision for repeat field detection and prev_decision3 is the third previous decision; scene-change is the scene change detection decision; and moving-pixel is the number of pixels with pixel difference greater than Tmove computed in the sub-block moving pixel counter 109. A suitable value for Trepeat has been found to be around 2.5.
The differences between the current field and the previous field of the same parity are used to detect any significant change of scene. Making use of the sub-block difference (SBD), a simple thresholding method is employed by the scene change decision unit 108. The difference per pixel of each block is compared with a threshold Tscene. If more than Tblock of the sub-blocks have a difference per pixel greater than Tscene, then a scene change is detected, i.e.
SBD/(block_width×block_height)>Tscene for more than Tblock sub-blocks
Apart from the above detection, a scene change is also detected by comparing the current field difference with the previous field difference to see if the current field difference has a sudden increment due to a scene change. The field difference (FD) is the sum of all the 32 absolute block differences. If the current field difference is more than Tratio1 times greater than the previous field difference (prev_FD), then a scene change is said to be detected. The pseudocode of the scene change detection algorithm is as follows:
In a 3:2 pulldown film sequence, subtitles may be added to a repeated field, resulting in the field not being detected as a repeat field. When this particular field becomes the current field, the current FD computed (between the current field N and second previous field N−2) will have a small value (because of the small change due to the subtitles). Therefore, in updating the previous field difference (prev_FD), the condition ‘prev_FD/FD<Tratio2’ is used to avoid updating prev_FD with a ‘repeat field difference’, which would affect the scene change decision made later.
The prev_scene_change is the scene change decision of the previous frame. When a scene change has been detected in the previous frame, the condition ‘prev_FD/FD<Tratio2’ might not be true because of the large value of prev_FD, and hence the criterion ‘prev_scene_change=Yes’ forces an update of prev_FD. Suitable values for Tscene, Tblock, Tratio1 and Tratio2 have been found to be about 15, 25, 2.5 and 3.0 respectively.
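The scene change rules described above can be sketched as follows. This is a reconstruction from the prose only: it assumes that the two tests are combined with a logical OR, that prev_FD is updated after the decision, and that the first field simply initialises prev_FD. The class and method names are illustrative.

```python
class SceneChangeDetector:
    """Sketch of the two scene change tests and the prev_FD update rule."""

    def __init__(self, t_scene=15.0, t_block=25, t_ratio1=2.5, t_ratio2=3.0):
        self.t_scene, self.t_block = t_scene, t_block
        self.t_ratio1, self.t_ratio2 = t_ratio1, t_ratio2
        self.prev_fd = None               # no previous field difference yet
        self.prev_scene_change = False

    def update(self, sbd_list, pixels_per_block):
        fd = sum(sbd_list)                # field difference (FD)
        # Test 1: more than Tblock sub-blocks exceed Tscene per pixel.
        busy = sum(1 for s in sbd_list if s / pixels_per_block > self.t_scene)
        scene_change = busy > self.t_block
        # Test 2: sudden increase of FD relative to prev_FD.
        if self.prev_fd is not None and fd > self.t_ratio1 * self.prev_fd:
            scene_change = True
        # Update prev_FD unless the current FD looks like a repeat field
        # difference; prev_FD/FD < Tratio2 is written without a division
        # so that FD == 0 is handled.
        if (self.prev_fd is None or self.prev_scene_change
                or self.prev_fd < self.t_ratio2 * fd):
            self.prev_fd = fd
        self.prev_scene_change = scene_change
        return scene_change


det = SceneChangeDetector()
sequence = [[500] * 32, [520] * 32, [10] * 32, [2000] * 32, [500] * 32]
decisions = [det.update(sbd, pixels_per_block=100) for sbd in sequence]
```

In this sequence the third entry imitates a repeated field with small subtitle changes: its small FD is not stored in prev_FD, so it does not influence the ratio test applied to the fourth entry.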
An advantage of embodiments of the present invention is the ability to make accurate decisions as to whether a frame should be encoded as an interlace or progressive frame immediately after the second field of the frame is received. This enables the MPEG encoder to encode each frame correctly as interlace or progressive, including the occasional interlaced frames within a film sequence that result from editing and the occasional progressive frames within an interlaced video sequence. In the above-described interlace/progressive determination method, apart from comparing the consecutive field differences, the moving activity between two successive frames is also computed to ensure that interlaced fields with little or no motion will not cause an incorrect decision. The present invention also addresses the situation where the scene change occurs in the current frame. The moving region detection method is then used for the interlace/progressive determination.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/SG99/00014 | 2/26/1999 | WO | 00 | 12/21/2001 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO00/51355 | 8/31/2000 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
4661853 | Roeder et al. | Apr 1987 | A |
4723163 | Skinner | Feb 1988 | A |
5317398 | Casavant et al. | May 1994 | A |
5365273 | Correa et al. | Nov 1994 | A |
5398071 | Gove et al. | Mar 1995 | A |
5452011 | Martin et al. | Sep 1995 | A |
5460420 | Perkins et al. | Oct 1995 | A |
5508750 | Hewlett et al. | Apr 1996 | A |
5521644 | Sezan et al. | May 1996 | A |
5561477 | Polit | Oct 1996 | A |
5565998 | Coombs et al. | Oct 1996 | A |
5689301 | Christopher et al. | Nov 1997 | A |
5828786 | Rao et al. | Oct 1998 | A |
5874995 | Naimpally et al. | Feb 1999 | A |
6014182 | Swartz | Jan 2000 | A |
6084641 | Wu | Jul 2000 | A |
6157412 | Westerman et al. | Dec 2000 | A |
6262773 | Westerman | Jul 2001 | B1 |
6714594 | Dimitrova et al. | Mar 2004 | B2 |
6934335 | Liu et al. | Aug 2005 | B2 |
7068722 | Wells | Jun 2006 | B2 |
7075581 | Ozgen et al. | Jul 2006 | B1 |
7113221 | Law et al. | Sep 2006 | B2 |
7180548 | Mishima et al. | Feb 2007 | B2 |
Number | Date | Country |
---|---|---|
WO 9515659 | Jun 1995 | WO |