Interlaced scanning and progressive scanning are typical scanning methods employed in a video display device. Interlaced scanning has been employed in current National Television Systems Committee (“NTSC”) television (“TV”) systems. For the video display shown in
Deinterlacing an interlaced signal provides numerous advantages for improving video quality. Specifically, deinterlacing can remove interlace motion artifacts, increase apparent vertical resolution, and reduce flicker. Furthermore, deinterlacing is often required because modern televisions are inherently progressive and the video feed is broadcast in interlaced form.
There are three common techniques for deinterlacing an interlaced video signal. A first deinterlacing technique is known as weaving. Weaving involves combining two adjacent fields into one frame. While this technique maintains vertical resolution, it has the problem of creating interlace motion artifacts if motion is present.
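By way of illustration only, weaving can be sketched as follows. The sketch assumes each field is a list of scan lines and that the first field carries the even-numbered frame lines; the function name `weave` is hypothetical and is not taken from the specification.

```python
def weave(top_field, bottom_field):
    """Interleave two fields (lists of scan lines) into one frame.

    Assumption: top_field supplies frame lines 0, 2, 4, ... and
    bottom_field supplies frame lines 1, 3, 5, ...
    """
    if len(top_field) != len(bottom_field):
        raise ValueError("fields must have the same number of lines")
    frame = []
    for top_line, bottom_line in zip(top_field, bottom_field):
        frame.append(list(top_line))
        frame.append(list(bottom_line))
    return frame
```

If the two fields were captured at different instants, the interleaved lines disagree wherever there is motion, which is the source of the interlace motion artifacts noted above.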
A second deinterlacing technique is known as vertical interpolation. Vertical interpolation involves averaging at least two scan lines to generate a new scan line. The technique is repeated for all scan lines and creates a full frame from a single video field. While vertical interpolation allows a progressive picture to be generated from one video field, half of the resolution of the video feed is lost.
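A minimal sketch of vertical interpolation is given below for illustration. It assumes simple two-line averaging and repeats the last field line at the bottom edge; both choices, and the name `interpolate_field`, are assumptions of the sketch rather than details of any particular implementation.

```python
def interpolate_field(field):
    """Build a full frame from a single field by averaging adjacent field lines."""
    frame = []
    for i, line in enumerate(field):
        # Keep the original field line.
        frame.append(list(line))
        # The missing line is the average of this line and the next one;
        # at the bottom edge the last line is reused (assumed edge handling).
        next_line = field[i + 1] if i + 1 < len(field) else line
        frame.append([(a + b) / 2 for a, b in zip(line, next_line)])
    return frame
```

Every second output line is computed rather than captured, which is why the vertical resolution of the result is limited to that of a single field.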
Another deinterlacing technique is known as motion adaptive deinterlacing. For this technique, adjacent fields are merged for still areas of the picture and vertical interpolation is used for areas of movement. To accomplish this, motion, on a sample-by-sample basis, is detected over the entire picture in real time, requiring processing of several fields of a video signal.
To improve the results of the common techniques, a cadence detection algorithm can be implemented. In order for a progressive source to be converted to an interlaced format, each frame from that source must be represented as multiple fields in the interlaced format, i.e., a single frame is converted to 2 or more interlaced fields. The conversion process (called telecine) typically results in a regular repeating pattern of fields taken from an original progressive frame. For example, to convert 24 frame/sec film to 60 field/sec interlaced video, a technique known as 3:2 pulldown is used. The technique converts one film frame to 3 fields, the next frame to 2 fields, the next to 3, etc. The result is a regular 3/2/3/2 repeating pattern, or cadence, which can be detected. If the cadence is known, then the original film frame can be reconstructed by simply combining the correct two fields in a weaving operation. However, cadence detection systems are inadequate because they are generally limited to 3:2 pulldown or 2:2 pulldown. Further, cadence detection systems incur a number of problems when the cadence pattern is broken for some reason, such as video edits.
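The 3:2 pattern can be illustrated with a short sketch. The function `pulldown` and its `cadence` parameter are hypothetical names, and the strict alternation of field parity is an assumption of the sketch.

```python
from itertools import cycle

def pulldown(frames, cadence=(3, 2)):
    """Return (frame, parity) pairs, repeating each frame per the cadence."""
    fields = []
    parity = cycle(("top", "bottom"))  # assumed: parity alternates every field
    for frame, repeat in zip(frames, cycle(cadence)):
        for _ in range(repeat):
            fields.append((frame, next(parity)))
    return fields
```

With the default cadence, every two source frames yield five fields, so 24 frames yield 60 fields, matching the 24 frame/sec to 60 field/sec conversion described above.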
The foregoing examples of the related art and limitations related therewith are intended to be illustrative and not exclusive. Other limitations of the related art will become apparent to those of skill in the art upon a reading of the specification and a study of the drawings.
The following embodiments and aspects thereof are described and illustrated in conjunction with systems, tools, and methods that are meant to be exemplary and illustrative, not limiting in scope. In various embodiments, one or more of the above-described problems have been reduced or eliminated, while other embodiments are directed to other improvements.
A technique for deinterlacing an interlaced video stream involves the detection and merging of fields. An example of a method according to the technique involves detecting an occurrence of groups of adjacent fields that are derived from a common original image frame source. Field pairs of the interlaced video stream can be merged to form a deinterlaced signal.
In certain embodiments, the detection of an occurrence of groups of adjacent fields that are derived from a common image frame source can involve determining if field pairs of the interlaced video stream are similar. If the field pairs are similar, the field pairs can be merged. If more than two adjacent field pairs are different, an alternate deinterlacing technique can be implemented, such as motion adaptive deinterlacing.
In some embodiments, determining if field pairs are similar can involve performing a correlation operation between field pairs. In other embodiments, a difference operation between field pairs can be performed. Additional embodiments can determine whether a scene transition occurred in order to aid in the determination of similar and different field pairs. An alternate embodiment can involve calculating a threshold value based on one or more factors. Following this embodiment, the factors can include a history of correlation operation values between field pairs, a minimum and maximum value in the history of correlation operation values, a minimum and maximum value for a range of correlation values, detecting scene transitions, and assigning state values.
Another example of a method according to the technique involves detecting an occurrence of groups of consecutive temporal fields of interlaced video fields that are derived from a common original image frame source. A repeating pattern in the groups can be detected and locked. Field pairs can be combined based on the repeating pattern.
The proposed method and device can offer, among other advantages, improved quality of deinterlaced video. This can be accomplished in an efficient and robust manner compared to other deinterlacing techniques because a cadence can be detected independent of the source. These and other advantages of the present invention will become apparent to those skilled in the art upon a reading of the following descriptions and a study of the several figures of the drawings.
Embodiments of the inventions are illustrated in the figures. However, the embodiments and figures are illustrative rather than limiting; they provide examples of the invention.
In the following description, several specific details are presented to provide a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention can be practiced without one or more of the specific details, or in combination with other components, etc. In other instances, well-known implementations or operations are not shown or described in detail to avoid obscuring aspects of various embodiments of the invention.
A method is described for source adaptive deinterlacing of an interlaced video signal to create a progressive video signal. As described further below, the method includes determining if a sequence of two or more fields from the interlaced source is derived from the same original progressive source frame. One method of determining if a sequence of two fields have a common source is to determine how similar the two fields are to each other. The more similar the two fields are to each other, the more likely it is that they have been taken from the same original source frame. Once it has been determined that two fields come from the same original source frame, they can be merged to reconstruct the original source frame.
According to certain embodiments of the present invention, the method of deinterlacing by determining if two fields have a common source can recognize and adapt to any cadence or cadences in the video signal. Further, once it has been determined that two fields come from the same original source frame, they can be merged without need of any further processing, such as calculating pixels.
For the example of
The deinterlacing module 203 of
Processing system 60 includes a signal input that receives a video signal from a signal source 80. Signal source 80 may be either a single channel signal source or a multiple channel signal source. A single channel signal source provides programming from a recorded medium, such as a videocassette, compact disc, DVD, etc. Examples of a single channel signal source include a videocassette recorder, a CD player, and a DVD player. A multiple channel signal source includes any system or device that is capable of sending a signal that may be received by a satellite receiver, a cable or optic connection, a terrestrial antenna, or the like. Examples of a multiple channel signal source include Digital Satellite System (“DSS”), Digital Video Broadcasting (“DVB”), a cable box, locally broadcast programming (i.e., programming broadcast using Ultra High Frequency (“UHF”) or Very High Frequency (“VHF”)), and so forth.
The output of video processor 60 goes to a video display device 70, such as an HDTV, a standard definition TV, a computer monitor, etc. The display 70 can employ various display techniques, such as a plasma display, a cathode ray tube (“CRT”), an LCD display, a DLP display, and a projector, for example.
A video processor 60 employing a deinterlacing module 203 receives an interlaced video signal from video signal source 80 and provides a deinterlaced signal to display device 70.
For alternative embodiments, the deinterlacing module 203 can be part of the video signal source apparatus 80 or the video display device 70. For alternative embodiments, the video processor 60 can be part of the video signal source 80 or the video display device 70.
Method 300 utilizes the observation that the more similar two or more fields are to each other, the more likely they are to have been taken from the same original video image. The method 300 also utilizes the fact that conversion of a slower frame rate progressive source to a faster frame rate interlaced format requires that multiple sequential fields be taken from the same original source frame. In other words, there will be a sequence of adjacent fields that are all taken from the same source frame.
According to embodiments of the present invention, a correlation operation may be performed on two fields that are spaced “x” fields apart, where x>0. However, performing a correlation operation on fields that are in the vicinity of each other, for instance, are spaced 1 or 2 fields apart, is more useful and reliable, because it provides a more frequent indication of any changes in field data.
The 2-field operation compares fields that are spatially coincident. Thus, for an even-odd sequence of fields, the fields being compared are both composed of either even-numbered lines or odd-numbered lines. Accordingly, the two fields can be compared directly by calculating a pixel-by-pixel difference to obtain a 2-field difference. The magnitude of the pixel difference values can be summed over the entire field, with the resultant sum indicating how different or similar the two fields are overall. A low resultant sum value indicates very similar fields while a high resultant sum value indicates very different fields. The terms “low” and “high”, as well as “similar” and “different” as used herein, are, of course, relative.
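The summed magnitude of pixel differences described above can be sketched as follows. The sketch assumes fields are lists of scan lines holding numeric pixel values; the name `field_difference` is hypothetical.

```python
def field_difference(field_a, field_b):
    """Sum of absolute pixel differences between two same-parity fields.

    A low sum suggests similar fields; a high sum suggests different fields.
    """
    return sum(
        abs(pixel_a - pixel_b)
        for line_a, line_b in zip(field_a, field_b)
        for pixel_a, pixel_b in zip(line_a, line_b)
    )
```

The returned value is only meaningful relative to other values from the same sequence, consistent with the relative sense of “low” and “high” noted above.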
The 1-field operation involves comparing two fields that are not spatially coincident—i.e., for an even-odd sequence of fields, one field is composed of even-numbered lines while the other is composed of odd-numbered lines. Thus, because the two fields are not spatially coincident, they cannot be directly compared. In addition, there may be aliasing present in a single field due to the fact that taking only every other line of a source video image may not generate a high enough vertical sampling rate to represent all the vertical high frequencies in the video signal. The aliases are different between the two fields being compared, causing yet another difference between them.
From each field a comparison field is created by phase-shifting one field up and the other down. For one embodiment, one field is phase-shifted ¼ line up and a field to be compared ¼ line down. The phase shift can be performed, for instance, using a Finite Impulse Response (“FIR”) filter approach, although other techniques such as simple linear interpolation can also be used. For this embodiment, a comparison field is created from each field by computing pixels from the pixels of each field and comparing the calculated pixels. Additionally, a comparison field can be created from one of the two fields by computing pixels from the pixels of that field and the resultant calculated pixels can be compared with the pixels of the other field. The resulting field can be compared with an original field in the same manner as used for the 2-field difference described above.
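For illustration, the quarter-line phase shift can be sketched using simple linear interpolation rather than an FIR filter. The sketch assumes that line i of the top field sits half a field line above line i of the bottom field, that the ¼ line shift is measured in field-line units, and that edge lines are repeated; the function names are hypothetical.

```python
def resample_quarter_below(field):
    """Sample 1/4 field line below each line: 0.75 * line[i] + 0.25 * line[i + 1]."""
    last = len(field) - 1
    return [
        [0.75 * a + 0.25 * b for a, b in zip(field[i], field[min(i + 1, last)])]
        for i in range(len(field))
    ]

def resample_quarter_above(field):
    """Sample 1/4 field line above each line: 0.25 * line[i - 1] + 0.75 * line[i]."""
    return [
        [0.25 * a + 0.75 * b for a, b in zip(field[max(i - 1, 0)], field[i])]
        for i in range(len(field))
    ]

def one_field_difference(top_field, bottom_field):
    """Align two opposite-parity fields, then sum absolute pixel differences."""
    top = resample_quarter_below(top_field)
    bottom = resample_quarter_above(bottom_field)
    return sum(
        abs(a - b)
        for line_t, line_b in zip(top, bottom)
        for a, b in zip(line_t, line_b)
    )
```

On a smooth vertical ramp the two resampled fields agree everywhere except at the repeated edge lines, which shows one way a nonzero baseline can arise even when both fields come from the same frame.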
In general, however, the 1-field difference has a higher baseline or noise level than the 2-field difference due to the fact that the original two fields being compared are not spatially coincident. Therefore, one or more operations may also be performed on each of the two original fields before the phase shift to reduce noise and aliasing artifacts. One such operation is a vertical low-pass filtering operation.
According to certain embodiments of the invention, both the 1-field difference and the 2-field difference may be calculated. The 2-field difference provides useful measurements when used on certain cadences, like 3:2 pulldown, but is not very useful when used on a cadence that has only two fields taken from the same frame, such as 2:2 pulldown.
On the other hand, as illustrated in
In the description of
At block 411 of
Based on the saved history, at block 421 of
For one embodiment, new correlation values are adapted before being saved in the history. For instance, if a new correlation value is much larger than the maximum value currently in the history buffer, then the new value will be reduced in amplitude to avoid abnormally skewing the history data. A correlation value can be much higher than a maximum value due to noise, momentary changes in the image data, or a scene transition. If a long term increase in the correlation value is actually present, the history values will adjust over time to match the incoming values.
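The adaptation of new values can be sketched as a simple ceiling relative to the current history maximum. The ceiling rule, the `spike_factor` parameter and its default value, and the function name are all assumptions of the sketch; the specification does not state how the amplitude is reduced.

```python
from collections import deque

def add_to_history(history, value, spike_factor=4):
    """Append a correlation value, limiting values far above the current maximum.

    Assumption: a value is "much larger" when it exceeds spike_factor times
    the largest value in the history, and it is then limited to that ceiling.
    """
    current_max = max(history, default=0)
    if current_max > 0 and value > spike_factor * current_max:
        value = spike_factor * current_max
    history.append(value)
    return value
```

Because each limited value can itself raise the history maximum, a sustained increase in the incoming values is still followed over successive field periods.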
Once new correlation values have been added, the updated history record is examined to determine the minimum and maximum correlation values. The sequence of history values is also examined to determine if the higher values (which represent fields which are from different source frames) have a decreasing trend, or if the lower values (which represent fields which are from the same source frame) have an increasing trend. Higher maximums and lower minimums are tracked by their very nature, but a downward maximum or upward minimum trend must be consciously tracked. For instance, as shown in
As discussed above, the 1-field difference measurement has a higher baseline value than the 2-field difference due to the fact that a pair of fields will almost never perfectly correlate. According to certain embodiments of the present invention, a method to compensate for this discrepancy is provided. A multiple of the minimum value in the history record is subtracted from both the minimum and the maximum values before they are provided as outputs. A larger multiple is used when the dynamic range value is high and a multiple of 1 is used when the dynamic range is low. One net result of this is that the output minimum is always zero.
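One possible reading of this compensation is sketched below. The cutoff that separates a “high” from a “low” dynamic range, the larger multiple, the clamping at zero, and the function name are assumptions chosen only to make the sketch concrete.

```python
def compensate_baseline(history, high_range_cutoff=1000, high_multiple=2):
    """Subtract a multiple of the history minimum from the minimum and maximum.

    Assumptions: dynamic range is maximum minus minimum, the multiple is
    high_multiple at or above high_range_cutoff and 1 otherwise, and results
    are clamped at zero so that the output minimum is always zero.
    """
    lowest, highest = min(history), max(history)
    multiple = high_multiple if (highest - lowest) >= high_range_cutoff else 1
    offset = multiple * lowest
    return max(0, lowest - offset), max(0, highest - offset)
```

For a history of [100, 120, 900], the sketch returns a minimum of 0 and a maximum of 800.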
In addition, scene transition occurrences in a video sequence can also be detected. A scene transition normally results in a momentary spike, as illustrated in
At block 431 of
Threshold values for 1-field operations and 2-field operations are calculated slightly differently. For one embodiment, for both operations, the threshold value is a fraction of the dynamic range and is calculated by using the following equation: [Dynamic Range/Scaling Factor]+Minimum. The scaling factor varies depending on various factors.
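The threshold equation can be written directly as a short function. The sketch assumes that the dynamic range is the maximum minus the minimum of the history; the function and parameter names are hypothetical.

```python
def threshold(minimum, maximum, scaling_factor):
    """Threshold = (dynamic range / scaling factor) + minimum."""
    dynamic_range = maximum - minimum  # assumed definition of dynamic range
    return dynamic_range / scaling_factor + minimum
```

For example, a minimum of 100, a maximum of 500, and a scaling factor of 8 place the threshold at 150, much closer to the minimum than to the middle of the range.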
The scaling factor for calculating the threshold for the 1-field operation depends largely on the size of the maximum. Because the 1-field minimum is always zero, the maximum is effectively the same as the dynamic range. Larger maximum values cause a larger scaling factor to be used, with a nominal range of 6 to 64. The threshold is biased towards the minimum rather than the middle of the range for two primary reasons. First, the larger correlation values tend to vary more than the minimums, and secondly, because it is better to incorrectly decide that two fields are different than to incorrectly decide that they are the same. In the event of a scene transition, the threshold is decreased to prevent false detections when the transition is from a higher average motion level to a lower average motion level. In such a case, a false detection would indicate that fields are similar when they are really not, resulting in interlace motion artifacts being present in the deinterlaced video signal. If a downward maximum trend is present, then a transition has no effect on the threshold calculation because the threshold level will already have been depressed by the decreasing maximum values.
The 2-field threshold scaling factor is also calculated as a fraction of the 2-field difference dynamic range. The calculation rules are a bit different, however, as the minimum is often not zero and more truly represents the difference between the 2 fields. When the minimum is very small, then the dynamic range scaling factor is chosen based on the minimum amplitude. For larger minimums, the threshold scaling factor is a fixed value. Like the 1-field difference threshold, the 2-field threshold is depressed in the event of a scene transition.
At block 441, the 1-field difference and 2-field difference values are each compared to their respective thresholds. If the difference value is above the threshold, the result is a “1” (or “different”) and if it is below the threshold, the result is a “0” (or “similar”). A history of the comparison results for both 1-field and 2-field differences is maintained for a set number “m” of previous field periods. For one embodiment, m is equal to at least two times n, because to recognize a repeating pattern, at least two instances of the pattern need to be present. The comparison history may be saved in a history buffer.
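A fixed-length comparison history can be sketched as follows. The class name is hypothetical, and treating a difference exactly equal to the threshold as “similar” is an assumption, since that case is not addressed above.

```python
from collections import deque

class ComparisonHistory:
    """Keeps the last m comparison results (1 = different, 0 = similar)."""

    def __init__(self, m):
        self.bits = deque(maxlen=m)

    def add(self, difference, threshold):
        # Above the threshold is "different"; equality is treated as "similar".
        bit = 1 if difference > threshold else 0
        self.bits.append(bit)
        return bit
```

One such history would be kept for the 1-field differences and another for the 2-field differences, each holding the most recent m results.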
At block 451 of
At block 461 of
The basis of the state value assignment is the observation that conversion of a slower frame rate progressive source to a faster frame rate interlaced format requires that multiple sequential fields be taken from the same original source frame. In other words, there will be a sequence of adjacent fields that are all taken from the same source frame. Such a sequence must have a first field and a last field, and may have one or more middle fields.
The possible state values are therefore “Start”, “Middle”, and “End” when a repeating sequence exists. When no repeating sequence exists a state value of “None” is used. Because the 3:2 pulldown technique is very common, and because a 2-field difference is a very reliable indicator of such a pattern, an additional state assignment can be made to cover the two field periods not handled by the other states for the 2-field difference pattern. This state assignment is termed “InBetween.” Thus, as shown in
The state assignments are based on the comparison history as well as the previous state values. The following is a non-exhaustive list of examples of state assignment:
The state assignments are used to determine which fields to use to create the progressive video sequence, as described below.
According to certain embodiments of the present invention, other factors may be also used in determining the presence or absence of a repeating field sequence. One such factor is a “pair quality metric.” This metric can be assigned when two sequential 2-field difference comparisons are made of fields that come from different original source frames. In such a case, one 2-field difference value is a comparison between the even field of original source frame “N” and the even field of original source frame “N+1”, while the other is a comparison of the odd fields of those two source frames. For instance, referring to
The pair quality metric is only assigned when the 1-field and/or 2-field state values indicate that the current and previous 2-field difference values were each comparing at least one field from the same source frames. This second of the pair of 2-field difference values occurs when the state is determined to be “End”, and the first of the pair occurs in the previous field period. The magnitude of the difference between the pair of 2-field difference values is compared to the larger of the two values. Based on the comparison result, a quality value (e.g., very good, good, medium, bad, very bad) is assigned to that pair.
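The comparison of the pair magnitude against the larger value can be sketched as a ratio test. The cutoff values, the rule that a smaller relative difference indicates better quality, and the function name are assumptions of the sketch and are not taken from the specification.

```python
def pair_quality(diff_a, diff_b, cutoffs=(0.1, 0.25, 0.5, 0.75)):
    """Grade a pair of 2-field difference values by their relative disagreement."""
    larger = max(diff_a, diff_b)
    if larger == 0:
        return "very good"  # identical zero differences: no disagreement
    ratio = abs(diff_a - diff_b) / larger
    labels = ("very good", "good", "medium", "bad")
    for cutoff, label in zip(cutoffs, labels):
        if ratio <= cutoff:
            return label
    return "very bad"
```

Normalizing by the larger value keeps the grade independent of the overall signal level, so the same cutoffs can apply to both low-motion and high-motion material.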
At block 471, based on the correlation history, state, any detected cadence, dynamic range quality, pair quality and scene transition values, the method determines if a lock can be acquired on the signal. A lock is considered to have occurred on the signal when the indicators as a whole provide a high confidence that a field sequence has come from a progressive source by extracting multiple fields from each original source frame. In other words, the signal has been recognized as having a valid repeating-field pattern and field pairs can be combined to deinterlace the video signal. A lock can be acquired on a sequence as long as a repeating pattern is detected, regardless of what the pattern actually is.
In order for lock to occur, a repeating pattern must exist. Ideally, the 1-field and 2-field difference values both agree on the pattern type. In certain cases, only one of these is required for lock as long as other quality metrics are sufficiently high. Once lock has initially occurred, the system stays in lock until some other event causes lock to be cleared. In general, it is harder to acquire lock than to lose it. The basic idea is that there needs to be a high confidence in the signal before lock is set and field pairs are combined to form the output, but that once lock has occurred, many types of variations in the sequence pattern can be tolerated. Once the system is in lock the cadence does not need to remain constant and could, for example, change back and forth between common patterns such as 3:2 or 2:2 pulldown without losing lock.
Some factors which could cause loss of lock include scene transitions which cause the 1-field state to become “None,” or both state values being “None,” or very poor quality metrics, or conflicting 1-field and 2-field state values. Recognized cadences that can cause lock to be acquired are 2:2, 3:2, 3:3, 4:4, 5:5, 4:2, 2:2:2:4, 2:3:3:2, 3:2:3:2:2, 6:4 and 8:7.
At block 481 of
Once lock has been acquired, field pairs are combined to form a deinterlaced output stream. The lock state alone does not enable this combination, however. Rather, multiple factors must be present for field pairs to be combined to form the output. These factors include lock, current and previous states, dynamic range and field pair quality, and cadence.
There are “strong” and “weak” detections for the various field combination possibilities. A strong detection generally occurs when both the 1-field and 2-field measurements are in agreement and the quality metrics are not too low. Weak detections occur when only one of the 1-field or 2-field measurements is valid and require that the quality metrics are high. There can be a prioritized sequence of decisions regarding which fields to combine, with strong detections having precedence and the weak detections being valid only if the strong detections are not. This prevents a weak detection decision criterion from being used when a higher-confidence, strong detection is present.
As described above, when the current state is “Start” or “Middle,” then the current and previous fields are combined. When the state is “End,” the previous and second previous fields are combined. When the 2-field state is “InBetween” (and there is no conflicting 1-field state), the first of the two “InBetween” states is treated as a “Start” and the second is treated as an “End.” When no valid overall system state is determined to exist, then the lock signal is de-asserted, combination of field pairs stops, and the system drops back to motion-adaptive deinterlacing. At block 491, a deinterlaced frame is output.
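The field selection rules of this paragraph can be sketched as a small lookup. The function names and the use of integer field indices are assumptions; a return value of None stands for the fallback to motion-adaptive deinterlacing.

```python
def resolve_in_between(state, first_of_two):
    """Map the 2-field "InBetween" state to "Start" (first) or "End" (second)."""
    if state == "InBetween":
        return "Start" if first_of_two else "End"
    return state

def fields_to_combine(state, current_index):
    """Return the pair of field indices to merge, or None to fall back."""
    if state in ("Start", "Middle"):
        return (current_index - 1, current_index)      # previous and current
    if state == "End":
        return (current_index - 2, current_index - 1)  # second previous and previous
    return None  # no valid state: fall back to motion-adaptive deinterlacing
```

The sketch covers only the selection of fields; the lock, quality, and cadence conditions that gate the combination are outside its scope.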
All of the various measurements and quality metrics, particularly as described in reference to
While
Embodiments of the invention can take the form of instructions, such as program modules, being executed by processors. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types. The sequences of instructions implemented in a particular data structure or program module represent examples of corresponding acts for implementing the functions or steps described herein.
As used herein, the term “embodiment” means an embodiment that serves to illustrate by way of example but not limitation. Also, reference has been made to an image represented by pixels. However, in other embodiments, the image can be represented by any convenient and/or known discrete component that forms the basic unit of the composition of an image.
It will be appreciated by those skilled in the art that the preceding examples and embodiments are exemplary and not limiting to the scope of the present invention. It is intended that all permutations, enhancements, equivalents, and improvements thereto that are apparent to those skilled in the art upon a reading of the specification and a study of the drawings are included within the true spirit and scope of the present invention. It is therefore intended that the following appended claims include all such modifications, permutations and equivalents as fall within the true spirit and scope of the present invention.
This Application claims the benefit of U.S. Provisional Application No. 60/715,711 entitled “Source-Adaptive Video Deinterlacer,” filed on Sep. 8, 2005, which is incorporated by reference.
Number | Date | Country | |
---|---|---|---|
20070052846 A1 | Mar 2007 | US |
Number | Date | Country | |
---|---|---|---|
60715711 | Sep 2005 | US |