The present invention relates to de-interlacing used in video devices.
Many video signals are processed in an interlaced format, where each frame of the video signal is separated into odd and even fields that are alternately displayed to produce the illusion of a single image. For example, in NTSC standard video signals, odd and even fields alternate every 1/60th of a second to produce frames at an overall rate of 30 frames per second. In addition, other standard video formats are interlaced, such as 480i, 720i, 1080i, etc. Deinterlacing is the process of reconstructing whole frames from interlaced fields, for instance when an interlaced signal is converted to a progressive scan signal, such as a 480p, 720p or 1080p formatted signal.
Many deinterlacing algorithms produce undesirable artifacts that are visible to the viewer. The efficient and accurate de-interlacing of video signals is important to the implementation of many video devices, particularly video devices that are destined for home use. Further limitations and disadvantages of conventional and traditional approaches will become apparent to one of ordinary skill in the art through comparison of such systems with the present invention.
In an embodiment of the present invention, the video signal 110 is a broadcast video signal, such as a television signal, high definition television signal, enhanced definition television signal or other broadcast video signal that has been transmitted over a wireless medium, either directly or through one or more satellites or other relay stations or through a cable network, optical network or other transmission network. In addition, video signal 110 can be generated from a stored video file, played back from a recording medium such as a memory, magnetic tape, magnetic disk or optical disk, and/or can include a streaming video signal that is transmitted over a public or private network such as a local area network, wide area network, metropolitan area network or the Internet.
Video signal 110 can include an analog or digital video signal that is formatted in any of a number of interlaced video formats. Processed video signal 112 can include a progressive scan video signal, such as a 480p, 720p or 1080p signal, or another analog or digital de-interlaced video signal.
Video display device 104 can include a television, monitor, computer, handheld device or other video display device that creates an optical image stream either directly or indirectly, such as by projection, based on decoding the processed video signal 112 either as a streaming video signal or by playback of a stored digital video file.
The deinterlacer 135 can be implemented using a single processing device or a plurality of processing devices. Such a processing device may be a microprocessor, co-processor, micro-controller, digital signal processor, microcomputer, central processing unit, field programmable gate array, programmable logic device, state machine, logic circuitry, analog circuitry, digital circuitry, and/or any device that manipulates signals (analog and/or digital) based on operational instructions that are stored in a memory. Such a memory may be a single memory device or a plurality of memory devices and can include a hard disk drive or other disk drive, read-only memory, random access memory, volatile memory, non-volatile memory, static memory, dynamic memory, flash memory, cache memory, and/or any device that stores digital information. Note that when the processing module implements one or more of its functions via a state machine, analog circuitry, digital circuitry, and/or logic circuitry, the memory storing the corresponding operational instructions may be embedded within, or external to, the circuitry comprising the state machine, analog circuitry, digital circuitry, and/or logic circuitry.
In accordance with the present invention, the deinterlacer 135 includes many optional functions and features described in conjunction with the figures that follow.
The interpolation module 200 generates the deinterlaced video signal 214 based on the first pixel values when corresponding pixel motion values of the motion map 212 are within a first range of values. In particular, for areas of a processed image that include pixels that exhibit motion or an amount of motion that is greater than some motion threshold, spatial interpolation is used to fill in the missing pixels. Further, interpolation module 200 generates the deinterlaced video signal 214 based on the second pixel values when the corresponding pixel motion values of the motion map 212 are within a second range of values. In this fashion, for areas of a processed image that include pixels that exhibit no motion or an amount of motion that is less than some motion threshold, temporal interpolation is used to fill in the missing pixels.
In an embodiment of the present invention, interpolation module 200 generates the deinterlaced video signal 214 by blending the first pixel values and the second pixel values when the corresponding pixel motion value falls within a third range of values. In this embodiment, the third range of values corresponds to low motion that falls between the first and second ranges of values. For areas of a processed image that include pixels that exhibit an amount of motion that is greater than the threshold that defines the upper boundary of the second range of values, but less than the threshold that defines the lower boundary of the first range of values, pixel values resulting from spatial and temporal interpolation are blended to fill in the missing pixels. In particular, the blending factor can vary monotonically via a linear interpolation function, or alternatively the blending can be achieved non-linearly through a pre-defined look-up table. When motion values are closer to the boundary of the second range of values, temporal interpolation can dominate the blending. Similarly, when the amount of motion is closer to the boundary of the first range of values, spatial interpolation can dominate the blending.
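Purely for illustration, the following sketch shows one way the selection and blending described above might be realized. The threshold values LOW and HIGH and the linear blending weight are assumptions; the actual range boundaries and any look-up table are implementation choices.

```python
# Motion-adaptive selection/blend sketch. LOW and HIGH are assumed
# boundaries of the second (static) and first (moving) ranges.
LOW, HIGH = 2, 8

def interpolate_pixel(spatial_val, temporal_val, motion):
    """Select or blend spatially and temporally interpolated pixel values
    based on a 4-bit pixel motion value (0..15) from the motion map."""
    if motion >= HIGH:                    # first range: clear motion
        return spatial_val
    if motion <= LOW:                     # second range: static area
        return temporal_val
    # Third range: low motion. Temporal interpolation dominates near LOW,
    # spatial interpolation dominates near HIGH; linear blend assumed here.
    w = (motion - LOW) / float(HIGH - LOW)
    return w * spatial_val + (1.0 - w) * temporal_val
```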
The motion detection module 210 also includes an adjacent field module 224 that generates a plurality of motion flags 226 based on a comparison of pixel values between adjacent fields of the video signal 110. The motion integration module 228 generates the motion map 212 by integrating the odd and even field motion map 222 and the plurality of motion flags 226. In particular, each of the motion flags corresponds to one of the pixel motion values, and the motion integration module 228 increases a pixel motion value when the corresponding motion flag indicates motion.
The operation of motion detection module 210 and interpolation module 200, as presented in conjunction with the accompanying figures, can be described through the following example.
Motion detection module 210 generates the odd field motion map and the even field motion map in the same way. Thus, three consecutive fields can be considered as the input without identifying them as odd or even.
Differences can be calculated based on the following equation:
where F[a,b,c] represents the Y component of the pixel at row a, column b of field c, and where the weights are shown in Table 2.
The motion level for pixel (x,y) is then quantized; the motion map uses 4 bits per pixel, which yields a maximum value of 15 for each pixel motion value.
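As a hedged illustration of the skip-field computation, the sketch below applies a weighted 3×3 difference between two same-parity fields and quantizes the result to 4 bits. The weight kernel stands in for Table 2, which is not reproduced in this excerpt, and the quantization step is likewise an assumption.

```python
import numpy as np

# Assumed 3x3 weight kernel standing in for Table 2 (not reproduced here);
# the weights sum to 16 so that normalization is a hardware-friendly shift.
WEIGHT = np.array([[1, 2, 1],
                   [2, 4, 2],
                   [1, 2, 1]])

def skip_field_motion_map(f_prev, f_next):
    """Pixel motion values from two same-parity fields (Y component),
    quantized to 4 bits (0..15). Boundary rows/columns are left at zero,
    as described below."""
    h, w = f_prev.shape
    mmap = np.zeros((h, w), dtype=np.uint8)
    diff = np.abs(f_prev.astype(np.int32) - f_next.astype(np.int32))
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            d = (WEIGHT * diff[y-1:y+2, x-1:x+2]).sum() >> 4
            mmap[y, x] = min(d >> 2, 15)   # assumed 4-bit quantization
    return mmap
```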
In the boundary area, there is no 3×3 neighborhood available. In an embodiment of the present invention, the skip field module 220 applies the equation only on the area of:
[1, W−2]×[1, H−2]
where the whole field occupies the area [0, W−1]×[0, H−1], with W as its width and H as its height. The pixel motion values on the first and last lines and columns can be set to zero.
Adjacent field module 224 uses two adjacent fields of different parity (odd and even, or even and odd) to detect the high frequency content caused by motion. This is especially useful in conditions of high motion, when the pixel difference between consecutive odd or consecutive even fields is not able to detect the motion. In this example, adjacent field module 224 uses both F[n−1] and F[n], and F[n] and F[n+1], to detect motion based on nearby pixels in several local windows presented in Table 3.
Adjacent field module 224 detects vertical and diagonal lines to decide whether there is motion. For example, in the first window shown in Table 3, if the three pixels F[x−1,y−1,n−1], F[x−1,y,n] and F[x−1,y+1,n−1] follow the pattern of lower intensity, higher intensity, lower intensity, then it is possible that this phenomenon is caused by motion. The opposite pattern (higher intensity, lower intensity, higher intensity) is treated the same way. The assumption behind this idea is that spatial correlation is destroyed by new content when there is motion; thus the pixels in the new field interrupt the local spatial smoothness of the previous field.
To improve robustness, three lines are detected rather than a single line. Vertical lines are detected in the first window and diagonal lines in the second and third windows. In one mode of operation, adjacent field module 224 operates as described below:
Taking the first line in the first window as an example, the Lower-Higher-Lower pattern is detected when:

F[x−1,y−1,n−1] < F[x−1,y,n] AND
F[x−1,y+1,n−1] < F[x−1,y,n] AND
|F[x−1,y−1,n−1] − F[x−1,y+1,n−1]| < α AND
|F[x−1,y−1,n−1] − F[x−1,y,n]| > β AND
|F[x−1,y+1,n−1] − F[x−1,y,n]| > β

where β = 2|F[x−1,y−1,n−1] − F[x−1,y+1,n−1]|. The Higher-Lower-Higher pattern is detected when:

F[x−1,y−1,n−1] > F[x−1,y,n] AND
F[x−1,y+1,n−1] > F[x−1,y,n] AND
|F[x−1,y−1,n−1] − F[x−1,y+1,n−1]| < α AND
|F[x−1,y−1,n−1] − F[x−1,y,n]| > β AND
|F[x−1,y+1,n−1] − F[x−1,y,n]| > β

where β = 2|F[x−1,y−1,n−1] − F[x−1,y+1,n−1]|.
Here, α = 15 is a pre-defined threshold. When all three vertical lines follow either one of the two patterns (Lower-Higher-Lower or Higher-Lower-Higher), then F[x,y,n] is flagged as a motion pixel.
For the first line in the second window:
Lower-Higher-Lower pattern:
F[x−1,y−1,n−1]<F[x,y,n] AND
F[x+1,y+1,n−1]<F[x,y,n] AND
|F[x−1,y−1,n−1]−F[x+1,y+1,n−1]|<α AND
|F[x−1,y−1,n−1]−F[x,y,n]|>β AND
|F[x+1,y+1,n−1]−F[x,y,n]|>β
where β = 2|F[x−1,y−1,n−1] − F[x+1,y+1,n−1]|.
Higher-Lower-Higher pattern:
F[x−1,y−1,n−1]>F[x,y,n] AND
F[x+1,y+1,n−1]>F[x,y,n] AND
|F[x−1,y−1,n−1]−F[x+1,y+1,n−1]|<α AND
|F[x−1,y−1,n−1]−F[x,y,n]|>β AND
|F[x+1,y+1,n−1]−F[x,y,n]|>β
where β = 2|F[x−1,y−1,n−1] − F[x+1,y+1,n−1]|.
The same technique can be used for the other windows of Table 3. For boundaries, the same strategy can be applied as used in the skip field module 220, i.e., setting the boundary areas to zero and evaluating only the interior region where the full windows are available.
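To make the pattern test concrete, the following sketch checks the Lower-Higher-Lower and Higher-Lower-Higher conditions for one line and requires all three lines of a window to agree. The exact window geometry comes from Table 3, which is not reproduced in this excerpt, so the caller is assumed to supply the pixel triples.

```python
def follows_pattern(a, b, c, alpha=15):
    """True when (a, b, c) forms a Lower-Higher-Lower or Higher-Lower-Higher
    line: a and c are same-parity pixels from field n-1, b is the pixel
    between them from field n. alpha = 15 is the pre-defined threshold from
    the text; beta is derived exactly as specified above."""
    beta = 2 * abs(a - c)
    strong = abs(a - c) < alpha and abs(a - b) > beta and abs(c - b) > beta
    return strong and ((a < b and c < b) or (a > b and c > b))

def window_indicates_motion(lines):
    """A window flags motion only when all three of its lines follow one of
    the two patterns. `lines` is a sequence of (a, b, c) pixel triples drawn
    from the window geometry of Table 3."""
    return all(follows_pattern(a, b, c) for (a, b, c) in lines)
```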
Noise reduction module 234 examines and possibly modifies the motion data 232 to prevent false motion detection. Two steps can be used for this noise reduction, such as eliminating isolated motion data and applying morphological erosion and/or dilation, as described further below.
Motion integration module 228 operates to integrate the odd and even field motion maps 222 from the skip field module 220 with the motion flags 226 from the adjacent field module 224. For example, if a motion flag 226 is set for a pixel, then the pixel motion value (the pixel difference) will be raised a certain number of levels, such as 3 levels, up to a maximum of 15. The result is the final motion map 212.
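A minimal sketch of this integration step, assuming the motion flags are provided as a boolean mask over the field:

```python
import numpy as np

def integrate_motion(motion_map, motion_flags, boost=3):
    """Raise the pixel motion value by a fixed number of levels (3 here,
    as in the example above) wherever the adjacent-field motion flag is
    set, saturating at the 4-bit maximum of 15."""
    out = motion_map.astype(np.int32)
    out[motion_flags] += boost
    return np.clip(out, 0, 15).astype(np.uint8)
```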
It should be noted that motion detection module 210 can also be used to detect if two fields are identical (for 3:2 pull-down detection). The key to 3:2 pull-down detection is finding repeated fields. Therefore, deinterlacer 135 can simply count the number of pixels below a threshold in the motion map 212 or the odd and even field motion maps 222 to see if the two fields are almost the same, within some tolerance. The threshold can be chosen as a small number that is above the expected noise level.
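For example, the repeated-field test might be sketched as follows; the noise threshold and tolerance fraction here are illustrative assumptions:

```python
def fields_repeat(motion_map, noise_threshold=1, tolerance_fraction=0.99):
    """Treat two fields as identical (a repeated field for 3:2 pull-down
    detection) when nearly all pixel motion values fall below a small
    threshold chosen above the expected noise level."""
    below = (motion_map < noise_threshold).sum()
    return below >= tolerance_fraction * motion_map.size
```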
Continuing with this example, interpolation module 200 operates in three basic modes, defined by the relationship of pixel motion values of the motion map 212 to low and high thresholds.
In an embodiment of the present invention, the spatial interpolation module 202 generates pixel values for an image within a window or other area of the image based on an interpolation along a best orientation, an interpolation along a local gradient, and/or a weighted vertical interpolation. In operation, the spatial interpolation module can detect a best orientation. For example, one of 33 possible orientations can be detected. There are 16 orientations between 90 degrees and 0 degrees (indicated as 7.12, 7.59, 8.13, . . . 45, 64) and 16 orientations between 90 degrees and 180 degrees (indicated as −7.12, −7.59, −8.13, . . . −45, −64). The smallest orientation is tan⁻¹(2/16) ≈ 7.12 degrees in the full frame domain, i.e., 16 pixels in the horizontal and 2 pixels in the vertical direction. This is equivalent to 16 pixels in the horizontal and 1 pixel in the vertical direction in the field domain.
To detect the orientation, the spatial interpolation module 202 calculates the weighted absolute difference along each orientation. The smaller the difference, the more likely the true orientation is detected. The spatial interpolation module 202 uses a 4×32 neighborhood to calculate the gradient along each orientation.
The formula for the 90 degree orientation is given below:
D90_1 = 2|pL(0,N) − pL(1,N)| + 4|pL(1,N) − pL(2,N)| + 2|pL(2,N) − pL(3,N)|
D90_2 = |pL(0,N−1) − pL(1,N−1)| + 2|pL(1,N−1) − pL(2,N−1)| + |pL(2,N−1) − pL(3,N−1)|
D90_3 = |pL(0,N+1) − pL(1,N+1)| + 2|pL(1,N+1) − pL(2,N+1)| + |pL(2,N+1) − pL(3,N+1)|
D90 = (D90_1 + D90_2 + D90_3) >> 4
where pL(x,y) denotes the pixel intensity (Y component) at location (x,y) in the current field, and pL(x,N) denotes the pixel at the central location N in the horizontal (y) direction.
The calculation for the other orientations is performed in a similar fashion: calculate the weighted absolute difference and normalize it. The central line(s) are weighted more than the other lines, for two reasons: 1) the central lines are closer to the missing pixel and thus should be given more weight; 2) the weights are designed so that the normalization is a division by a power of two, which is hardware friendly.
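Following the formula above, a direct sketch of the 90-degree difference calculation is:

```python
def d90(pL, N):
    """Weighted absolute difference along the 90-degree orientation. pL is
    a 4-row field neighborhood (rows 0..3 of the 4x32 window) of integer
    intensities; N indexes the central column. The weights sum to 16, so
    the normalization is a right shift by 4."""
    d1 = (2 * abs(pL[0][N] - pL[1][N]) + 4 * abs(pL[1][N] - pL[2][N])
          + 2 * abs(pL[2][N] - pL[3][N]))
    d2 = (abs(pL[0][N-1] - pL[1][N-1]) + 2 * abs(pL[1][N-1] - pL[2][N-1])
          + abs(pL[2][N-1] - pL[3][N-1]))
    d3 = (abs(pL[0][N+1] - pL[1][N+1]) + 2 * abs(pL[1][N+1] - pL[2][N+1])
          + abs(pL[2][N+1] - pL[3][N+1]))
    return (d1 + d2 + d3) >> 4
```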
The spatial interpolation module 202 can perform interpolation along the best detected orientation. However, because of the small vertical neighborhood, the detected orientation is usually not very reliable. Also, a pixel far from the central pixel has relatively weak correlation compared with its close neighbors; therefore, in many cases the best orientation, judged only by its least difference value, may be wrong. This is especially true for small angle orientations (imagine that two pixels 14 pixels away on either side are used for interpolation; real risk exists). For this purpose, a smoothness measure is introduced based on the following observation: if there is a line with a small angle orientation, then it usually has a relatively smooth intensity change within the line segment in the horizontal direction, except at the boundary between the line and the background.
A general horizontal gradient HGG can also be calculated for the neighborhood:
After calculating the orientation, line smoothness measurement, and general horizontal gradient, the spatial interpolation module 202 can apply interpolation along the best orientation. Possible interpolation scenarios can be classified into four types:
1) Smooth area
2) Line segment area
3) Texture area
4) Other area
For a smooth area, the spatial interpolation module 202 can apply a weighted average using a small 2×3 neighborhood. For a line segment area, spatial interpolation module 202 can interpolate the missing pixel along the best orientation. If the conditions for the above two situations fail, spatial interpolation module 202 can use a 2×3 neighborhood to determine whether the region is a fine, sharp texture region, and use an edge adaptive method in this small neighborhood when the edge is strong. Otherwise, spatial interpolation module 202 can use weighted vertical interpolation for any area that does not belong to one of these other three classes. This cascade is sketched below.
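The sketch uses hypothetical predicate and interpolation callables standing in for the detailed tests described in the text:

```python
def classify_and_interpolate(window, predicates, methods):
    """Dispatch among the four area types in the order described above.
    `predicates` maps an area type to a test on the local window and
    `methods` maps it to the corresponding interpolation; both are
    hypothetical stand-ins for the detailed conditions in the text."""
    for area in ("smooth", "line_segment", "texture"):
        if predicates[area](window):
            return methods[area](window)
    return methods["other"](window)   # weighted vertical interpolation
```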
A smooth area can be detected by checking whether all the pixels in a 4×5 neighborhood have similar intensity values (differing by less than 15). The spatial interpolation for a smooth area is a weighted average using the 2×3 kernel (Wi,j) shown in the table below:
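As an illustration, the smooth-area test and weighted average might be sketched as below; the 2×3 kernel values are assumptions standing in for the Wi,j table, which is not reproduced in this excerpt.

```python
import numpy as np

def is_smooth(neigh_4x5, threshold=15):
    """Smooth-area test: all pixels in the 4x5 neighborhood have similar
    intensity (maximum spread below the threshold of 15 from the text)."""
    return int(neigh_4x5.max()) - int(neigh_4x5.min()) < threshold

# Assumed 2x3 kernel standing in for the Wi,j table; the weights sum to 8
# so the divide is a hardware-friendly shift.
W = np.array([[1, 2, 1],
              [1, 2, 1]])

def smooth_interpolate(neigh_2x3):
    """Weighted average over the 2x3 neighborhood straddling the missing
    line (one row above and one row below the missing pixel)."""
    return int((W * neigh_2x3).sum()) >> 3
```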
When a strong orientation is detected, it usually indicates that a line segment exists. However, because the neighborhood is small in the vertical direction, the detection may be wrong. Therefore, several conditions can be used as constraints to increase the likelihood that the orientation detection is correct. For example, if the number of small orientations is greater than a small orientation threshold, then there is a strong hint that this is an "S" texture area rather than a line segment area; if the minimum orientation difference is larger than a minimum orientation threshold, then a line segment area is not likely; and if the line smoothness corresponding to the minimum orientation is greater than a line smoothness threshold, then the minimum orientation is not trusted to be a line segment.
The list below presents example interpolations along a detected orientation. Note that interpolation along the detected orientation is only performed after the conditions are checked as discussed above.
where ix and ix2 are the corresponding indices along the minimum orientation. For orientations 33, 26, . . . 7.5, ix = 1, 2, 3, 4, . . . 8 and ix2 = 2, 2, 4, 4, 6, 6, . . . , 8, 8. For −33, −26, . . . −7.5, ix = −1, −2, −3, −4, . . . −8 and ix2 = −2, −2, −4, −4, −6, −6, . . . , −8, −8.
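One plausible reading of such an interpolation averages samples offset by ix and ix2 on the field lines above and below the missing pixel; the pairing of offsets with rows is an assumption, since the original list of interpolation formulas is not reproduced in this excerpt.

```python
def interpolate_along_orientation(pL, N, ix, ix2):
    """Sketch of interpolation along a detected orientation: average
    samples from the field line above (row 1) and below (row 2) the
    missing pixel, offset by the orientation indices ix and ix2.
    The exact formula in the original text may differ."""
    return (pL[1][N + ix] + pL[2][N - ix] +
            pL[1][N + ix2] + pL[2][N - ix2]) // 4
```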
The spatial interpolation module 202 can perform texture interpolation as presented below:
Tn1=20
Tn2=40
Tn3=30
Tn4=15
d0=|p0−p5|
d1=|p1−p4|
d2=|p2−p3|
x = (p1 + p4)/2
x = (q_L + p1 + p4 + q_L − F − A)/2
As discussed above, when the motion map 212 indicates a low motion level, the interpolation module 200 applies a blended temporal-spatial interpolation. If the pixel is not moving according to the motion map 212, then the missing pixel is assigned the value of the same pixel location in the previous field. In boundary areas where the neighborhood is not large enough to apply one or more of the techniques listed above, a simplified adaptive Bob/Weave can be applied: if the pixel motion value in the motion map is larger than 2, then Bob (i.e., vertical interpolation) is used; otherwise, Weave (i.e., the average of the previous field pixel and the next field pixel) is used.
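A minimal sketch of this boundary-area fallback:

```python
def boundary_pixel(motion, above, below, prev_same, next_same):
    """Simplified adaptive Bob/Weave for boundary areas: Bob (vertical
    interpolation from the lines above and below) when the pixel motion
    value exceeds 2, otherwise Weave (average of the co-located pixels in
    the previous and next fields)."""
    if motion > 2:
        return (above + below) // 2          # Bob
    return (prev_same + next_same) // 2      # Weave
```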
The post processing module 240 attempts to further reduce noise, whether from the video signal 110 or from the de-interlacing process itself. Therefore, both the original pixels and interpolated pixels can be used in this process. For example, speckle reduction module 242 can operate when the current pixel differs by more than some speckle reduction threshold from each of its eight neighbors in a 3×3 neighborhood, replacing the pixel with the average of its eight neighbors. Low pass filter module 244 can also be applied to further reduce noise by spatial filtering. The kernel is shown in Table 8 below.
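For example, the speckle reduction step might be sketched as follows; the threshold value is an illustrative assumption, and the low pass kernel of Table 8 is not reproduced here.

```python
import numpy as np

def reduce_speckle(img, threshold=32):
    """Replace a pixel with the average of its eight neighbors when it
    differs from each of them by more than the speckle reduction
    threshold (the value 32 is an assumed example)."""
    out = img.copy()
    h, w = img.shape
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            neigh = img[y-1:y+2, x-1:x+2].astype(np.int32).flatten()
            center = int(img[y, x])
            others = np.delete(neigh, 4)   # drop the center pixel
            if np.all(np.abs(others - center) > threshold):
                out[y, x] = others.sum() // 8
    return out
```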
While the description above has focused primarily on deinterlacing the Y component of a video signal 110, the U and V components can be deinterlaced by:
In an embodiment of the present invention, step 400 includes generating an odd field motion map wherein the plurality of pixel motion values are based on a comparison of pixel values in a first odd field of the video signal to a second odd field of the video signal, and generating an even field motion map wherein the plurality of pixel motion values are based on a comparison of pixel values in a first even field of the video signal to a second even field of the video signal. Step 400 can further include generating a plurality of motion flags based on a comparison of pixel values between adjacent fields of the video signal. A plurality of motion data can be generated by detecting alternating intensity patterns in the adjacent fields of the video signal, wherein the plurality of motion flags are based on the motion data. The motion data can be modified to eliminate isolated motion data, when isolated motion data is contained in the motion data. The motion data can also be modified based on at least one of: a morphological erosion, and a morphological dilation.
Step 400 can include generating the motion map by integrating the odd-field motion map, the even-field motion map and the plurality of motion flags. The plurality of motion flags can each correspond to one of the plurality of pixel motion values and integrating the odd-field motion map, the even-field motion map and the plurality of motion flags can include increasing selected ones of the pixel motion values when the corresponding ones of the plurality of motion flags indicate motion.
Step 402 can generate the first pixel values within an area based on at least one of: an interpolation along a best orientation, an interpolation along a local gradient, and a weighted vertical interpolation. The deinterlaced signal can include a Y component, a U component and a V component, and the motion map can be generated based on the Y component.
While particular combinations of various functions and features of the present invention have been expressly described herein, other combinations of these features and functions are possible. The present invention is not limited by the particular examples disclosed herein, and such other combinations are expressly incorporated within the scope of the present invention.
As one of ordinary skill in the art will appreciate, the term “substantially” or “approximately”, as may be used herein, provides an industry-accepted tolerance to its corresponding term and/or relativity between items. Such an industry-accepted tolerance ranges from less than one percent to twenty percent and corresponds to, but is not limited to, component values, integrated circuit process variations, temperature variations, rise and fall times, and/or thermal noise. Such relativity between items ranges from a difference of a few percent to magnitude differences. As one of ordinary skill in the art will further appreciate, the term “coupled”, as may be used herein, includes direct coupling and indirect coupling via another component, element, circuit, or module where, for indirect coupling, the intervening component, element, circuit, or module does not modify the information of a signal but may adjust its current level, voltage level, and/or power level. As one of ordinary skill in the art will also appreciate, inferred coupling (i.e., where one element is coupled to another element by inference) includes direct and indirect coupling between two elements in the same manner as “coupled”. As one of ordinary skill in the art will further appreciate, the term “compares favorably”, as may be used herein, indicates that a comparison between two or more elements, items, signals, etc., provides a desired relationship. For example, when the desired relationship is that signal 1 has a greater magnitude than signal 2, a favorable comparison may be achieved when the magnitude of signal 1 is greater than that of signal 2 or when the magnitude of signal 2 is less than that of signal 1.
As the term module is used in the description of the various embodiments of the present invention, a module includes a functional block that is implemented in hardware, software, and/or firmware that performs one or more functions, such as the processing of an input signal to produce an output signal. As used herein, a module may contain submodules that themselves are modules.
Thus, there has been described herein an apparatus and method, as well as several embodiments including a preferred embodiment, for implementing a deinterlacer. Various embodiments of the present invention herein-described have features that distinguish the present invention from the prior art.
It will be apparent to those skilled in the art that the disclosed invention may be modified in numerous ways and may assume many embodiments other than the preferred forms specifically set out and described above. Accordingly, it is intended by the appended claims to cover all modifications of the invention which fall within the true spirit and scope of the invention.