These and other features, aspects, and advantages of the present invention will become better understood with regard to the following description, appended claims, and accompanying drawings where:
In the following description, numerous specific details are set forth to provide a thorough understanding of the present invention. The present invention may be practiced without these specific details. The description and representation herein are the means used by those experienced or skilled in the art to effectively convey the substance of their work to others skilled in the art. In other instances, well-known methods, procedures, components, and circuitry have not been described in detail since they are already well understood and to avoid unnecessarily obscuring aspects of the present invention.
Reference herein to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one implementation of the invention. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Further, the order of blocks in process flowcharts or functional diagrams representing one or more embodiments does not inherently indicate any particular order nor imply limitations in the invention.
As used herein, the singular forms “a”, “an”, and “the” are intended to include the plural forms as well, unless the context indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising” specify the presence of stated features, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, and/or groups thereof.
According to one exemplary embodiment of the present invention as shown in
In general, there are three components in a video signal, such as RGB or YCbCr, which means that each pixel comprises three component values. The pixel value referred to herein represents one of the components, such as the brightness component Y or one of the color components Cb and Cr. To facilitate the description of the present invention, unless otherwise stated, the following description uses the brightness component Y as an example; the same applies to the other two components.
According to one aspect of the present invention, an edge detection operation is performed as follows: 1) calculating pixel difference flags for a plurality of slanted modes when performing interpolation of two adjacent rows in an interlaced video image or field; 2) calculating a weighted score for each slanted direction mode of step 1 using a target pixel as a center point; and 3) selecting one of the slanted directions by using the weighted scores of each slanted direction for the target pixel to determine whether an object edge crosses the target pixel and at what angle, resulting in one of the interpolation schemes.
With a chosen direction for calculating the pixel difference and the corresponding interpolation scheme, a final interpolated image (i.e., a de-interlaced image) can be obtained with the next two steps: obtaining the target pixel value by interpolating existing pixels using the selected direction and interpolation scheme; and finally checking the obtained target pixel to determine whether a bad point has been introduced by an incorrect interpolation and, if so, erasing the bad point to further improve the quality of the de-interlaced video image.
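As a rough illustration only, the following Python sketch wires these steps together for one omitted row. The helper names and their trivial bodies are placeholders of our own, not the patent's algorithms; the real edge detection, interpolation and bad-point correction are detailed in the sections that follow.

def detect_edge_direction(upper, lower, x):
    # Placeholder: a full implementation would evaluate the 18 slanted modes (see below).
    return 0  # 0 stands for the vertical-column scheme

def interpolate_along(upper, lower, x, direction):
    # Placeholder: the vertical-column average stands in for all interpolation schemes.
    return (upper[x] + lower[x]) // 2

def correct_bad_point(value, up, down):
    # Placeholder for the bad-point check performed at 305 (detailed later).
    return value

def generate_omitted_row(upper, lower):
    """upper, lower: the existing pixel rows above and below the omitted row."""
    row = []
    for x in range(len(upper)):
        direction = detect_edge_direction(upper, lower, x)     # edge detection, steps 1) to 3)
        value = interpolate_along(upper, lower, x, direction)  # directional interpolation
        row.append(correct_bad_point(value, upper[x], lower[x]))  # bad-point check
    return row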
Using the 18 slanted modes example of
Threshold value or tolerance EAIT[n] is used for determining whether two pixels in a particular slanted direction are the same or nearly the same. The absolute value of the pixel difference is used here to ensure the comparison is meaningful. Generally, a pixel value is represented by an unsigned 4- to 16-bit number. The most commonly used pixel value is represented by an 8-bit integer whose value range is 0-255. When the absolute value of the pixel difference ranges between 0 and 255, the tolerance EAIT[n] should be selected in the range between 16 and 64. The same ratio should be used when an integer of another size is used to represent the pixel value. For example, if the pixel value is represented by one additional bit, the EAIT[n] range should be doubled; if the pixel value is represented by one less bit, the EAIT[n] range should be halved. When the tolerance is set very tight, detecting an object edge crossing the target pixel is more accurate; on the flip side, some real edges would be missed because of the tight tolerance. In addition, in the present invention, the tolerance may be set to a different value in each direction. If the tolerances are set to different values, the slanted direction with the smaller slanted angle should have a tighter tolerance, that is, EAIT[1]≧EAIT[2]≧ . . . ≧EAIT[9].
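As a worked illustration of the scaling rule just described (the function name is ours), the suggested 8-bit range of 16 to 64 maps onto other bit depths as follows:

def eait_range(bit_depth):
    # EAIT[n] range of 16..64 for 8-bit pixels, doubled per extra bit and halved per missing bit.
    scale = 2.0 ** (bit_depth - 8)
    return int(16 * scale), int(64 * scale)

# eait_range(8) -> (16, 64); eait_range(10) -> (64, 256); eait_range(7) -> (8, 32)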
In an interlaced video image with each row including N pixels (i.e., ImageWidth), P(x,y) represents the pixel located in column x of row y, where x and y start from 0. Row y represents the omitted pixel row, and rows (y−1) and (y+1) represent the existing pixel rows. Abs( ) denotes the absolute value, EAIT[n], n=1, 2, . . . , 9, represents the nine tolerances, one for each Ln/Rn mode pair, and Flag_Ln and Flag_Rn, n=1, 2, . . . , 9, denote the nine pairs of pixel difference flags, one pair for each of the 18 slanted modes. For each pixel in row y, the pixel difference flag for the Ln mode is calculated by the following algorithm in pseudo-code:
Pixel difference flag for the Rn mode is:
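The pseudo-code listings themselves are not reproduced here. The following sketch is a reconstruction from the surrounding description and assumes a particular geometry: the Ln mode compares the pixel n columns to the left in row (y−1) with the pixel n columns to the right in row (y+1), and the Rn mode mirrors this; pixels too close to the image border are simply flagged 0. The exact pixel pairs of each mode are defined by the referenced figure and may differ from this assumption.

def pixel_difference_flags(upper, lower, eait):
    """upper, lower: the existing rows (y-1) and (y+1).
    eait: list of length 10 whose entries eait[1]..eait[9] hold EAIT[1] to EAIT[9].
    Returns flag_l, flag_r with flag_l[n][x] = Flag_Ln for the target pixel at column x."""
    width = len(upper)
    flag_l = [[0] * width for _ in range(10)]
    flag_r = [[0] * width for _ in range(10)]
    for n in range(1, 10):
        for x in range(n, width - n):
            # Ln: assumed left-slanted pixel pair through column x of the omitted row
            flag_l[n][x] = 1 if abs(upper[x - n] - lower[x + n]) < eait[n] else 0
            # Rn: assumed right-slanted (mirrored) pixel pair
            flag_r[n][x] = 1 if abs(upper[x + n] - lower[x - n]) < eait[n] else 0
    return flag_l, flag_r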
Next, at 302, the process 300 calculates a weighted score for each of the slanted direction modes using calculated pixel difference flags for one of the omitted pixels (i.e., target pixel). It is evident that those slanted direction modes with pixel difference flag set to zero are not required for any calculations.
To demonstrate how a weighted score of an interpolation mode is calculated, the L1 slanted direction mode is used as an example as shown in
To further demonstrate how the weighted score is calculated,
In order to include more relevant pixel information for generating the target pixel 508, one of the better interpolation schemes is to select a reference area including 4 pixel rows (2 above and 2 below the target pixel). To calculate the weighted score, a set of positive weighting factors or coefficients is assigned to the pixel difference flags of the pixels along the direction parallel to the L1 slanted direction. For the directions perpendicular or nearly perpendicular to the L1 slanted direction and crossing the center line 502, a set of negative weighting coefficients is assigned. In this example, these nearly perpendicular directions include the R2, R3 and/or R4 modes. Other variations that use a different size of reference area may lead to different modes being included in the calculation of the weighted score.
As shown in
The mathematical equation for sum1 of Table 1 can be written as follows:
sum1 = 1×Flag_L1[x−3][y−2] + 3×Flag_L1[x−2][y−2] + 3×Flag_L1[x−1][y−2] + 1×Flag_L1[x][y−2]
       − 1×Flag_L1[x−1][y−2] − 2×Flag_L1[x][y−2] − 1×Flag_L1[x+1][y−2]
Equations for sum2 and sum3 can be written similarly. Three distinct weighted scores, Score[1][0], Score[1][1], Score[1][2] in the L1 slanted direction mode are calculated as follows:
Score[1][0]=sum1+sum2+sum3
Score[1][1]=sum1+sum2
Score[1][2]=sum2+sum3
Score[1][0] is the weighted score summed over all three flag rows, while Score[1][1] and Score[1][2] are the weighted scores of the upper two rows and the lower two rows, respectively. When the target pixel is located on the upper edge of a slanted object, the row (y−2) pixel difference flags would not match the edge character; under this condition, only Score[1][2] represents the slanted edge character of the object. When the target pixel is located on the lower edge of a slanted object, only Score[1][1] represents the slanted edge character. Under other conditions, Score[1][0] is commonly used to represent the slanted edge character, since Score[1][0] contains the most pixel information.
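A minimal sketch of how the three L1-mode scores could be computed. The coefficient pattern (+1, +3, +3, +1 along the edge direction and −1, −2, −1 across it) is taken from the sum1 expression above and simply reused for all three flag rows, with the negative coefficients applied to the flags of a nearly perpendicular mode as described earlier; since Tables 1 to 9 are not reproduced here, the exact per-row coefficients are an assumption.

def weighted_row_sum(flags_parallel, flags_perp, x):
    # +1,+3,+3,+1 on the flags along the L1 direction; -1,-2,-1 on the flags of the
    # (nearly) perpendicular mode, centered on column x.
    pos = (1 * flags_parallel[x - 3] + 3 * flags_parallel[x - 2]
           + 3 * flags_parallel[x - 1] + 1 * flags_parallel[x])
    neg = 1 * flags_perp[x - 1] + 2 * flags_perp[x] + 1 * flags_perp[x + 1]
    return pos - neg

def l1_scores(parallel_rows, perp_rows, x):
    """parallel_rows, perp_rows: the three flag rows (upper, middle, lower) of the
    4-pixel-row reference area. Returns Score[1][0], Score[1][1], Score[1][2]."""
    sum1, sum2, sum3 = (weighted_row_sum(parallel_rows[i], perp_rows[i], x)
                        for i in range(3))
    return sum1 + sum2 + sum3, sum1 + sum2, sum2 + sum3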
Using the method described above for the L1 slanted direction mode, weighted scores of all of the other modes can be calculated. Tables 2 to 9 list the weighted scores of the L2 to L9 modes, respectively. All weighted scores are calculated for 4 rows and M columns (omitted pixels to be generated are shown but not counted).
From the data of Table 1 to Table 9, the following regularities are observed:
Referring back to
303a) choose the largest score Score[kmax][0] from the 18 scores Score[k][0] (k=1, 2, . . . , 18), and determine whether Score[kmax][0] is larger than AES[0]. If true, the process 300 performs operation 303b, otherwise it performs operation 303c.
303b) if the slanted direction mode of Score[kmax][0] is a left slanted direction mode (i.e., one of the Ln modes), then determine whether Score[kmax][0] exceeds each of the right slant mode weighted scores Score[2n][0] (n=1, 2, . . . , 9) by more than RES[0]. If true, the mode direction of Score[kmax][0] is the direction of the object edge on which the target pixel is located, and this interpolation scheme is chosen. Otherwise the vertical column interpolation scheme is chosen. The process 300 finishes operation 303 at this point.
Similarly, if the slanted direction mode of Score[kmax][0] is one of the Rn modes, then determine whether Score[kmax][0] exceeds each of the left slant mode weighted scores Score[2n−1][0] (n=1, 2, . . . , 9) by more than RES[0]. If true, the mode direction of Score[kmax][0] is the direction of the object edge on which the target pixel is located, and this interpolation scheme is chosen. Otherwise the vertical column interpolation scheme is chosen. The process 300 finishes operation 303 at this point.
303c) select the largest score Score[kmax][1] from the 18 scores Score[k][1] (k=1, 2, . . . , 18), and determine whether Score[kmax][1] is larger than AES[1]. If true, the process 300 moves to operation 303d, otherwise to operation 303e.
303d) if the slanted direction mode of Score[kmax][1] is one of the Ln modes, then determine whether Score[kmax][1] exceeds each of the right slant mode weighted scores Score[2n][1] (n=1, 2, . . . , 9) by more than RES[1]. If true, the corresponding mode direction of Score[kmax][1] is the direction of the object edge on which the target pixel is located, and this interpolation scheme is chosen. Otherwise the vertical column interpolation scheme is chosen. The process 300 finishes operation 303 at this point.
If the slanted direction mode of Score[kmax][1] is one of the Rn modes, then determine whether Score[kmax][1] exceeds each of the left slant mode weighted scores Score[2n−1][1] (n=1, 2, . . . , 9) by more than RES[1]. If true, the corresponding mode direction of Score[kmax][1] is the direction of the object edge on which the target pixel is located, and this interpolation scheme is chosen. Otherwise the vertical column interpolation scheme is chosen. The process 300 finishes operation 303 at this point.
303e) select the largest score Score[kmax][2] from the 18 scores Score[k][2] (k=1, 2, . . . , 18), and determine whether a) Score[kmax][2] is larger than AES[2]; and b) Score[kmax][2] exceeds each of the nine weighted scores of the opposite-direction modes (the Rn modes if Score[kmax][2] belongs to an Ln mode, or the Ln modes otherwise, n=1, 2, . . . , 9) by more than RES[2]. If both are true, the mode direction of Score[kmax][2] is the direction of the object edge on which the target pixel is located, and this interpolation scheme is chosen. Otherwise the vertical column interpolation scheme is chosen. The process 300 finishes operation 303 at this point.
Score[k][1] and Score[k][2] are symmetric to each other; therefore AES[1]=AES[2] and RES[1]=RES[2]. In all of the operations above, when Score[kmax][0] is less than or equal to AES[0], the order in which Score[k][1] and Score[k][2] are examined can be exchanged; that is, Score[k][2] may be examined first, and Score[k][1] is examined only when Score[kmax][2] is less than or equal to AES[2].
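A compact sketch of the arbitration in operation 303, written under two assumptions stated above: odd indices (2n−1) denote the Ln modes and even indices (2n) the Rn modes, and RES[j] is treated as the margin by which the best score must exceed every opposite-direction score.

def choose_interpolation_mode(score, aes, res):
    """score[k][j]: weighted scores for k = 1..18 (index 0 unused) and j = 0, 1, 2.
    aes, res: the AES[0..2] and RES[0..2] thresholds.
    Returns the winning slanted mode index, or 0 for vertical-column interpolation."""
    for j in (0, 1, 2):  # three-row score first, then upper-two-row, then lower-two-row
        kmax = max(range(1, 19), key=lambda k: score[k][j])
        if score[kmax][j] <= aes[j]:
            continue                      # no strong absolute edge at this level; try the next
        # Opposite-direction modes: Rn (even) if kmax is an Ln mode (odd), Ln otherwise.
        opposite = range(2, 19, 2) if kmax % 2 == 1 else range(1, 19, 2)
        if all(score[kmax][j] - score[k][j] > res[j] for k in opposite):
            return kmax                   # edge direction found
        return 0                          # ambiguous edge: fall back to vertical interpolation
    return 0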
AES[0], AES[1] and AES[2] should be set in a range of about 50% to 75% of the corresponding upper limit, where the upper limit is the most likely value of Score[k][0], Score[k][1] and Score[k][2] when, in Tables 1 to 9, the flags with positive weighting coefficients equal 1 and the flags with negative weighting coefficients equal 0. AES[0] therefore has a practical range of 16 to 24 with an upper limit of 32. AES[1] and AES[2], which are calculated from only the upper two rows or the lower two rows of weighting coefficients, have an upper limit of 24 in Tables 1 to 4 and 26 in Tables 5 to 9, hence 25 is taken as the upper limit and the corresponding range is 12 to 19. RES[0], RES[1] and RES[2] should use 25% to 50% of the corresponding upper limit as their range. Hence, RES[0] has a range of 8 to 16 with an upper limit of 32; for RES[1] and RES[2] (again calculated from only the upper two rows or the lower two rows of weighting coefficients), the upper limit is 24 in Tables 1 to 4 and 26 in Tables 5 to 9, and the corresponding range is 6 to 12. Higher AES and RES values result in a stricter edge determination, which reduces edge determination sensitivity but raises edge determination accuracy.
There are 19 interpolation schemes, including one vertical direction interpolation scheme and 18 slanted direction interpolation schemes,
After the direction of the object edge has been determined and the interpolation scheme has been selected, the process 300 moves to 304 to generate the target pixel using the selected interpolation scheme. The process 300 then determines whether the selected interpolation scheme is an interpolation in the vertical direction of the target pixel (i.e., in the same column). If so, the process 300 moves to 306; otherwise the process 300 follows the “no” branch to 305.
At 305 the process 300 determines whether the generated target pixel value is within the range defined by the pair of upper and lower row pixels in the same column. If so, nothing is done; otherwise the vertical pair of upper and lower pixels is interpolated to obtain the target pixel value.
In certain fields with complex scenery, an incorrect edge determination is hard to avoid. The resulting bad point from a slanted direction interpolation can differ greatly from the surrounding pixels, which degrades image quality. The objective of 305 is to keep properly generated target pixels while correcting bad points.
For example, the process 300 determines which one of the vertical pair of pixels, P(x,y−1) and P(x,y+1), is larger. The larger value is set as P_max and the other as P_min. A range tolerance, POST_RANGE, is set to a value larger than or equal to 0. If the target pixel value is larger than (P_max+POST_RANGE) or less than (P_min−POST_RANGE), the target pixel is a bad point and needs to be replaced by the vertical column interpolation result, (P(x,y−1)+P(x,y+1))/2. Otherwise the generated target pixel value is good and should be kept. A smaller POST_RANGE leads to more corrections and may even correct properly generated or interpolated target pixels. Therefore, POST_RANGE needs to be selected properly. If the pixel value is represented by an 8-bit integer, POST_RANGE should be between 0 and 64.
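The check at 305 can be written directly from this description (the function and parameter names are ours):

def post_process(value, up, down, post_range=16):
    """Bad-point check of 305. up, down: P(x,y-1) and P(x,y+1); value: the generated pixel."""
    p_max, p_min = max(up, down), min(up, down)
    if value > p_max + post_range or value < p_min - post_range:
        return (up + down) // 2   # bad point: replace with the vertical-column interpolation
    return value

# For 8-bit pixels POST_RANGE is chosen between 0 and 64; with up=200, down=40 and
# post_range=16, any generated value above 216 or below 24 is replaced by (200+40)//2 = 120.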
Next at decision 306, the process 300 determines whether there is another target pixel in the field. If true, the process 300 moves back to 302, otherwise the process 300 moves to another decision 307.
At 307, the process 300 sends out a de-interlaced image from the interlaced video image, then determines if there is another interlaced video image to be processed. If true, the process 300 moves back to 301, otherwise the process 300 ends.
In an alternative embodiment, for each of the n modes (n=1, 2, . . . , 9) at 301, the process 300 may set several thresholds EAIT[n]1, EAIT[n]2, . . . , EAIT[n]m, where m is larger than 1; the pixel difference flag width then needs to be increased to cover the larger number of values. For example, where three tolerances are set such that EAIT[n]1<EAIT[n]2<EAIT[n]3, the pixel difference flag width needs to be 2 bits. If the absolute value of the pixel difference<EAIT[n]1, the corresponding pixel difference flag is set to 3. If EAIT[n]1≦the absolute value of the pixel difference<EAIT[n]2, the pixel difference flag is 2. If EAIT[n]2≦the absolute value of the pixel difference<EAIT[n]3, the pixel difference flag is 1. If the absolute value of the pixel difference≧EAIT[n]3, the pixel difference flag is set to 0. Although the pixel difference flag storage space is twice the original, it is still much less than the space needed to store the pixels themselves. More importantly, the pixel difference flag can represent the pixel difference information at a finer granularity. The weighted scores along the object edge can be better differentiated from the other scores, which helps detect the object edge direction correctly and reduces mistakes.
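A small sketch of the multi-tolerance flag of this alternative embodiment; with three tolerances it reproduces the 2-bit flag values 3, 2, 1 and 0 described above (the function name is ours):

def multilevel_flag(pixel_difference, eait_levels):
    """eait_levels: ascending tolerances EAIT[n]1 < EAIT[n]2 < ... < EAIT[n]m."""
    diff = abs(pixel_difference)
    for i, tolerance in enumerate(eait_levels):
        if diff < tolerance:
            return len(eait_levels) - i   # tighter tolerance satisfied -> larger flag value
    return 0                              # difference at or above the loosest tolerance

# With eait_levels = [16, 32, 48]: diff 10 -> 3, diff 20 -> 2, diff 40 -> 1, diff 60 -> 0.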
Due to the increase of the pixel difference flag value, the range of the weighted scores for each direction also increases. For example, if the flag width were increased from 1 bit to 2 bits, the range of the weighted scores would increase to three times the original, i.e., to 144 (from −48 to 96). AES[0], AES[1] and AES[2] and RES[0], RES[1] and RES[2] are also increased to three times their original values. Accordingly, the ratio between the weighting coefficients for the small slanted angle directions and those for the large slanted angle directions should not be reduced by more than 18 = 144/8.
Referring now to
The device 1402 comprises an edge detection module 1403, a pixel interpolation module 1404, and an EAI post-processing module 1405. The edge detection module 1403 is configured for performing edge detection for a target pixel in an interlaced video image received from the digital video storage 1401. The edge detection module 1403 is also configured for determining the direction of the object edge at the target pixel, and for selecting the most appropriate interpolation scheme based on the determined direction. The selected interpolation scheme information is passed to the pixel interpolation module 1404.
The pixel interpolation module 1404 is configured to generate target pixels by interpolating existing pixels using the selected interpolation scheme, thereby forming a de-interlaced video image. The de-interlaced image is sent to the EAI post-processing module 1405, which is configured to determine whether each generated target pixel has been properly processed based on one or more pre-defined tolerances. If necessary, an improperly interpolated target pixel (i.e., a bad point) is replaced using the vertical column interpolation scheme.
The edge detection module 1403 further comprises an EAI pre-processing module 1406, an edge filter group 1407 and an edge direction arbiter 1408. The EAI pre-processing module 1406 is configured for storing the pre-determined slanted direction modes Ln/Rn and the tolerances EAIT[n]1 to EAIT[n]m for each of the n modes, where m and n are integers greater than or equal to 1; for calculating the pixel difference of every pre-determined slanted direction mode; for storing the pixel difference flags obtained by comparing the absolute values of the pixel differences with the tolerances EAIT[n]1 to EAIT[n]m; and for sending all of the pixel difference flags of the interlaced video image to the edge filter group 1407.
The edge filter group 1407 comprises K (K=2n+1) pairs of structurally symmetric edge filters. Each of the edge filters corresponds to one slanted direction mode (Ln/Rn). The function of each edge filter is to select three rows of pixel difference flags in a 4-row by N-column area centered on the target pixel. The selected pixel difference flags are divided into two groups: one group follows the slanted direction mode, and the other is perpendicular or nearly perpendicular to the slanted direction mode. The edge filter group 1407 is also configured to assign a weighting coefficient to each selected pixel difference flag. The value of the weighting coefficient is larger when the location of the existing pixel is closer to the target pixel. The weighting coefficients in the parallel direction are positive, and those in the perpendicular or nearly perpendicular direction are negative. The upper limit of the weighted score is equal or nearly equal for each slanted direction mode. Each edge filter calculates the weighted scores of the selected three rows of pixel difference flags: the total of all three rows, of the upper two rows, and of the lower two rows. Finally, the edge filter group 1407 is configured to send the 3K weighted scores of the K pairs of edge filters to the edge direction arbiter 1408.
The edge direction arbiter 1408 is configured for determining, based on the weighted scores received from the edge filter group 1407, whether a slanted edge passes through the target pixel and, if so, its direction. The edge direction arbiter 1408 is also configured for determining which one of the interpolation schemes to use. That information (i.e., the serial number of the interpolation scheme) is sent to the pixel interpolation module 1404.
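As an illustration of how the modules 1403 to 1405 could be composed in software, the sketch below mirrors the module names; the trivial method bodies are placeholders of our own rather than the modules' actual logic.

class EdgeDetectionModule:                       # corresponds to module 1403
    def select_scheme(self, upper, lower, x):
        # Placeholder: a full version runs the pre-processing, filter group and arbiter chain.
        return 0                                 # 0 = vertical-column interpolation

class PixelInterpolationModule:                  # corresponds to module 1404
    def generate(self, upper, lower, x, scheme):
        return (upper[x] + lower[x]) // 2        # vertical average stands in for all schemes

class EAIPostProcessingModule:                   # corresponds to module 1405
    def __init__(self, post_range=16):
        self.post_range = post_range
    def check(self, value, up, down):
        p_max, p_min = max(up, down), min(up, down)
        if value > p_max + self.post_range or value < p_min - self.post_range:
            return (up + down) // 2              # replace the bad point
        return value

class DeinterlacingDevice:                       # corresponds to device 1402
    def __init__(self):
        self.edge = EdgeDetectionModule()
        self.interp = PixelInterpolationModule()
        self.post = EAIPostProcessingModule()
    def fill_row(self, upper, lower):
        row = []
        for x in range(len(upper)):
            scheme = self.edge.select_scheme(upper, lower, x)
            value = self.interp.generate(upper, lower, x, scheme)
            row.append(self.post.check(value, upper[x], lower[x]))
        return row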
The present invention has been described in sufficient detail with a certain degree of particularity. It is understood by those skilled in the art that the present disclosure of embodiments has been made by way of example only and that numerous changes in the arrangement and combination of parts may be resorted to without departing from the spirit and scope of the invention as claimed. Accordingly, the scope of the present invention is defined by the appended claims rather than by the foregoing description of embodiments.
Foreign Application Priority Data: Application No. 200610098849.2, filed Jul. 2006, China (CN), national.