The system and method of the present invention is directed to a unified metric for controlling digital video post-processing where the metric reflects local picture quality of an MPEG encoded video. More particularly, the system and method of the invention provides a metric that can be used to direct a post-processing system in how much to enhance a pixel or how much to reduce the artifact, thereby achieving optimum quality of the final post-processed result.
Compressed digital video sources have come into modern households through digital terrestrial broadcast, digital cable/satellite, PVR (Personal Video Recorder), DVD, etc. The emerging digital video products are bringing revolutionary experiences to consumers. At the same time, they are also creating new challenges for video processing functions. For example, low bit rates are often chosen to achieve bandwidth efficiency. The lower the bit rates, the more objectionable become the impairments introduced by the compression encoding and decoding processing.
For digital terrestrial television broadcasting of standard-definition video, a bit rate of around 6 Mbit/s is considered a good compromise between picture quality and transmission bandwidth efficiency; see P. N. Tudor, “MPEG-2 Video Compression,” IEEE Electronics & Communication Engineering Journal, December 1995, pp. 257-264. However, broadcasters sometimes choose bit rates far lower than 6 Mbit/s to fit more programs per multiplex. Meanwhile, many processing functions fail to take the digital compression into account and, as a result, may perform sub-optimally on compressed digital video.
MPEG-2 has been widely adopted as a digital video compression standard, and is the basis of new digital television services. Metrics for directing individual MPEG-2 post-processing techniques have been developed. For example, in Y. Yang and L. Boroczky, “A New Enhancement Method for Digital Video Applications”, IEEE Transactions on Consumer Electronics, Vol. 48, No. 3, August 2002, pp. 435-443, the entire contents of which are hereby incorporated by reference as if fully set forth herein, the inventors define a usefulness metric (UME: Usefulness Metric for Enhancement) for improving the performance of sharpness enhancement algorithms for post-processing of decoded compressed digital video. However, a complete digital video post-processing system must include not only sharpness enhancement but also resolution enhancement and artifact reduction. UME's and other metrics' focus on sharpness enhancement alone limits their usefulness.
Picture quality is one of the most important aspects for digital video products (e.g., DTV, DVD, DVD recorder, etc.). These products receive and/or store video resources in MPEG-2 format. The MPEG-2 compression standard employs a block-based DCT transform and is a lossy compression that can result in coding artifacts that reduce picture quality. The most common and visible of these coding artifacts are blockiness and ringing. Among the video post-processing functions performed in these products, sharpness enhancement and MPEG-2 artifact reduction are the two key functions for quality improvement. It is extremely important for these two functions not to cancel out each other's effects. For instance, MPEG-2 blocking artifact reduction tends to blur the picture while sharpness enhancement makes the picture sharper. If the interaction between these two functions is ignored, sharpness enhancement may restore the very blocking artifacts that the earlier artifact reduction operation removed.
Blockiness manifests itself as visible discontinuities at block boundaries due to the independent coding of adjacent blocks. Ringing is most evident along high contrast edges in areas of generally smooth texture and appears as ripples extending outwards from the edge. Ringing is caused by abrupt truncation of high frequency DCT components, which play significant roles in the representation of an edge.
No current metric is designed to direct the joint application of enhancement and artifact reduction algorithms during post-processing.
Thus, there is a need for a metric which can be used to direct post-processing that effectively combines quality improvement functions so that total quality is increased and negative interactions are reduced. The system and method of the present invention provides a metric for directing the integration and optimization of a plurality of post-processing functions, such as, sharpness enhancement, resolution enhancement and artifact reduction. This metric is A Unified Metric for Digital Video Processing (UMDVP) that can be used to jointly control a plurality of post-processing techniques.
UMDVP is designed as a metric based on the MPEG-2 coding information.
UMDVP quantifies how much a pixel can be enhanced without boosting coding artifacts. In addition, UMDVP provides information about where artifact reduction functions should be carried out and how much reduction needs to be done. By way of example and not limitation, in a preferred embodiment, two coding parameters are used as a basis for UMDVP: the quantisation parameter (q_scale) and the number of bits spent to code a luminance block (num_bits). More specifically, num_bits is defined as the number of bits spent to code the AC coefficients of the DCT block. q_scale is the quantisation scale for each 16×16 macroblock and can be easily extracted from every bitstream. Furthermore, while decoding a bitstream, num_bits can be calculated for each 8×8 block with little computational cost. Thus, the overall overhead cost of collecting the coding information is negligible.
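By way of a hedged illustration (the array names and lookup scheme below are assumptions, not part of the source), collecting these two parameters amounts to a simple indexed lookup once the decoder has filled per-macroblock and per-block maps:

```python
# Hypothetical sketch: q_scale_map holds one quantisation scale per 16x16
# macroblock, num_bits_map one AC-coefficient bit count per 8x8 DCT block;
# both are assumed to have been filled in while decoding the bitstream.

def coding_params_for_pixel(i, j, q_scale_map, num_bits_map):
    """Return (q_scale, num_bits) for the pixel at row i, column j."""
    q_scale = q_scale_map[i // 16][j // 16]    # one q_scale per macroblock
    num_bits = num_bits_map[i // 8][j // 8]    # one num_bits per DCT block
    return q_scale, num_bits
```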
FIG. 1a illustrates a snapshot from a “Calendar” video sequence encoded at 4 Mbit/s.
FIG. 1b illustrates an enlargement of an area of FIG. 1a.
FIG. 2a illustrates a snapshot from a “Table-tennis” sequence encoded at 1.5 Mbit/s.
FIG. 2b illustrates an enlargement of an area of FIG. 2a.
FIG. 3a illustrates a horizontal edge, according to an embodiment of the present invention.
FIG. 3b illustrates a vertical edge, according to an embodiment of the present invention.
FIGS. 3c and 3d illustrate diagonal edges for 45 and 135 degrees, according to an embodiment of the present invention.
The relationship between picture quality of compressed digital video sources and coding information is well known, i.e., picture quality of a compressed digital video is directly affected by how it has been encoded. The UMDVP metric of the present invention is based on the MPEG-2 coding information and quantifies how much a pixel can be enhanced without boosting coding artifacts. In addition, it can also point out where artifact reduction functions should be carried out and how much reduction needs to be done.
1. Unified Metric for Digital Video Processing (UMDVP)
UMDVP uses coding information such as the quantisation parameter (q_scale) and the number of bits spent to code a luminance block (num_bits). q_scale is the quantisation scale for each 16×16 macroblock and is easily extracted from every bitstream, while num_bits can be calculated for each 8×8 block during decoding.
1.1 Quantisation Scale (q_scale)
MPEG schemes (MPEG-1, MPEG-2 and MPEG-4) use quantisation of the DCT coefficients as one of the compression steps. However, quantisation inevitably introduces errors. The representation of every 8×8 block can be considered a carefully balanced aggregate of the DCT basis images. A high quantisation error may therefore corrupt the contribution made by the high-frequency DCT basis images. Since the high-frequency basis images play a significant role in the representation of an edge, the reconstruction of the block will include high-frequency irregularities such as ringing artifacts.
The larger the value of q_scale, the higher the quantisation error. Therefore, UMDVP is designed to increase as q_scale decreases.
1.2 The Number of Bits to Code a Block (num_bits)
MPEG-2 uses a block-based coding technique with a block-size of 8 by 8. Generally, the fewer bits used to encode a block the more information of the block that is lost and the lower the quality of the reconstructed block. However, this quantity is also highly dependent on scene content, bit rate, frame type (such as I, P and B frames), motion estimation, and motion compensation.
For a non-smooth area, if num_bits becomes 0 for an intra-block, it implies that only the DC coefficient remains while all AC coefficients are absent. After decoding, blocking effects may exist around this region.
The smaller num_bits, the more likely coding artifacts exist. As a result, the UMDVP value is designed to decrease as num_bits decreases.
1.3 Local Spatial Feature
Picture quality in an MPEG-based system depends on both the available bit rate and the content of the program being shown. The two coding parameters, q_scale and num_bits, only reveal information about the bit rate. The present invention therefore defines another quantity to reflect the picture content: a local spatial feature, defined as an edge-dependent local variance, which is used in the definition of UMDVP.
1.3.1 Edge Detection
Before calculating this local variance at pixel (i,j), it must be determined whether pixel (i,j) belongs to an edge and, if it does, what the edge direction is. The present invention only considers three kinds of edges, as shown in FIGS. 3a-3d.
When pixel (i,j) belongs to a horizontal edge, the edge-dependent local variance is defined as:
When pixel (i,j) belongs to a vertical edge, the edge-dependent local variance is defined as:
When pixel(i,j) belongs to a diagonal edge, the edge-dependent local variance is defined as:
When pixel(i,j) does not belong to any of the aforementioned edges, the variance is defined as:
The edge-dependent local variance reflects the local scene content of the picture. This spatial feature is used in the present invention to adjust and refine the UMDVP metric.
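The patent's equations for the edge-dependent local variance are not reproduced above; as a minimal sketch under that caveat, one plausible reading computes a plain variance over a neighbourhood chosen along the detected edge direction (the neighbour sets below are assumptions, not the source's definitions):

```python
import numpy as np

# Hypothetical sketch of an edge-dependent local variance: the set of
# neighbours varies with the edge direction at pixel (i, j); interior
# pixels are assumed (no border handling).

def edge_local_variance(img, i, j, direction):
    """direction: 'h', 'v', 'd45', 'd135', or None (no edge)."""
    if direction == 'h':          # neighbours along the horizontal edge
        nbrs = img[i, j-1:j+2]
    elif direction == 'v':        # neighbours along the vertical edge
        nbrs = img[i-1:i+2, j]
    elif direction == 'd45':      # neighbours along the 45-degree diagonal
        nbrs = np.array([img[i+1, j-1], img[i, j], img[i-1, j+1]])
    elif direction == 'd135':     # neighbours along the 135-degree diagonal
        nbrs = np.array([img[i-1, j-1], img[i, j], img[i+1, j+1]])
    else:                         # no edge: use the full 3x3 neighbourhood
        nbrs = img[i-1:i+2, j-1:j+2].ravel()
    return float(np.var(nbrs))
```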
1.4 Definition of UMDVP
By way of example and not limitation, UMDVP can be defined based on observations of the two coding parameters (num_bits and q_scale), as the following function:
where Q_OFFSET is an experimentally determined value. By way of example and not limitation, Q_OFFSET can be determined by analyzing the bitstream while taking quality objectives into account. A value of 3 is used for Q_OFFSET in a preferred embodiment of the present invention. The UMDVP value is limited to the range [−1,1]. If num_bits equals 0, UMDVP is set to 0. Taking the local spatial feature into account, the UMDVP value is further adjusted as follows:
UMDVP = UMDVP + 1, if ((UMDVP < 0) AND (var > VAR_THRED))  (10)
where VAR_THRED is an empirically determined threshold. By way of example and not limitation, VAR_THRED can be determined by analyzing the bitstream while taking quality objectives into consideration.
The value of UMDVP is further refined by the edge-dependent local variance:
Here again, the UMDVP value is limited to the range between −1 and 1, inclusive. A value of 1 for UMDVP means that sharpness enhancement can be applied freely to a particular pixel, while a value of −1 means that the pixel cannot be enhanced and artifact reduction operations are needed.
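A hedged sketch of the adjustment rules just described (the VAR_THRED value is a placeholder, and the final refinement by the edge-dependent local variance is omitted because its equation is not reproduced here):

```python
# Sketch of the UMDVP adjustment: clip to [-1, 1], zero when num_bits is 0,
# and apply Eq. (10). VAR_THRED = 50.0 is an illustrative placeholder for
# the empirically determined threshold.

def adjust_umdvp(umdvp, num_bits, var, var_thred=50.0):
    if num_bits == 0:
        return 0.0                              # no AC bits: UMDVP set to 0
    umdvp = max(-1.0, min(1.0, umdvp))          # limit to the range [-1, 1]
    if umdvp < 0 and var > var_thred:           # Eq. (10)
        umdvp += 1.0
    return max(-1.0, min(1.0, umdvp))
```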
2. UMDVP Calculation For MPEG-2 Video
The UMDVP metric is calculated differently depending on whether the frame is an I-frame, P-frame or B-frame. Motion estimation is employed to ensure temporal consistency of the UMDVP, which is essential to achieve temporal consistency of enhancement and artifact reduction. Dramatic scene change detection is also employed to further improve the performance of the algorithm. The system diagram of the UMDVP calculation for MPEG-2 video is illustrated in
2.1 Motion Estimation (55)
By way of example and not limitation, an embodiment of the present invention employs a 3D recursive motion estimation model described in Gerard de Haan et al., “True-Motion Estimation with 3-D Recursive Search Block Matching”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 3, No. 5, October 1993, pp. 368-379, the entire contents of which are hereby incorporated by reference as if fully set forth herein. Compared with a block-based full-search technique, this 3D model dramatically reduces the computational complexity while improving the consistency of motion vectors.
2.2 Scene Change Detection (53)
Scene change detection is an important step in the calculation of the UMDVP metric, as a forced temporal consistency between different scenes can result in picture quality degradation, especially if dramatic scene change occurs.
The goal of scene change detection is to detect the content change of consecutive frames in a video sequence. Accurate scene change detection can improve the performance of video processing algorithms. For instance, it is used by video enhancement algorithms to adjust parameters for different scene content. Scene change detection is also useful in video compression algorithms.
Any known scene change detection method can be used. By way of example and not limitation, in a preferred embodiment, a histogram of the differences between consecutive frames is examined to determine if a majority of the difference values exceed a predetermined value.
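As a non-authoritative sketch of this preferred embodiment (the thresholds are illustrative, and the explicit histogram of differences is replaced by an equivalent fraction-above-threshold test):

```python
import numpy as np

# Hypothetical sketch: declare a scene change when a majority of the absolute
# pixel differences between consecutive frames exceed a predetermined value.
# diff_thresh and majority are illustrative, not values from the source.

def is_scene_change(prev, curr, diff_thresh=30, majority=0.5):
    diffs = np.abs(curr.astype(int) - prev.astype(int))
    # fraction of difference values exceeding the predetermined value
    return float(np.mean(diffs > diff_thresh)) > majority
```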
2.3 UMDVP Calculation for I, P and B Frames (54) & (56)
The interpolation scheme is illustrated in
UMDVP = (1 − β) × ((1 − α) × UMDVP1 + α × UMDVP3) + β × ((1 − α) × UMDVP2 + α × UMDVP4)  (12)
At step 65, the value of the UMDVP metric is adjusted based on the value calculated at step 61 (or the interpolated value) and the value of the UMDVP metric at the location pointed to by (v′,h′) in the previous frame. In a preferred embodiment, R1 is set to 0.7 to put more weight on the calculated value of the UMDVP metric:
UMDVP = R1 × UMDVP + (1 − R1) × UMDVP_prev(v′,h′)  (13)
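Equations (12) and (13) can be transcribed directly; here UMDVP1..UMDVP4 are the four neighbouring UMDVP values used by the interpolation and α, β the interpolation weights:

```python
# Direct transcription of Eqs. (12) and (13).

def umdvp_interpolate(u1, u2, u3, u4, alpha, beta):
    """Bilinear interpolation of the UMDVP value, Eq. (12)."""
    return ((1 - beta) * ((1 - alpha) * u1 + alpha * u3)
            + beta * ((1 - alpha) * u2 + alpha * u4))

def umdvp_temporal(umdvp, umdvp_prev, r1=0.7):
    """Temporal adjustment, Eq. (13); R1 = 0.7 per the preferred embodiment."""
    return r1 * umdvp + (1 - r1) * umdvp_prev
```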
The final block “UMDVP refinement” 58 in
The UMDVP memory 57 is used to store intermediate results.
2.4 UMDVP Scaling
If the video processing algorithm runs not at the original resolution but at some higher resolution, scaling functions are needed to align the UMDVP map with the new resolution. Vertical and horizontal scaling functions may be required for UMDVP alignment.
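The source does not reproduce the scaling functions themselves; a minimal sketch, assuming plain separable linear interpolation of the UMDVP map (the patent's own vertical and horizontal scaling functions may differ), would be:

```python
import numpy as np

# Hypothetical sketch: resample a UMDVP map to the processing resolution
# with a vertical pass followed by a horizontal pass of linear interpolation.

def scale_umdvp_map(umap, out_h, out_w):
    in_h, in_w = umap.shape
    rows = np.linspace(0, in_h - 1, out_h)
    cols = np.linspace(0, in_w - 1, out_w)
    # vertical scaling pass
    r0 = np.floor(rows).astype(int)
    r1 = np.minimum(r0 + 1, in_h - 1)
    wr = (rows - r0)[:, None]
    tmp = (1 - wr) * umap[r0, :] + wr * umap[r1, :]
    # horizontal scaling pass
    c0 = np.floor(cols).astype(int)
    c1 = np.minimum(c0 + 1, in_w - 1)
    wc = (cols - c0)[None, :]
    return (1 - wc) * tmp[:, c0] + wc * tmp[:, c1]
```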
2.4.1 Vertical Scaling
In
2.4.2 Horizontal Scaling
In
3. Sharpness Enhancement Using UMDVP for MPEG-2 Encoded Video
By way of example and not limitation, sharpness enhancement algorithms attempt to increase the subjective perception of sharpness for a picture. However, the MPEG-2 encoding process may introduce coding artifacts. If an algorithm does not take the coding information into account, it may boost the coding artifacts.
By contrast, by using the UMDVP metric it is possible to instruct an enhancement algorithm as to how much to enhance the picture without boosting artifacts.
3.1 System Diagram
3.2 Sharpness Enhancement
Sharpness enhancement techniques include peaking and transient improvement. Peaking is a linear operation that, in a preferred embodiment, exploits the well-known “Mach band” effect to improve the sharpness impression. Transient improvement, e.g., luminance transient improvement (LTI), is a well-known non-linear approach that modifies the gradient of edges to enhance sharpness.
3.2.1 Integration of the UMDVP Metric and Peaking Algorithms
Peaking increases the amplitude of the high-band and/or middle-band frequencies using linear filtering methods, usually one or several FIR filters.
A straightforward method of applying the UMDVP metric 130 to peaking algorithms is to use the UMDVP metric to control how much enhancement is added to the original signal.
When the value of the UMDVP metric is larger than 0.3, it is increased by 0.5. The assumption here is that if the value of the UMDVP metric is above some threshold (0.3 in this case), the picture quality is good enough that sharpness enhancement need not be suppressed.
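A hedged sketch of this control rule (clipping the boosted value back to 1, and treating negative UMDVP as "no sharpening", are assumptions consistent with the [−1,1] range described earlier):

```python
# Sketch of UMDVP-controlled enhancement. The clip to 1.0 after the +0.5
# boost and the zero gain for negative UMDVP are assumptions.

def umdvp_gain(umdvp, boost_thresh=0.3, boost=0.5):
    """Sharpening gain derived from the UMDVP metric."""
    if umdvp > boost_thresh:                 # quality good enough: boost
        umdvp = min(1.0, umdvp + boost)
    return max(0.0, umdvp)                   # negative UMDVP: no sharpening

def apply_peaking(original, peaking_delta, umdvp):
    """Add a UMDVP-weighted fraction of the peaking signal to the original."""
    return original + umdvp_gain(umdvp) * peaking_delta
```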
A specific example of sharpness enhancement using the UMDVP metric follows.
By way of example and not limitation, the approach described in G. de Haan, Video Processing for Multimedia Systems, University Press, Eindhoven, The Netherlands, 2000, allows peaking at two parts of the signal spectrum, typically taken at a half and at a quarter of the sampling frequency.
Let f({right arrow over (x)}, n) be the luminance signal at pixel position {right arrow over (x)}=(x,y) in picture n. Using the z-transform, we can describe the peaked luminance signal fp({right arrow over (x)}, n), as:
where k1 141 and k2 142 are control parameters determining the amount of peaking at the middle and the highest possible frequencies, respectively.
To prevent noise degradation, a common remedy is to only boost the signal components if they exceed a pre-determined amplitude threshold. This technique is known as ‘coring’ 140 and can be seen as a modification of k1 and k2 in Eq.(15).
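A minimal sketch of UMDVP-controlled two-band peaking with coring, assuming generic FIR high-pass kernels in place of the filters of Eq. (15) (the kernels, gains k1 and k2, and the coring threshold below are all illustrative):

```python
import numpy as np

# Assumed stand-ins for the two peaking bands of Eq. (15): a kernel that
# peaks near a quarter of the sampling frequency and one near half of it.
K_MID = np.array([-1.0, 0.0, 2.0, 0.0, -1.0]) / 4.0
K_HIGH = np.array([-1.0, 2.0, -1.0]) / 4.0

def peaked_line(line, umdvp_line, k1=0.5, k2=0.5, coring_thresh=2.0):
    """Peak one luminance line; UMDVP controls the amount of boost."""
    line = np.asarray(line, dtype=float)
    mid = np.convolve(line, K_MID, mode='same')
    high = np.convolve(line, K_HIGH, mode='same')
    mid[np.abs(mid) < coring_thresh] = 0.0     # coring: drop small components
    high[np.abs(high) < coring_thresh] = 0.0
    gain = np.clip(umdvp_line, 0.0, 1.0)       # UMDVP limits the enhancement
    return line + gain * (k1 * mid + k2 * high)
```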
The peaking algorithm described above enhances the subjective perception of sharpness, but at the same time it can also enhance the coding artifacts. To prevent this problem, the UMDVP metric 150 can be used to control the peaking algorithm as shown in
Both enhancement and artifact reduction functions are required to achieve an overall optimum result for compressed digital video. The balance between enhancement and artifact reduction for digital video is analogous to the balance between enhancement and noise reduction for analog video. The optimization of the overall system is not trivial. However, UMDVP can be used both for enhancement algorithms and artifact reduction functions.
The methods and systems of the present invention, as described above and shown in the drawings, provide for a UMDVP metric to jointly control enhancement and artifact reduction of a digital coded video signal. It will be apparent to those skilled in the art that various modifications and variations can be made in the method and system of the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention includes modifications and variations that are within the scope of the appended claims and their equivalents.
Number | Date | Country | Kind
---|---|---|---
60432307 | Dec 2002 | US | national

Filing Document | Filing Date | Country | Kind | 371c Date
---|---|---|---|---
PCT/IB03/05717 | 12/4/2003 | WO |  | 6/9/2005