OBJECTIVE PERCEPTUAL VIDEO QUALITY EVALUATION APPARATUS

Description

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram showing a schematic configuration of an objective perceptual video quality evaluation apparatus according to an embodiment of the present invention;

FIG. 2 is a schematic diagram explaining a method of calculating a block DC difference;

FIG. 3 is a graph showing an example of a linear characteristic;

FIG. 4 is a graph showing a sigmoid function that is another example of the scaling function;

FIG. 5 is a schematic diagram showing definition of PSNR local degradation;

FIG. 6 is a graph showing a characteristic of objective evaluation index to subjective evaluation index for every frame rate;

FIG. 7 is a graph showing a method of correcting frame rate of the characteristic of objective evaluation index to subjective evaluation index;

FIG. 8 is a graph showing a characteristic of objective evaluation index to subjective evaluation index after frame rate correction;

FIG. 9 is a graph showing a regression curve for every frame rate set; and

FIG. 10 is a graph showing the characteristic of objective evaluation index to subjective evaluation index after the frame rate is corrected so that data sets of frame rates are on the same line.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Preferred embodiments of the present invention will be described hereinafter in detail with reference to the accompanying drawings. FIG. 1 is a block diagram of an automatic objective perceptual video quality evaluation apparatus according to an embodiment of the present invention. The automatic objective perceptual video quality evaluation apparatus receives two types of video signals and analyzes them, finally outputting a subjective video quality estimated value. In the following, an image corresponding to a video signal before being subjected to video-transmission-related image processing is denoted by “original video x”, and an image corresponding to a received transmission image, whose subjective quality is to be evaluated according to the present invention, is denoted by “evaluated video y”.

As shown in FIG. 1, the automatic objective perceptual video quality evaluation apparatus according to the embodiment is configured to include a feature amount extracting unit 1, a weighted sum calculating unit (or an objective perceptual video quality index calculating unit) 2, a frame rate detecting unit 3, a frame rate-specific correcting unit 4, and an objective evaluation index-subjective video quality mapping unit (or a subjective perceptual video quality estimated value deriving unit) 5. The feature amount extracting unit 1 is divided into functional units of a block distortion degree calculating unit 11, a PSNR (peak signal-to-noise ratio) overall temporal fluctuation degree calculating unit 12, and a PSNR local temporal fluctuation degree calculating unit 13.

A configuration or function of each of the constituent elements of the automatic objective perceptual video quality evaluation apparatus according to the embodiment will be described in detail.

The feature amount extracting unit 1 extracts three video feature amounts necessary to derive a subjective video quality, that is, a block distortion degree P₁, a PSNR overall temporal fluctuation degree P₂, and a PSNR local temporal fluctuation degree P₃. A method of deriving each of the video feature amounts will be described.

1. Block Distortion Degree P₁

The block distortion degree calculating unit 11 calculates an intra-frame average dDC(f) of a DC difference between a pixel block 21 of an arbitrary size shown in FIG. 2 (8×8 pixel block in FIG. 2) and four adjacent blocks (a neighboring pixel block 25 on the right, a pixel block 22 on the lower left, a pixel block 23 under the pixel block 21, and a pixel block 24 on the lower right) for each of the original video x and the evaluated video y. Further, the block distortion degree calculating unit 11 calculates, for each frame, a difference between the intra-frame averages dDC(f) for the original video x and the evaluated video y, calculates a difference between an intra-sequence maximum value and an intra-sequence minimum value of that per-frame difference, and defines the result as the block distortion degree P₁. In the present specification, the term “sequence” means the entirety of the original video x or evaluated video y used for a video quality evaluation, and the video quality evaluation is generally made on video of 5 to 15 seconds.

P₁=max{dDC_Ref(f)−dDC_Cod(f)}−min{dDC_Ref(f)−dDC_Cod(f)}

In the equation, dDC_Ref(f) denotes the intra-frame average of the DC difference for the original video x, and dDC_Cod(f) denotes the intra-frame average of the DC difference for the evaluated video y. In the example shown in FIG. 2, the intra-frame average of the DC difference dDC(f) can be represented by the following Equation (1). In the Equation (1), N_B denotes the total number of pixel blocks in a frame.

$\begin{matrix} dDC (f) = \sum_{b \in frame} \sum_{i \in Adj (b)} \left| DC (b) - DC (i) \right| / 4 / N_{B} & Equation (1) \end{matrix}$
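The calculation of Equation (1) and of the block distortion degree P₁ can be sketched as follows. This is only an illustrative sketch under stated assumptions: frames are taken to be lists of rows of luminance values, the DC value of a block is taken to be the mean of its pixels, the bracketed term of Equation (1) is read as an absolute difference, and neighboring blocks that fall outside the frame are simply skipped. The names `block_dc`, `ddc`, and `block_distortion` are hypothetical and do not appear in the specification.

```python
# Offsets (d_row, d_col) of the four adjacent blocks named in the text:
# right, lower left, below, lower right.
NEIGHBOURS = [(0, 1), (1, -1), (1, 0), (1, 1)]


def block_dc(frame, size=8):
    """Return a 2-D list holding the mean pixel value of each size x size block."""
    rows, cols = len(frame) // size, len(frame[0]) // size
    return [[sum(frame[by * size + y][bx * size + x]
                 for y in range(size) for x in range(size)) / (size * size)
             for bx in range(cols)]
            for by in range(rows)]


def ddc(frame, size=8):
    """Intra-frame average DC difference dDC(f), following Equation (1)."""
    dc = block_dc(frame, size)
    rows, cols = len(dc), len(dc[0])
    total = 0.0
    for by in range(rows):
        for bx in range(cols):
            for dy, dx in NEIGHBOURS:
                ny, nx = by + dy, bx + dx
                # Assumption: neighbours outside the frame are skipped.
                if 0 <= ny < rows and 0 <= nx < cols:
                    total += abs(dc[by][bx] - dc[ny][nx])
    return total / 4 / (rows * cols)


def block_distortion(ref_frames, cod_frames, size=8):
    """P1: intra-sequence max minus min of dDC_Ref(f) - dDC_Cod(f)."""
    diffs = [ddc(r, size) - ddc(c, size) for r, c in zip(ref_frames, cod_frames)]
    return max(diffs) - min(diffs)
```

In this sketch a sequence whose evaluated frames all show the same amount of blockiness yields P₁=0, since only the spread of the per-frame difference over the sequence contributes.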

2. PSNR Overall Temporal Fluctuation Degree P₂

The PSNR overall temporal fluctuation degree P₂ is calculated using a maximum value, a minimum value, and an average value of an intra-sequence power error MSE (mean square error).

First, the maximum value, the minimum value, and the average value of the MSE between the original video x and the evaluated video y are defined. If the maximum value, the minimum value, and the average value of the MSE are denoted by e_max, e_min, and e_ave, respectively, they are defined as represented by the following Equation (2).

$\begin{matrix} \begin{matrix} MSE (f) = \sum_{n \in frame} \{ x_{Ref} (f, n) - x_{Cod} (f, n) \}^{2} / N_{P} \\ e_{\min} = \min \{ MSE (f) \mid f \in sequence \} \\ e_{\max} = \max \{ MSE (f) \mid f \in sequence \} \\ e_{ave} = \sum_{f \in sequence} MSE (f) / N_{F} \end{matrix} & Equation (2) \end{matrix}$

In the Equation (2), x(f, n) denotes a signal value of an n-th pixel in the frame f, N_P denotes the number of pixels in the frame, and N_F denotes the number of frames in a sequence. For example, if a video quality for ten seconds in which frames are updated 15 times per second is to be evaluated, the number of frames in the sequence is 150. If the sequence of the original video x and that of the evaluated video y differ in frame rate, then corresponding frames are detected by means such as frame matching means, and a PSNR between the corresponding frames is derived.
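The quantities of Equation (2) can be sketched as follows. The sketch assumes that the frames of the original video and the evaluated video have already been paired (for example by the frame matching means mentioned above) and that each frame is a list of rows of pixel values of identical size. The function names `mse` and `mse_stats` are hypothetical.

```python
def mse(ref_frame, cod_frame):
    """MSE(f): mean squared pixel difference between two equally sized frames."""
    ref = [p for row in ref_frame for p in row]
    cod = [p for row in cod_frame for p in row]
    return sum((a - b) ** 2 for a, b in zip(ref, cod)) / len(ref)


def mse_stats(ref_frames, cod_frames):
    """Return (e_min, e_max, e_ave) over the sequence, as in Equation (2)."""
    errors = [mse(r, c) for r, c in zip(ref_frames, cod_frames)]
    return min(errors), max(errors), sum(errors) / len(errors)
```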

Next, the PSNR overall temporal fluctuation degree P₂ based on the maximum value e_max, the minimum value e_min, and the average value e_ave is calculated. As stated, the PSNR is significant information for estimating the subjective video quality. However, it is confirmed that the correlation between the objective video quality index and the subjective video quality tends to decrease if only the intra-sequence average value is used while the video quality has great temporal fluctuation in the sequence. Therefore, the PSNR overall temporal fluctuation degree P₂ is defined as represented by the following Equation (3) according to deviations of the maximum value e_max and the minimum value e_min from the average value e_ave of the intra-sequence power error.

$\begin{matrix} P_{2} = \log \left( \frac{e_{\max} - e_{ave}}{e_{ave} - e_{\min}} \right) \times f (e_{ave}) & Equation (3) \end{matrix}$

In the Equation (3), f(e_ave) denotes a scaling function for changing a value according to the average value e_ave of the intra-sequence MSE. As the scaling function f(e_ave), an arbitrary function that monotonically increases over the entire range of the average value e_ave (which is, however, effectively the range e_ave>0 according to the definition of e_ave) is available. Examples of the scaling function f(e_ave) include the following functions.

Linear Characteristic Function

The linear characteristic function is defined as f(e_ave)=e_ave. Its linear characteristic is shown in FIG. 3.

Sigmoid Function

The sigmoid function has a characteristic of saturating in a high e_ave part and a low e_ave part. The sigmoid function is defined as represented by the following Equation (4).

$\begin{matrix} f (e_{ave}) = \frac{b_{1}}{1 + e^{- b_{2} (e_{ave} - b_{3})}} + b_{4} & Equation (4) \end{matrix}$

The sigmoid function has a characteristic shown in FIG. 4. In FIG. 4, b₁=10, b₂=1, b₃=25, and b₄=10.

As can be seen from the property that the function f(e_ave) monotonically increases, the following effect can be produced according to a term of the function f(e_ave). If the average value e_ave is small, that is, the average MSE is small and the video quality of the evaluated video is high, the PSNR overall temporal fluctuation degree P₂ is decreased. If the average value e_ave is large and the video quality of the evaluated video is low, the PSNR overall temporal fluctuation degree P₂ is increased. Furthermore, if the sigmoid function is used as the scaling function f(e_ave), the property of saturating to certain values in regions on both ends shown in FIG. 4 can be added to the linear characteristic of the scaling function. The property may be regarded as a characteristic that reflects the saturation characteristic of human visual perception.
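Equations (3) and (4) can be sketched together as follows. The sketch assumes a natural logarithm (the base is not stated in the specification) and rejects sequences in which e_ave coincides with e_min or e_max, for which the ratio in Equation (3) is undefined; both choices are assumptions made for illustration. The default sigmoid parameters are those given for FIG. 4, and the function names are hypothetical.

```python
import math


def linear_scale(e_ave):
    """Linear characteristic function f(e_ave) = e_ave."""
    return e_ave


def sigmoid_scale(e_ave, b1=10.0, b2=1.0, b3=25.0, b4=10.0):
    """Sigmoid scaling function of Equation (4), FIG. 4 parameters by default."""
    return b1 / (1.0 + math.exp(-b2 * (e_ave - b3))) + b4


def overall_fluctuation(e_min, e_max, e_ave, scale=linear_scale):
    """P2 per Equation (3), using the natural logarithm (an assumption)."""
    if not e_min < e_ave < e_max:
        # Assumption: a constant-MSE sequence is treated as an error here.
        raise ValueError("Equation (3) needs e_min < e_ave < e_max")
    return math.log((e_max - e_ave) / (e_ave - e_min)) * scale(e_ave)
```

With this sketch, a sequence whose maximum and minimum MSE lie equally far from the average gives P₂=0, and P₂ becomes positive when the maximum deviates further from the average than the minimum does.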

3. PSNR Local Temporal Fluctuation Degree P₃

The low rate coding intended for multimedia applications tends to generate temporally local degradations in PSNR resulting from key frame insertion, scene change, occurrence of a sudden motion or the like. Due to this, degradations in the subjective video quality caused by these local degradations are detected based on the PSNR local temporal fluctuation degree P₃.

As shown in FIG. 5, a V-shaped temporal change dPSNR(f) of the PSNR is calculated, and the PSNR local temporal fluctuation degree P₃ is derived from the dPSNR(f). Specifically, if an index (a frame number) of a frame of interest is f, an absolute value of a difference between an average PSNR of the frames f−1 and f+1 before and after the frame f and the PSNR value of the frame f is defined as the dPSNR(f). A maximum value of the dPSNR(f) in the sequence is calculated and the calculated maximum value is defined as the PSNR local temporal fluctuation degree P₃.

P₃=max{dPSNR(f)|f ∈ sequence} Equation (5)

The PSNR local temporal fluctuation degree P₃ may be multiplied by a scaling function for changing a value according to the MSE of the frame f. As this scaling function, an arbitrary function that monotonically decreases according to the MSE is applicable.
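The derivation of dPSNR(f) and Equation (5) can be sketched as follows. The sketch assumes 8-bit video (peak value 255) when converting an MSE to a PSNR, evaluates dPSNR(f) only for frames that have both a preceding and a following frame, and omits the optional scaling function. The function names are hypothetical.

```python
import math


def psnr(mse_value, peak=255.0):
    """PSNR in dB for a non-zero MSE; peak=255 assumes 8-bit samples."""
    return 10.0 * math.log10(peak * peak / mse_value)


def local_fluctuation(psnr_values):
    """P3 per Equation (5): largest dPSNR(f) over the interior frames.

    dPSNR(f) is the absolute difference between the mean PSNR of frames
    f-1 and f+1 and the PSNR of frame f. At least three frames are needed.
    """
    d_psnr = [abs((psnr_values[f - 1] + psnr_values[f + 1]) / 2.0 - psnr_values[f])
              for f in range(1, len(psnr_values) - 1)]
    return max(d_psnr)
```

For a PSNR series of 30, 30, 24, 30, 30 dB, the single degraded frame gives dPSNR values of 3, 6, and 3 dB, so this sketch returns 6 dB.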

An objective evaluation index Q_obj is defined as represented by the following equation using a weighted sum of the above-stated objective evaluation measures P₁, P₂, and P₃.

Q_obj=αP₁+βP₂+γP₃

In the equation, symbols α, β, and γ denote weight parameters. The weight parameters α, β, and γ are selected so that an estimated error of the objective video quality from the subjective video quality becomes minimum when the objective evaluation index Q_obj is subjected to conversion processing by the frame rate-specific correcting unit 4 and the objective evaluation index-subjective video quality mapping unit 5. For example, the weight parameters α, β, and γ can be respectively set to 0.2, 0.4, and 0.004 (α=0.2, β=0.4, and γ=0.004). The weight parameters α, β, and γ may be negative numbers.
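The weighted sum can be sketched as follows, using the example weights given above as defaults. The function name `objective_index` is hypothetical, and in practice the weights would be fitted to subjective test data as described.

```python
def objective_index(p1, p2, p3, alpha=0.2, beta=0.4, gamma=0.004):
    """Q_obj as the weighted sum of the three feature amounts P1, P2, P3."""
    return alpha * p1 + beta * p2 + gamma * p3
```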

The frame rate detecting unit 3 analyzes a video signal of the evaluated video y and outputs its frame rate. According to the present invention, it is premised that the frame rate of the original video x is equal to or higher than that of the evaluated video y. Due to this, the frame rate detecting unit 3 detects the frame rate of the evaluated video y, which is the lower of the two.

The frame rate detecting unit 3 outputs the detected frame rate to the frame rate-specific correcting unit 4.

If a correlation between the objective evaluation index Q_obj output from the weighted sum calculating unit 2 and the subjective video quality (DMOS) is obtained, the correlation often differs in characteristics among frame rates a, b, c, etc. as shown in FIG. 6. The automatic objective perceptual video quality evaluation apparatus according to the embodiment is required to output a stable evaluation value without relying on the frame rate. Therefore, the frame rate-specific correcting unit 4 absorbs the difference among the frame rates in the characteristic of the subjective evaluation measure to the objective evaluation measure using a correction characteristic (see FIG. 1), and corrects the evaluation value to an objective evaluation value irrespective of the frame rates. The correction method will be described below.

As shown in FIG. 7, a pair consisting of a subjective evaluation value of video at the frame rate b and an objective evaluation value approximated to that subjective evaluation value is set as (Qb, DMOSb). This pair is moved onto a characteristic line y=c₀×Q_obj+c₁ of the frame rate a, and the objective evaluation value is calculated based on that line. In this case, an objective evaluation value Q_a is calculated so as to give the subjective evaluation value DMOSb on the characteristic line of the frame rate a. The objective evaluation value Q_a thus obtained is regarded as the corrected objective evaluation value. Namely, a relationship represented by the following equation is obtained.

DMOSb=c₀×Q_a+c₁

The corrected objective evaluation value Q_a is represented by the following equation.

Q_a=(DMOSb−c₁)/c₀
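The correction can be sketched as follows. The sketch assumes that the characteristic of each frame rate is available as a straight line given by a (slope, intercept) pair, so that DMOSb is obtained from the uncorrected index Qb on the line of the frame rate b and then solved for Q_a on the line of the frame rate a. The per-rate line for the frame rate b, the function name `corrected_index`, and the coefficient values in the usage line are assumptions for illustration only.

```python
def corrected_index(q_b, line_b, line_a):
    """Move (Qb, DMOSb) onto the characteristic line of frame rate a.

    line_b and line_a are (slope, intercept) pairs; line_a is (c0, c1).
    Returns Q_a such that c0 * Q_a + c1 equals DMOSb.
    """
    d0, d1 = line_b
    c0, c1 = line_a
    if c0 == 0:
        raise ValueError("the reference line must have a non-zero slope")
    dmos_b = d0 * q_b + d1       # subjective value predicted at frame rate b
    return (dmos_b - c1) / c0    # index giving the same value on line a


# Hypothetical coefficients: line b is twice as steep as line a.
q_a = corrected_index(3.0, line_b=(4.0, 1.0), line_a=(2.0, 1.0))
```

When the two lines coincide, the index is returned unchanged, which matches the intent that the frame rate a serves as the reference characteristic.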

Finally, if the relationship between the objective evaluation index Q_obj and the subjective evaluation measure DMOS after the frame rate-specific correction is calculated using many samples, the relationship is as shown in, for example, FIG. 8. FIG. 8 is a graph showing the relationship between the objective evaluation index Q_obj and the subjective evaluation measure DMOS calculated using evaluated videos at frame rates of 3 fps (frames per second), 5 fps, 10 fps, and 15 fps. As is obvious from FIG. 8, the relationship between the objective evaluation index Q_obj and the subjective evaluation measure DMOS can be approximated by a polynomial function.

However, if these pieces of data are classified according to the frame rates, it is understood that data sets are irregular among the frame rates. Therefore, as shown in FIG. 9, if a regression curve is obtained for every frame rate, it is understood that regression curves differ in inclination among the frame rates and that data irregularity thereby occurs. Accordingly, the objective evaluation index Q_obj is corrected so that the data sets on all the frame rates are on the same line.

FIG. 10 shows the Q_obj-DMOS characteristic after the correction stated above. The relationship between the objective evaluation index Q_obj and the subjective evaluation measure DMOS shown in FIG. 10 can be approximated by, for example, a polynomial function represented by the following equation.

DMOS=−0.0035x³+0.1776x²−2.8234x+14.379 (where x=Q_obj)

Therefore, this polynomial function is stored in the objective evaluation index-subjective video quality mapping unit (or the subjective video quality estimated value deriving unit) 5 in advance. The corrected objective video quality index Q_obj is applied to the polynomial function, thereby deriving the subjective video quality estimated value. Namely, points on a solid-line curve shown in FIG. 10 indicate the estimated subjective video qualities corresponding to the objective video quality index Q_obj.
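The mapping performed by the unit 5 can be sketched as follows, using the example cubic polynomial given above and evaluating it by Horner's method. The function name `estimate_dmos` is hypothetical, and the coefficients are only the example values of this embodiment; a different data set would yield a different polynomial.

```python
# Example coefficients from the embodiment, highest degree first.
DMOS_COEFFS = (-0.0035, 0.1776, -2.8234, 14.379)


def estimate_dmos(q_obj, coeffs=DMOS_COEFFS):
    """Evaluate the stored polynomial at the corrected index Q_obj (Horner's method)."""
    result = 0.0
    for c in coeffs:
        result = result * q_obj + c
    return result
```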

As stated so far, according to the present invention, it is possible to estimate the objective video quality of video at low resolution and low frame rate, such as multimedia video, without relying on subjective human judgment.

Needless to say, the methods of deriving the block distortion degree, the PSNR overall temporal fluctuation degree, and the PSNR local temporal fluctuation degree executed by the feature amount extracting unit 1, and the method of calculating the weighted sum executed by the weighted sum calculating unit 2 are given only for illustrative purposes. Other deriving methods and other calculation methods can also be applied to the present invention.

Claims

1. An objective perceptual video quality evaluation apparatus for estimating a subjective video quality by analyzing two types of video signals of an original video and an evaluated video, comprising: a feature amount extracting unit for extracting a block distortion degree of the evaluated video relative to the original video, a PSNR overall temporal fluctuation degree for frames in a sequence, and a PSNR local temporal fluctuation degree for each of the frames as feature amounts; an objective video quality index calculating unit for calculating a weighted sum of the block distortion degree, the PSNR overall temporal fluctuation degree, and the PSNR local temporal fluctuation degree, and calculating an objective video quality index; frame rate detecting unit for detecting frame rate of the evaluated video; a correcting unit for correcting the objective video quality index calculated by the objective video quality index calculating unit based on the frame rate detected by the frame rate detecting unit; and a subjective video quality estimated value deriving unit for deriving a subjective video quality estimated value by applying the objective video quality index corrected by the correcting unit to a correlation between the subjective video quality index and the objective video quality given in advance.
2. The objective perceptual video quality evaluation apparatus according to claim 1, the objective video quality index calculating unit calculates an intra-frame average value of a DC difference among a plurality of pixel blocks of each of the original video and the evaluated video and pixel blocks adjacent to the pixel block of interest, calculates an intra-sequence maximum difference between the intra-frame average value of the DC difference for the original video and the intra-frame average value of the DC difference for the evaluated video as a maximum value of the DC difference, and calculates an intra-sequence minimum difference between the intra-frame average value of the DC difference for the original video and the intra-frame average value of the DC difference for the evaluated video as a minimum value of the DC difference, and obtains a difference between the maximum value of the DC difference and the minimum value of the DC difference as the block distortion degree.
3. The objective perceptual video quality evaluation apparatus according to claim 1, the PSNR overall temporal fluctuation degree is derived based on a ratio of an absolute value of a difference between a maximum value and an average value of an MSE of each of the frames in the sequence to an absolute value of a difference between a minimum value and the average value of the MSE of each of the frames in the sequence.
4. The objective perceptual video quality evaluation apparatus according to claim 3, the ratio of the absolute value of the difference between the maximum value and the average value of the MSE of each of the frames in the sequence to the absolute value of the difference between the minimum value and the average value of the MSE of each of the frames in the sequence is multiplied by a scaling function for changing a value according to an average value of the MSE in the sequence.
5. The objective perceptual video quality evaluation apparatus according to claim 1, the PSNR local temporal fluctuation degree is an intra-sequence maximum value of a PSNR value of the frame of interest and PSNR values of the adjacent frames before and after the frame of interest.
6. The objective perceptual video quality evaluation apparatus according to claim 5, the PSNR local temporal fluctuation degree is obtained by multiplying the intra-sequence maximum value of the PSNR difference between the adjacent frames by a scaling function for changing a value according to the MSE value of the frame of interest.

Priority Claims (1)

Number	Date	Country	Kind
2006-208091	Jul 2006	JP	national
