OBJECTIVE PERCEPTUAL VIDEO QUALITY EVALUATION APPARATUS

Abstract
A feature amount extracting unit extracts a block distortion degree of an evaluated video y relative to an original video x, a PSNR overall temporal fluctuation degree, and a PSNR local temporal fluctuation degree as feature amounts. A weighted sum calculating unit calculates a weighted sum of these feature amounts to obtain an objective video quality index. A frame rate detecting unit detects the frame rate of the evaluated video y. A correcting unit corrects the objective video quality index based on the frame rate detected by the frame rate detecting unit. An objective evaluation index-subjective video quality mapping unit applies the corrected objective video quality index Qobj to a correlation, given in advance, between an objective video quality index and a subjective video quality, thereby deriving a subjective video quality estimated value DMOS.
Description

BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a block diagram showing a schematic configuration of an objective perceptual video quality evaluation apparatus according to an embodiment of the present invention;



FIG. 2 is a schematic diagram explaining a method of calculating a block DC difference;



FIG. 3 is a graph showing an example of a linear characteristic;



FIG. 4 is a graph showing a sigmoid function that is another example of the scaling function characteristic;



FIG. 5 is a schematic diagram showing definition of PSNR local degradation;



FIG. 6 is a graph showing a characteristic of objective evaluation index to subjective evaluation index for every frame rate;



FIG. 7 is a graph showing a method of correcting frame rate of the characteristic of objective evaluation index to subjective evaluation index;



FIG. 8 is a graph showing a characteristic of objective evaluation index to subjective evaluation index after frame rate correction;



FIG. 9 is a graph showing a regression curve for every frame rate set; and



FIG. 10 is a graph showing the characteristic of objective evaluation index to subjective evaluation index after the frame rate is corrected so that data sets of frame rates are on the same line.





DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Preferred embodiments of the present invention will be described hereinafter in detail with reference to the accompanying drawings. FIG. 1 is a block diagram of an automatic objective perceptual video quality evaluation apparatus according to an embodiment of the present invention. The automatic objective perceptual video quality evaluation apparatus receives two types of video signals and analyzes them, finally outputting a subjective video quality estimated value. Hereinafter, the video corresponding to a video signal before being subjected to video-transmission-related image processing is denoted by “original video x”, and the video that is received after transmission and whose subjective quality is to be evaluated according to the present invention is denoted by “evaluated video y”.


As shown in FIG. 1, the automatic objective perceptual video quality evaluation apparatus according to the embodiment is configured to include a feature amount extracting unit 1, a weighted sum calculating unit (or an objective perceptual video quality index calculating unit) 2, a frame rate detecting unit 3, a frame rate-specific correcting unit 4, and an objective evaluation index-subjective video quality mapping unit (or a subjective perceptual video quality estimated value deriving unit) 5. The feature amount extracting unit 1 is divided into three functional units: a block distortion degree calculating unit 11, a PSNR (peak signal-to-noise ratio) overall temporal fluctuation degree calculating unit 12, and a PSNR local temporal fluctuation degree calculating unit 13.


A configuration or function of each of the constituent elements of the automatic objective perceptual video quality evaluation apparatus according to the embodiment will be described in detail.


<Feature Amount Extracting Unit 1>

The feature amount extracting unit 1 extracts three video feature amounts necessary to derive a subjective video quality, that is, a block distortion degree P1, a PSNR overall temporal fluctuation degree P2, and a PSNR local temporal fluctuation degree P3. A method of deriving each of the video feature amounts will be described.


1. Block Distortion Degree P1

The block distortion degree calculating unit 11 calculates an intra-frame average dDC(f) of a DC difference between a pixel block 21 of an arbitrary size shown in FIG. 2 (an 8×8 pixel block in FIG. 2) and four adjacent blocks (a neighboring pixel block 25 on the right, a pixel block 22 on the lower left, a pixel block 23 under the pixel block 21, and a pixel block 24 on the lower right) for each of the original video x and the evaluated video y. Further, the block distortion degree calculating unit 11 calculates, for each frame, a difference between the intra-frame averages dDC(f) for the original video x and the evaluated video y, calculates a difference between an intra-sequence maximum value and an intra-sequence minimum value of that per-frame difference, and defines the result as the block distortion degree P1. In the present specification, the term “sequence” means the entirety of the original video x or the evaluated video y used for a video quality evaluation, and the video quality evaluation is generally made on video of 5 to 15 seconds in length.






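$$P_1 = \max_{f \in \mathrm{sequence}}\{\mathrm{dDC}_{\mathrm{Ref}}(f) - \mathrm{dDC}_{\mathrm{Cod}}(f)\} - \min_{f \in \mathrm{sequence}}\{\mathrm{dDC}_{\mathrm{Ref}}(f) - \mathrm{dDC}_{\mathrm{Cod}}(f)\}$$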


In the equation, dDCRef(f) denotes the intra-frame average of the DC difference for the original video x, and dDCCod(f) denotes the intra-frame average of the DC difference for the evaluated video y. In the example shown in FIG. 2, the intra-frame average of the DC difference dDC(f) can be represented by the following Equation (1). In the Equation (1), DC(b) denotes the DC value of a pixel block b, Adj(b) denotes the set of the four blocks adjacent to the block b, and NB denotes the total number of pixel blocks in a frame.










$$\mathrm{dDC}(f) = \sum_{b \in \mathrm{frame}} \sum_{i \in \mathrm{Adj}(b)} \left| \mathrm{DC}(b) - \mathrm{DC}(i) \right| / 4 / N_B \qquad \text{Equation (1)}$$
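The calculation of dDC(f) and P1 can be sketched in Python as follows. This is a minimal illustration rather than the apparatus itself: the function names are hypothetical, a frame is assumed to be a list of rows of luminance samples, the DC value of a block is assumed to be its mean sample value, and neighbors that fall outside the frame are assumed to contribute nothing (the specification does not state how frame borders are handled).

```python
from typing import List, Sequence

Frame = Sequence[Sequence[float]]  # assumed layout: frame[row][col]

# (row, col) offsets of the four neighbors in FIG. 2:
# right, lower left, below, lower right.
NEIGHBORS = ((0, 1), (1, -1), (1, 0), (1, 1))


def block_dc(frame: Frame, block: int = 8) -> List[List[float]]:
    """Mean (DC) value of every complete block x block tile of the frame."""
    rows = len(frame) // block
    cols = len(frame[0]) // block
    dc = []
    for br in range(rows):
        row_dc = []
        for bc in range(cols):
            total = 0.0
            for r in range(br * block, (br + 1) * block):
                total += sum(frame[r][bc * block:(bc + 1) * block])
            row_dc.append(total / (block * block))
        dc.append(row_dc)
    return dc


def d_dc(frame: Frame, block: int = 8) -> float:
    """Intra-frame average DC difference, following Equation (1)."""
    dc = block_dc(frame, block)
    rows, cols = len(dc), len(dc[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            for d_row, d_col in NEIGHBORS:
                nr, nc = r + d_row, c + d_col
                # Assumption: neighbors outside the frame are skipped.
                if 0 <= nr < rows and 0 <= nc < cols:
                    total += abs(dc[r][c] - dc[nr][nc])
    return total / 4 / (rows * cols)


def block_distortion_degree(ref_frames, cod_frames, block: int = 8) -> float:
    """P1: intra-sequence max minus min of dDC_Ref(f) - dDC_Cod(f)."""
    diffs = [d_dc(x, block) - d_dc(y, block)
             for x, y in zip(ref_frames, cod_frames)]
    return max(diffs) - min(diffs)
```

Because P1 is a range (maximum minus minimum) over the sequence, a constant offset between the two dDC curves does not change it; only frame-to-frame variation in the offset does.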








2. PSNR Overall Temporal Fluctuation Degree P2

The PSNR overall temporal fluctuation degree P2 is calculated using a maximum value, a minimum value, and an average value of an intra-sequence power error MSE (mean square error).


First, the minimum value, the maximum value, and the average value of the MSE between the original video x and the evaluated video y are defined. If the minimum value, the maximum value, and the average value of the MSE are denoted by emin, emax, and eave, respectively, they are defined as represented by the following Equation (2).











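$$
\begin{aligned}
\mathrm{MSE}(f) &= \sum_{n \in \mathrm{frame}} \{x_{\mathrm{Ref}}(f, n) - x_{\mathrm{Cod}}(f, n)\}^2 / N_P \\
e_{\min} &= \min\{\mathrm{MSE}(f) \mid f \in \mathrm{sequence}\} \\
e_{\max} &= \max\{\mathrm{MSE}(f) \mid f \in \mathrm{sequence}\} \\
e_{\mathrm{ave}} &= \sum_{f \in \mathrm{sequence}} \mathrm{MSE}(f) / N_F
\end{aligned}
\qquad \text{Equation (2)}
$$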








In the Equation (2), x(f, n) denotes a signal value of an nth pixel in the frame f, Np denotes the number of pixels in the frame, and Nf denotes the number of frames in a sequence. For example, if a video quality for ten seconds in which frames are updated 15 times per second is to be evaluated, the number of frames in the sequence is 150. If the sequence of the original video x and that of the evaluated video y differ in frame rate, then corresponding frames are detected by means such as frame matching means, and a PSNR between the corresponding frames is derived.
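As an illustration of Equation (2), the per-frame MSE and its intra-sequence statistics might be computed as in the following Python sketch. The function names are hypothetical, each frame is assumed to be a flat list of pixel values, and the two sequences are assumed to be already aligned frame by frame (the frame matching mentioned above is not shown).

```python
from typing import Sequence, Tuple


def frame_mse(ref: Sequence[float], cod: Sequence[float]) -> float:
    """MSE(f): mean squared pixel difference between two aligned frames."""
    if len(ref) != len(cod):
        raise ValueError("frames must contain the same number of pixels")
    return sum((a - b) ** 2 for a, b in zip(ref, cod)) / len(ref)


def mse_statistics(ref_frames, cod_frames) -> Tuple[float, float, float]:
    """Return (e_min, e_max, e_ave) over the sequence, as in Equation (2)."""
    mses = [frame_mse(x, y) for x, y in zip(ref_frames, cod_frames)]
    return min(mses), max(mses), sum(mses) / len(mses)
```

For the ten-second, 15 frames-per-second example above, `mses` would hold 150 values, one per frame.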


Next, the PSNR overall temporal fluctuation degree P2 based on the maximum value emax, the minimum value emin, and average value eave is calculated. As stated, the PSNR is significant information for estimating the subjective video quality. However, it is confirmed that the correlation between the objective video quality index and the subjective video quality tends to decrease if only the intra-sequence average value is used while the video quality has great temporal fluctuation in the sequence. Therefore, the PSNR overall temporal fluctuation degree P2 is defined as represented by the following Equation (3) according to deviations of the maximum value emax and the minimum value emin from the average value eave of the intra-sequence power error.










$$P_2 = \log\left(\frac{e_{\max} - e_{\mathrm{ave}}}{e_{\mathrm{ave}} - e_{\min}}\right) \times f(e_{\mathrm{ave}}) \qquad \text{Equation (3)}$$








In the Equation (3), f(eave) denotes a scaling function for changing a value according to the intra-sequence average MSE eave. As the scaling function f(eave), an arbitrary function that monotonically increases over the entire range of the average value eave (which is, however, substantially the range eave>0 according to the definition of eave) is available. Examples of the scaling function f(eave) include the following functions.


Linear Characteristic Function

The linear characteristic function is defined as f(eave) = eave. Its linear characteristic is shown in FIG. 3.


Sigmoid Function

The sigmoid function has a characteristic of saturating in a high eave part and a low eave part. The sigmoid function is defined as represented by the following Equation (4).










$$f(e_{\mathrm{ave}}) = \frac{b_1}{1 + \exp\{-b_2 (e_{\mathrm{ave}} - b_3)\}} + b_4 \qquad \text{Equation (4)}$$








The sigmoid function has a characteristic shown in FIG. 4. In FIG. 4, b1=10, b2=1, b3=25, and b4=10.
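A sigmoid scaling function of this kind can be sketched in Python as below. The function name is hypothetical, the defaults are the FIG. 4 parameter values quoted above, and the standard logistic form b1 / (1 + exp(−b2(eave − b3))) + b4 is assumed.

```python
import math


def sigmoid_scale(e_ave: float, b1: float = 10.0, b2: float = 1.0,
                  b3: float = 25.0, b4: float = 10.0) -> float:
    """Sigmoid scaling f(e_ave); saturates near b4 (low) and b1 + b4 (high)."""
    return b1 / (1.0 + math.exp(-b2 * (e_ave - b3))) + b4
```

With these parameters the output stays between b4 = 10 and b1 + b4 = 20 and passes through the midpoint, 15, at eave = b3 = 25, which is the saturating behavior described for FIG. 4.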


As can be seen from the property that the function f(eave) monotonically increases, the following effect can be produced according to a term of the function f(eave). If the average value eave is small, that is, the average MSE is small and the video quality of the evaluated video is high, the PSNR overall temporal fluctuation degree P2 is decreased. If the average value eave is large and the video quality of the evaluated video is low, the PSNR overall temporal fluctuation degree P2 is increased. Furthermore, if the sigmoid function is used as the scaling function f(eave), the property of saturating to certain values in regions on both ends shown in FIG. 4 can be added to the linear characteristic of the scaling function. The property may be regarded as a characteristic that reflects the saturation characteristic of human visual perception.
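Putting Equation (3) together with a scaling function, P2 could be computed as in the following Python sketch. Several points are assumptions, since the specification does not settle them: the function name is hypothetical, the logarithm is taken as the natural logarithm, the linear characteristic f(eave) = eave is used as the default scaling function, and a sequence whose MSE does not deviate from its average on both sides is rejected because the ratio inside the logarithm is then undefined or non-positive.

```python
import math
from typing import Callable


def psnr_overall_fluctuation(e_min: float, e_max: float, e_ave: float,
                             scale: Callable[[float], float] = lambda e: e
                             ) -> float:
    """P2 = log((e_max - e_ave) / (e_ave - e_min)) * f(e_ave)."""
    upper = e_max - e_ave  # deviation of the worst frame from the average
    lower = e_ave - e_min  # deviation of the best frame from the average
    if upper <= 0.0 or lower <= 0.0:
        # Assumption: a degenerate sequence (e.g. constant MSE) is an error.
        raise ValueError("MSE must deviate from its average on both sides")
    return math.log(upper / lower) * scale(e_ave)
```

Note that the logarithm is positive when the worst frame deviates from the average more than the best frame does, and negative in the opposite case; the scaling function then weights this asymmetry by the overall error level.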


3. PSNR Local Temporal Fluctuation Degree P3

Low-rate coding intended for multimedia applications tends to generate temporally local degradations in PSNR resulting from key frame insertion, a scene change, the occurrence of a sudden motion, or the like. For this reason, degradations in the subjective video quality caused by these local degradations are detected based on the PSNR local temporal fluctuation degree P3.


As shown in FIG. 5, a V-shaped temporal change dPSNR(f) of the PSNR is calculated, and the PSNR local temporal fluctuation degree P3 is derived from it. Specifically, if an index (a frame number) of a frame of interest is f, an absolute value of a difference between an average PSNR of the frames f−1 and f+1 before and after the frame f and the PSNR value of the frame f is defined as the dPSNR(f). A maximum value of the dPSNR(f) in the sequence is calculated, and the calculated maximum value is defined as the PSNR local temporal fluctuation degree P3.





$$P_3 = \max\{\mathrm{dPSNR}(f) \mid f \in \mathrm{sequence}\} \qquad \text{Equation (5)}$$
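The V-shaped change dPSNR(f) and its intra-sequence maximum can be sketched in Python as follows. The function names are hypothetical. Two assumptions are made that the specification does not state: the PSNR is derived from the MSE with a peak value of 255 (8-bit samples), and the first and last frames are skipped because they lack one of the two neighboring frames.

```python
import math
from typing import Sequence


def psnr_db(mse: float, peak: float = 255.0) -> float:
    """PSNR in dB from an MSE value (assumes 8-bit samples by default)."""
    if mse <= 0.0:
        raise ValueError("PSNR is undefined for a non-positive MSE")
    return 10.0 * math.log10(peak * peak / mse)


def psnr_local_fluctuation(psnr: Sequence[float]) -> float:
    """P3: max over frames of |mean(PSNR(f-1), PSNR(f+1)) - PSNR(f)|."""
    if len(psnr) < 3:
        raise ValueError("at least three frames are required")
    return max(abs((psnr[f - 1] + psnr[f + 1]) / 2.0 - psnr[f])
               for f in range(1, len(psnr) - 1))
```

A single frame whose PSNR dips well below both of its neighbors, as with a key frame insertion, dominates this maximum even when the rest of the sequence is stable.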


The PSNR local temporal fluctuation degree P3 may be multiplied by a scaling function for changing a value according to the MSE of the frame f. As this scaling function, an arbitrary function that monotonically decreases according to the MSE is applicable.


<Weighted Sum Calculating Unit 2>

An objective evaluation index Qobj is defined as represented by the following equation using a weighted sum of the above-stated objective evaluation measures P1, P2, and P3.






$$Q_{\mathrm{obj}} = \alpha P_1 + \beta P_2 + \gamma P_3$$


In the equation, symbols α, β, and γ denote weight parameters. The weight parameters α, β, and γ are selected so that the error of the estimated video quality relative to the subjective video quality is minimized when the objective evaluation index Qobj is subjected to the conversion processing by the frame rate-specific correcting unit 4 and the objective evaluation index-subjective video quality mapping unit 5. For example, the weight parameters α, β, and γ can be set to 0.2, 0.4, and 0.004, respectively (α=0.2, β=0.4, and γ=0.004). The weight parameters α, β, and γ may be negative numbers.
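The weighted sum itself is a one-line computation; a minimal Python sketch, using the example weights quoted above as defaults and a hypothetical function name, is:

```python
def objective_index(p1: float, p2: float, p3: float,
                    alpha: float = 0.2, beta: float = 0.4,
                    gamma: float = 0.004) -> float:
    """Q_obj = alpha * P1 + beta * P2 + gamma * P3."""
    return alpha * p1 + beta * p2 + gamma * p3
```

In practice the weights would be fitted against subjective test data rather than fixed; the defaults here only reproduce the example values.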


<Frame Rate Detecting Unit 3>

The frame rate detecting unit 3 analyzes a video signal of the evaluated video y and outputs its frame rate. According to the present invention, it is premised that the frame rate of the original video x is equal to or higher than that of the evaluated video y. The frame rate detecting unit 3 therefore detects the frame rate of the evaluated video y, which is equal to or lower than the frame rate of the original video x.


The frame rate detecting unit 3 outputs the detected frame rate to the frame rate-specific correcting unit 4.


<Frame Rate-Specific Correcting Unit 4>

If a correlation between the objective evaluation index Qobj output from the weighted sum calculating unit 2 and the subjective video quality (DMOS) is obtained, the correlation often differs in characteristics among frame rates a, b, c, etc. as shown in FIG. 6. The automatic objective perceptual video quality evaluation apparatus according to the embodiment is required to output a stable evaluation value without relying on the frame rate. Therefore, the frame rate-specific correcting unit 4 absorbs the difference among the frame rates in the characteristic of the objective evaluation measure to the subjective evaluation measure using a correction characteristic (see FIG. 1), and corrects the evaluation value to an objective evaluation value that does not depend on the frame rate. The correction method will be described below.


As shown in FIG. 7, a pair of data consisting of a subjective evaluation value of video at the frame rate b and an objective evaluation value approximating that subjective evaluation value is set as (Qb, DMOSb). This pair is moved onto the characteristic line y=c0×Qobj+c1 of the frame rate a, and the objective evaluation value is calculated based on that line. In this case, an objective evaluation value Qa is calculated so as to give the subjective evaluation value DMOSb on the characteristic line of the frame rate a. The objective evaluation value Qa thus obtained is regarded as the corrected objective evaluation value. Namely, a relationship represented by the following equation is obtained.






$$\mathrm{DMOS}_b = c_0 \times Q_a + c_1$$


The corrected objective evaluation value Qa is represented by the following equation.





$$Q_a = (\mathrm{DMOS}_b - c_1) / c_0$$
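Solving DMOSb = c0 × Qa + c1 for Qa gives the correction, which might be coded as in the Python sketch below. The specification describes the correction only for a known data pair (Qb, DMOSb); obtaining DMOSb from a measured Qb through a fitted characteristic line of the frame rate b (coefficients d0 and d1) is an assumption made here for illustration, and all names are hypothetical.

```python
from typing import Tuple

Line = Tuple[float, float]  # (slope, intercept) of a Q_obj-to-DMOS line


def correct_for_frame_rate(q_b: float, line_b: Line, line_a: Line) -> float:
    """Move a point on the frame rate b line onto the frame rate a line.

    The DMOS value is kept and the objective index is recomputed so that
    the reference line (frame rate a) yields the same DMOS.
    """
    d0, d1 = line_b  # assumed: fitted line for the detected frame rate b
    c0, c1 = line_a  # characteristic line of the reference frame rate a
    if c0 == 0.0:
        raise ValueError("reference line must have a non-zero slope")
    dmos_b = d0 * q_b + d1
    return (dmos_b - c1) / c0
```

If the detected frame rate is the reference frame rate itself, `line_b` equals `line_a` and the index is returned unchanged.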


<Objective Evaluation Index-Subjective Video Quality Mapping Unit (Subjective Video Quality Estimated Value Deriving Unit) 5>

Finally, if the relationship between the objective evaluation index Qobj and the subjective evaluation measure DMOS after the frame rate-specific correction is calculated using many samples, the relationship is as shown in, for example, FIG. 8. FIG. 8 is a graph showing the relationship between the objective evaluation index Qobj and the subjective evaluation measure DMOS calculated using evaluated videos at frame rates of 3 fps (frames per second), 5 fps, 10 fps, and 15 fps. As is evident from FIG. 8, the relationship between the objective evaluation index Qobj and the subjective evaluation measure DMOS can be approximated by a polynomial function.


However, if these pieces of data are classified according to the frame rates, the data sets turn out to be distributed differently among the frame rates. As shown in FIG. 9, if a regression curve is obtained for every frame rate, the regression curves differ in inclination among the frame rates, which causes this irregularity in the data. Accordingly, the objective evaluation index Qobj is corrected so that the data sets for all the frame rates are on the same line.



FIG. 10 shows the Qobj-DMOS characteristic after the correction stated above. The relationship between the objective evaluation index Qobj and the subjective evaluation measure DMOS shown in FIG. 10 can be approximated by, for example, a polynomial function represented by the following equation.






$$\mathrm{DMOS} = -0.0035x^3 + 0.1776x^2 - 2.8234x + 14.379 \quad (\text{where } x = Q_{\mathrm{obj}})$$


Therefore, this polynomial function is stored in the objective evaluation index-subjective video quality mapping unit (or the subjective video quality estimated value deriving unit) 5 in advance. The corrected objective video quality index Qobj is applied to the polynomial function, thereby deriving the subjective video quality estimated value. Namely, points on a solid-line curve shown in FIG. 10 indicate the estimated subjective video qualities corresponding to the objective video quality index Qobj.
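The mapping step can be illustrated with the short Python sketch below, which evaluates the cubic polynomial given above using Horner's method. The function and constant names are hypothetical, and the coefficients are the example values from the equation; a real apparatus would store whatever polynomial was fitted to its own subjective test data.

```python
from typing import Sequence

# Example coefficients from the polynomial above, highest degree first.
DMOS_COEFFS = (-0.0035, 0.1776, -2.8234, 14.379)


def estimate_dmos(q_obj: float,
                  coeffs: Sequence[float] = DMOS_COEFFS) -> float:
    """Map a corrected objective index Q_obj to an estimated DMOS."""
    result = 0.0
    for c in coeffs:  # Horner's method: ((c3*x + c2)*x + c1)*x + c0
        result = result * q_obj + c
    return result
```

The polynomial is only meaningful over the range of Qobj covered by the data in FIG. 10; values far outside that range should not be trusted.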


As stated so far, according to the present invention, it is possible to estimate the subjective video quality of video at low resolution and low frame rate, such as multimedia video, without relying on subjective human judgment.


Needless to say, the methods of deriving the block distortion degree, the PSNR overall temporal fluctuation degree, and the PSNR local temporal fluctuation degree executed by the feature amount extracting unit 1, and the method of calculating the weighted sum executed by the weighted sum calculating unit 2, are given only for illustrative purposes. Other derivation methods and other calculation methods can also be applied to the present invention.

Claims
  • 1. An objective perceptual video quality evaluation apparatus for estimating a subjective video quality by analyzing two types of video signals of an original video and an evaluated video, comprising: a feature amount extracting unit for extracting a block distortion degree of the evaluated video relative to the original video, a PSNR overall temporal fluctuation degree for frames in a sequence, and a PSNR local temporal fluctuation degree for each of the frames as feature amounts; an objective video quality index calculating unit for calculating a weighted sum of the block distortion degree, the PSNR overall temporal fluctuation degree, and the PSNR local temporal fluctuation degree, and calculating an objective video quality index; frame rate detecting unit for detecting frame rate of the evaluated video; a correcting unit for correcting the objective video quality index calculated by the objective video quality index calculating unit based on the frame rate detected by the frame rate detecting unit; and a subjective video quality estimated value deriving unit for deriving a subjective video quality estimated value by applying the objective video quality index corrected by the correcting unit to a correlation between the subjective video quality index and the objective video quality given in advance.
  • 2. The objective perceptual video quality evaluation apparatus according to claim 1, wherein the objective video quality index calculating unit calculates an intra-frame average value of a DC difference among a plurality of pixel blocks of each of the original video and the evaluated video and pixel blocks adjacent to the pixel block of interest, calculates an intra-sequence maximum difference between the intra-frame average value of the DC difference for the original video and the intra-frame average value of the DC difference for the evaluated video as a maximum value of the DC difference, calculates an intra-sequence minimum difference between the intra-frame average value of the DC difference for the original video and the intra-frame average value of the DC difference for the evaluated video as a minimum value of the DC difference, and obtains a difference between the maximum value of the DC difference and the minimum value of the DC difference as the block distortion degree.
  • 3. The objective perceptual video quality evaluation apparatus according to claim 1, wherein the PSNR overall temporal fluctuation degree is derived based on a ratio of an absolute value of a difference between a maximum value and an average value of an MSE of each of the frames in the sequence to an absolute value of a difference between a minimum value and the average value of the MSE of each of the frames in the sequence.
  • 4. The objective perceptual video quality evaluation apparatus according to claim 3, wherein the ratio of the absolute value of the difference between the maximum value and the average value of the MSE of each of the frames in the sequence to the absolute value of the difference between the minimum value and the average value of the MSE of each of the frames in the sequence is multiplied by a scaling function for changing a value according to an average value of the MSE in the sequence.
  • 5. The objective perceptual video quality evaluation apparatus according to claim 1, wherein the PSNR local temporal fluctuation degree is an intra-sequence maximum value of a difference between a PSNR value of the frame of interest and PSNR values of the adjacent frames before and after the frame of interest.
  • 6. The objective perceptual video quality evaluation apparatus according to claim 5, wherein the PSNR local temporal fluctuation degree is obtained by multiplying the intra-sequence maximum value of the PSNR difference between the adjacent frames by a scaling function for changing a value according to the MSE value of the frame of interest.
Priority Claims (1)
Number Date Country Kind
2006-208091 Jul 2006 JP national