This application claims the benefit of China application Serial No. CN202310778620.7, filed on Jun. 28, 2023, the subject matter of which is incorporated herein by reference.
The present application relates to a bit rate control device and a bit rate control method, and more particularly, to a bit rate control device and a bit rate control method which determine a bit distribution mode using one or more confidences.
A conventional perceptual macroblock-level rate (PMBR) control technique is primarily based on visual prominence as a weighting for rate distribution. As the visual prominence increases, the weighting increases and a corresponding quantization parameter (QP) decreases. The modeling of the visual prominence above mainly relies on metrics such as the luminance component, a gradient of the luminance component, a variance of the luminance component and motion information.
However, even if the PMBR technique uses the visual prominence above for regulation, it remains challenging to fully reflect the effect of the regulation on the subjective viewing quality of a user. Moreover, modeling based on the luminance component faces great restrictions. Modeling based on the gradient of the luminance component suffers from unsatisfactory improvement in details and severe mosaic effects in flat regions. In addition, as the number of metrics taken into consideration by the visual prominence grows, the improvement in the subjective viewing quality of a user further diminishes. Furthermore, a bit rate control policy executed based on quantization parameters also leads to insufficient accuracy of bit control. If the intensity of the PMBR control cannot be freely regulated, application scenarios may also be seriously limited.
In view of the drawbacks of the prior art, it is an object (for example but not limited to) of the present application to provide a bit rate control device so as to improve the prior art.
In some embodiments, a bit rate control device includes a confidence calculation circuit, a blend calculation circuit and a bit distribution circuit. The confidence calculation circuit calculates a texture confidence, a person confidence and a motion confidence according to an image. The blend calculation circuit determines, according to a mode selection signal, whether to select one of the person confidence and the motion confidence for blend calculation with the texture confidence to generate a blend parameter. The bit distribution circuit generates a bit distribution parameter according to the blend parameter.
In some embodiments, a bit rate control method is applied to a bit rate control device. The bit rate control method includes: calculating a texture confidence, a person confidence and a motion confidence according to an image; determining, according to a mode selection signal, whether to select one of the person confidence and the motion confidence for blend calculation with the texture confidence to generate a blend parameter; and generating a bit distribution parameter according to the blend parameter.
The bit rate control device and the bit rate control method of the present application, in response to different application scenarios, are able to selectively determine whether to select one of the person confidence and the motion confidence for blend calculation with the texture confidence to generate a blend parameter and then accordingly determine an encoding parameter such as a bit distribution parameter, thereby improving viewing quality of a person part (for example, a human face) or a motion part.
To better describe the technical solution of the embodiments of the present application, drawings involved in the description of the embodiments are introduced below. It is apparent that, the drawings in the description below represent merely some embodiments of the present application, and other drawings apart from these drawings may also be obtained by a person skilled in the art without involving inventive skills.
The term “coupled” or “connected” used in the literature refers to two or multiple elements being directly and physically or electrically in contact with each other, or indirectly and physically or electrically in contact with each other, and may also refer to two or more elements operating or acting with each other. As given in the literature, the term “circuit” may be a device connected by at least one transistor and/or at least one active element by a predetermined means so as to process signals.
In the field of video encoding, under the restriction of transmission bandwidth, it is necessary to efficiently apply a video encoding scheme under such limited transmission bandwidth, so as to improve subjective viewing quality. To achieve the above object, the present application provides a video encoding system and a bit rate control device to be described in detail below.
In some embodiments, the confidence calculation circuit 210 calculates a texture confidence, a person confidence and a motion confidence according to an input image. Then, the blend calculation circuit 220 determines, according to a mode selection signal, whether to select one of the person confidence and the motion confidence for blend calculation with the texture confidence to generate a blend parameter. Next, the bit distribution circuit 230 generates a bit distribution parameter according to the blend parameter.
The bit rate control device 200 of the present application is operable in a first mode according to the mode selection signal, and performs bit distribution by using the texture confidence as a reference. Thus, the bit rate control device 200 is able to save code words on macroblocks having high texture confidences. Moreover, the bit rate control device 200 is also operable in a second mode according to the mode selection signal, so as to blend the texture confidence and the person confidence to perform bit distribution. As such, the bit rate control device 200 is able to distribute more code words to macroblocks associated with persons, so as to improve the subjective viewing quality of a person part (for example, a human face). Moreover, the bit rate control device 200 is further operable in a third mode according to the mode selection signal, so as to blend the texture confidence and the motion confidence to perform bit distribution. As such, the bit rate control device 200 is able to distribute more code words to macroblocks associated with motion, so as to improve the subjective viewing quality of a motion part.
In some embodiments, the confidence calculation circuit 210 includes a texture confidence calculator 211. To better understand the operation of the texture confidence calculator 211, referring to both
As shown in
In some embodiments, the gradient graph calculation element 2111 performs convolution on an input image I to obtain at least two gradient graphs Gx and Gy. For example, the gradient graph calculation element 2111 may perform 8× sampling on an original frame of the input image I to obtain a plurality of macroblocks, each having dimensions of 8×8. The gradient graph calculation element 2111 performs convolution on the input image I according to an operator Kx in the X direction and an operator Ky in the Y direction to obtain the gradient graphs Gx and Gy, using equations below:
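The operators Kx and Ky of equations (1) to (4) are not reproduced in this text. As an illustrative sketch only, the gradient graphs may be computed as in the snippet below, where the 3×3 Sobel-style kernels, the function name and the use of NumPy/SciPy are assumptions made for illustration rather than the claimed operators.

```python
import numpy as np
from scipy.ndimage import convolve

def gradient_graphs(image: np.ndarray):
    """Illustrative sketch: compute gradient graphs Gx and Gy of a luma image.

    The 3x3 Sobel-style kernels stand in for the operators Kx and Ky of
    equations (1) to (4), which are not reproduced in this text.
    """
    Kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=np.float32)   # assumed X-direction operator
    Ky = Kx.T                                        # assumed Y-direction operator
    Gx = convolve(image.astype(np.float32), Kx, mode='nearest')
    Gy = convolve(image.astype(np.float32), Ky, mode='nearest')
    return Gx, Gy
```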
In some embodiments, after at least two gradient graphs Gx and Gy are obtained, the gradient direction calculation element 2113 calculates at least two gradient directions PhaFlag according to the at least two gradient graphs Gx and Gy. For example, the gradient direction calculation element 2113 calculates 8 gradient directions PhaFlag according to the gradient graphs Gx and Gy. To better understand the calculation for the gradient direction PhaFlag, referring to both
Referring to both
In equation (5), PhaFlag is the gradient direction, mod is a remainder operation, and arctan is an arctangent function whose value range is (−π/2, π/2); this range maps angles in the second and third quadrants onto the first and fourth quadrants. Refer to the following equations:
As shown in equations (6) and (7), the angles in the second quadrant and the third quadrant can be converted to the first quadrant and the fourth quadrant.
Moreover, the gradient direction calculation element 2113 further performs the operations below so as to obtain the gradient direction PhaFlag:
As shown in equation (8), abs is an absolute value operation. If the gradient graphs Gx and Gy yield results smaller than a predetermined threshold Th1 after the calculation of equation (8), PhaFlag is 8, indicating a flat, non-directed region. For example, if a pixel value ranges between 0 and 255, the predetermined threshold Th1 can be set to 0.04*256, which is about 10.24. Moreover, as shown in equations (9) to (18), the corresponding gradient direction PhaFlag can be calculated according to the gradient graphs Gx and Gy.
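As an illustrative sketch only, the gradient-direction quantization described above may be implemented as follows. The flat-region test (here max(|Gx|, |Gy|) < Th1) and the uniform eight-way quantization of the arctangent angle are assumed stand-ins for equations (8) to (18), which are not reproduced in this text.

```python
import numpy as np

def gradient_direction(Gx, Gy, th1=0.04 * 256):
    """Illustrative sketch: quantize gradient angles into 8 directions (0..7).

    Pixels whose gradient response falls below Th1 are marked PhaFlag = 8,
    i.e. a flat, non-directed region.
    """
    angle = np.arctan2(Gy, Gx)            # (-pi, pi]
    angle = np.mod(angle, np.pi)          # fold 2nd/3rd quadrants into [0, pi)
    pha_flag = np.floor(angle / (np.pi / 8)).astype(np.int32) % 8
    flat = np.maximum(np.abs(Gx), np.abs(Gy)) < th1   # assumed flat-region test of eq. (8)
    pha_flag[flat] = 8
    return pha_flag
```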
In some embodiments, after the at least two gradient directions PhaFlag are obtained, the anisotropic calculation element 2115 performs standard deviation calculation on the at least two gradient directions PhaFlag to obtain a degree of anisotropy Aiso. For example, the anisotropic calculation element 2115 counts a strip vector bin[i] of the gradient directions PhaFlag in an 8×8 neighborhood, counts the number of pixels with PhaFlag≠8 as an effective number vnum, and thereby calculates a simplified standard deviation, as in the equation below:
In equation (19), Aiso is the degree of anisotropy, abs is an absolute value operation, bin[i] is the strip vector, avgbin is an average value of the strip vector, and Norm is a normalization parameter. A large degree of anisotropy indicates that one direction in the neighborhood forms a main direction. A small degree of anisotropy indicates that there is no prominent direction in the neighborhood, and the various directions coexist without a dominant direction. A calculation equation for avgbin above is as below:
In equation (20), avgbin is an average value of the strip vector, bin[i] is the strip vector, and vnum is an effective number.
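As an illustrative sketch only, the anisotropy calculation for one 8×8 block may be written as below. Equations (19) and (20) are not reproduced in this text, so the exact forms used here (avgbin as the mean bin occupancy vnum/8, and the sum of absolute deviations divided by a normalization constant Norm) are assumptions.

```python
import numpy as np

def anisotropy(pha_flag_block, norm=8.0):
    """Illustrative sketch: degree of anisotropy of one 8x8 block of PhaFlag values.

    bin[i] counts how often direction i (0..7) occurs; vnum is the number of
    directed pixels (PhaFlag != 8).  The normalization constant is an assumption.
    """
    directed = pha_flag_block[pha_flag_block != 8]
    vnum = directed.size
    if vnum == 0:
        return 0.0                                     # fully flat block: no main direction
    bins = np.bincount(directed, minlength=8)[:8].astype(np.float64)  # strip vector bin[i]
    avgbin = vnum / 8.0                                # assumed form of equation (20)
    return np.abs(bins - avgbin).sum() / norm          # assumed form of equation (19)
```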
In some embodiments, the gradient magnitude calculation element 2117 obtains a gradient magnitude GradMad according to the at least two gradient graphs Gx and Gy. For example, an equation for the gradient magnitude GradMad is as follows:
As shown in equation (21), GradMad is the gradient magnitude, max is a maximum value operation, and abs is an absolute value operation.
In some embodiments, a calculation equation for a degree of isotropy is as below:
In equation (22), iso is the degree of isotropy, and Aiso is the degree of anisotropy.
In some embodiments, after the gradient magnitude GradMad and the isotropy iso are obtained, the texture confidence calculation element 2119 calculates a texture confidence Tmap according to the gradient magnitude GradMad and the isotropy iso. For example, an equation for the texture confidence Tmap is as below:
In equation (23), Tmap is the texture confidence, GradMad is the gradient magnitude, and iso is the degree of isotropy. As shown in equation (23), the texture confidence Tmap requires both a certain gradient magnitude and an undirected property. For a region having a high texture confidence, the present application is able to save bit rate through bit rate control without incurring severe deterioration in subjective viewing quality. The region having a high texture confidence described above usually occurs in a region with chaotic directions and a large gradient magnitude.
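As an illustrative sketch only, the texture confidence of one macroblock may be derived from the gradient magnitude and the degree of isotropy as below. Only GradMad = max(|Gx|, |Gy|) is stated explicitly in the text; the forms iso = 1 − Aiso and Tmap = GradMad·iso, and the block-level averaging of the gradient magnitude, are assumptions chosen to reflect the description that a textured block needs both a certain gradient and an undirected character; they are not the forms of equations (22) and (23).

```python
import numpy as np

def texture_confidence(Gx_block, Gy_block, aiso):
    """Illustrative sketch of equations (21) to (23) for one 8x8 macroblock."""
    grad_mad = np.maximum(np.abs(Gx_block), np.abs(Gy_block)).mean()  # eq. (21), averaged per block
    iso = max(0.0, 1.0 - aiso)       # assumed form of equation (22)
    return grad_mad * iso            # assumed form of equation (23)
```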
In some embodiments, referring to
In some embodiments, the skin color region determination element 2131 performs skin color determination on an image to generate a skin color region Sf. For example, under an original resolution, the skin color region determination element 2131 performs skin color determination by using a blue chrominance Cb and a red chrominance Cr. To better understand the method for determining the skin color region Sf, referring to both
As shown in equations (24) and (25), for an input image to be determined to have a skin color, a target blue chrominance PCb thereof needs to be between the upper limit Cbupper and the lower limit Cblower of the blue chrominance Cb, a target red chrominance PCr thereof needs to be between the upper limit Crupper and the lower limit Crlower of the red chrominance Cr, and the target blue chrominance PCb and the target red chrominance PCr need to satisfy equations (26) and (27). The skin color region determination element 2131 performs skin color determination on the image according to the settings above to generate a skin color region Sf.
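As an illustrative sketch only, the chrominance-bound test may be written as below. The numeric Cb/Cr limits are commonly cited YCbCr skin ranges and are assumptions, since the text does not give the numeric bounds, and the additional joint Cb/Cr conditions of equations (26) and (27) are omitted here.

```python
import numpy as np

def skin_color_region(Cb, Cr,
                      cb_lower=77, cb_upper=127,
                      cr_lower=133, cr_upper=173):
    """Illustrative sketch of the skin-color test of equations (24) and (25).

    Returns a binary mask Sf marking pixels whose chrominance falls inside
    the assumed skin-color bounds.
    """
    Sf = ((Cb >= cb_lower) & (Cb <= cb_upper) &
          (Cr >= cr_lower) & (Cr <= cr_upper))
    return Sf.astype(np.uint8)
```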
Referring to
In equation (28), Sfp is a morphological processing result, dilate is a dilation operation, erode is an erosion operation, Sf is the skin color region, and el is as follows:
In some embodiments, after the morphological processing result Sfp is obtained, the sampling element 2135 samples the morphological processing result Sfp to obtain a person confidence Smap. For example, the sampling element 2135 performs 4× sampling on the morphological processing result Sfp to obtain the person confidence Smap, as an equation below:
In equation (30), Smap is the person confidence, and Sfp is the morphological processing result.
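As an illustrative sketch only, the morphological processing of equation (28) and the 4× sampling of equation (30) may be combined as below. The closing order (dilation followed by erosion), the 3×3 structuring element standing in for el of equation (29), and the block-average downsampling are all assumptions made for illustration.

```python
import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

def person_confidence(Sf, el_size=3, factor=4):
    """Illustrative sketch of equations (28) to (30): clean up the skin-color
    region Sf morphologically, then downsample it to the person confidence Smap."""
    el = np.ones((el_size, el_size), dtype=bool)                      # assumed structuring element "el"
    Sfp = binary_erosion(binary_dilation(Sf.astype(bool), el), el)    # assumed form of eq. (28)
    h, w = Sfp.shape
    h, w = h - h % factor, w - w % factor
    blocks = Sfp[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))                                   # assumed form of eq. (30)
```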
In some embodiments, referring to
Next, the motion confidence calculator 215 calculates a frame difference according to the original frame macroblock and the previous frame macroblock. For example, the motion confidence calculator 215 calculates a frame difference between the original frame macroblock and the previous frame macroblock, as an equation below:
In equation (31), Diffmask(i, j) is the frame difference, abs is an absolute value operation, Y(i, j) is the original frame macroblock, Ypre(i, j) is the previous frame macroblock, and Th2 is a predetermined threshold. For example, if a pixel value ranges between 0 and 255, the predetermined threshold Th2 can be set to 0.1*256, which is about 25.6. In the above, i∈(0, W/8-1) and j∈(0, H/8-1), where W and H are the width and the height of the frame.
Then, the motion confidence calculator 215 calculates the motion confidence according to the frame difference Diffmask(i, j). For example, the motion confidence calculator 215 calculates the motion confidence within a range of a coding unit CU32 according to the frame difference Diffmask(i, j), wherein a size of the coding unit CU32 is 32×32, and an equation for the above is as below:
In equation (32), Mmap is the motion confidence, and Diffmask(i, j) is the frame difference.
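As an illustrative sketch only, the frame-difference test of equation (31) and the CU32 aggregation of equation (32) may be implemented as below. Using the mean macroblock difference for the threshold test and the mean of Diffmask over each 32×32 coding unit for Mmap are assumptions, since the equations are not reproduced in this text.

```python
import numpy as np

def motion_confidence(Y, Y_prev, th2=0.1 * 256, cu=32, mb=8):
    """Illustrative sketch of equations (31) and (32)."""
    H, W = Y.shape
    diff = np.abs(Y.astype(np.float32) - Y_prev.astype(np.float32))
    # mean difference per 8x8 macroblock, i in (0, W/8-1), j in (0, H/8-1)
    mb_diff = diff[:H - H % mb, :W - W % mb].reshape(H // mb, mb, W // mb, mb).mean(axis=(1, 3))
    diffmask = (mb_diff > th2).astype(np.float32)           # assumed form of eq. (31)
    n = cu // mb                                            # 4 macroblocks per CU32 side
    h, w = (mb_diff.shape[0] // n) * n, (mb_diff.shape[1] // n) * n
    return diffmask[:h, :w].reshape(h // n, n, w // n, n).mean(axis=(1, 3))  # eq. (32), assumed
```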
Referring to
Referring to
First of all, the default setting device 7211 sets the texture confidence Tmap as a default value. For example, the default setting device 7211 receives the texture confidence Tmap transmitted from the confidence calculation circuit 710, and sets the texture confidence Tmap as a default value, with an equation as below:
In equation (33), JNDCU8(i, j) is the default value, and Tmap(i, j) is the texture confidence.
In some embodiments, the selector 7213 transmits the default value to the texture blend calculator 7215, the person blend calculator 7217 or the motion blend calculator 7219 according to a mode selection signal S. Referring to both
Moreover, the mode selection signal S may also be set by a user. Assuming that an image to be watched by a user is an interview talk show, the user may set the person confidence to perform the subsequent bit distribution. Alternatively, if an image to be watched by a user is a sports game, the user may set the motion confidence to perform the subsequent bit distribution.
Then, the selector 7213 may transmit the default value JNDCU8 to the texture blend calculator 7215, the person blend calculator 7217 or the motion blend calculator 7219 according to the mode selection signal S, so as to perform blend calculation.
A so-called just-noticeable distortion (JND) in the default value JNDCU8 is an error between a decoded signal and a real signal when a user is just able to subjectively notice a quality deterioration. The just-noticeable distortion JND increases as the texture confidence of a region increases. The just-noticeable distortion JND decreases as the person confidence of a region increases. The just-noticeable distortion JND decreases as the motion confidence of a region increases. Adjustment may be made for a region having a large JND so as to save code words, and to provide more code word resources for a region having a small JND.
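As an illustrative sketch only, the routing performed by the selector and the qualitative behaviour described above (the JND decreases as the person or motion confidence increases) may be summarized as below. The mode names, the simple linear reductions and the thresholds s_th and m_th are assumptions; they are not the forms of equations (34) to (37), and all maps are assumed to share the same macroblock resolution.

```python
def blend_parameter(jnd_cu8, smap, mmap, mode, sstr=0.5, mstr=0.5,
                    s_th=0.5, m_th=0.5):
    """Illustrative sketch of the selector 7213 routing the default value
    JNDCU8 (the texture confidence) according to the mode selection signal."""
    if mode == "texture":                      # first mode: keep the default value
        return jnd_cu8
    if mode == "person":                       # second mode: reduce JND where Smap is high
        return jnd_cu8 * (1.0 - sstr * (smap > s_th))
    if mode == "motion":                       # third mode: reduce JND where Mmap is high
        return jnd_cu8 * (1.0 - mstr * (mmap > m_th))
    raise ValueError(f"unknown mode selection signal: {mode}")
```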
The blend sub-circuit 721 in
Referring to
In some embodiments, the person blend calculator 7217 calculates a person flatness according to the texture confidence in response to the mode selection signal S. For example, if the person confidence Smap is greater than a person confidence threshold, the person blend calculator 7217 adjusts the regions in which the person confidence Smap is greater than the person confidence threshold. Associated details are as given in the description below. First of all, the selector 7213 may transmit the default value JNDCU8 to the person blend calculator 7217 according to the mode selection signal S, that is, transmitting the texture confidence to the person blend calculator 7217. Then, the person blend calculator 7217 calculates the person flatness according to the texture confidence, as an equation below:
In equation (34), flatRatio1(i, j) is the person flatness, Tmap(i, j) is the texture confidence, JNDCUMAX and JNDCUMIN represent a maximum value and a minimum value of JNDCU32 in the previous frame of the input image, and CLIP is a specified range operation that limits the difference (Tmap(i, j)−JNDCUMIN) to a range between 0 and (JNDCUMAX−JNDCUMIN).
In some embodiments, after the person flatness is obtained, the person blend calculator 7217 performs blend calculation according to the texture confidence, a person reinforcement parameter and the person flatness to generate the blend parameter. For example, an equation of the above blend parameter is as below:
In equation (35), JNDCU8_blend(i, j) is the blend parameter, JNDCU8(i, j) is the texture confidence, sstr is the person reinforcement parameter, and flatRatio1 is the person flatness. The person reinforcement parameter sstr ranges between 0 and 1 and determines an intensity of a person reinforcement style, wherein an intensity of the person reinforcement style gets larger as a value of the person reinforcement parameter sstr increases.
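As an illustrative sketch only, the person blend calculation may be written as below. The exact forms of equations (34) and (35) are not reproduced in this text, so the normalization of the person flatness flatRatio1, the multiplicative reduction of JNDCU8, and the threshold s_th are assumptions chosen to match the stated behaviour that regions with a high person confidence receive a smaller JND.

```python
import numpy as np

def person_blend(jnd_cu8, tmap, smap, jnd_max, jnd_min, sstr, s_th=0.5):
    """Illustrative sketch of the person blend calculator 7217 (eqs. (34), (35)).

    jnd_max / jnd_min are the JNDCU32 extremes of the previous frame;
    sstr in [0, 1] is the person reinforcement parameter.
    """
    rng = max(jnd_max - jnd_min, 1e-6)
    clipped = np.clip(tmap - jnd_min, 0.0, rng)        # the CLIP operation of eq. (34)
    flat_ratio1 = 1.0 - clipped / rng                  # assumed: flatter blocks -> larger ratio
    blend = jnd_cu8 * (1.0 - sstr * flat_ratio1)       # assumed form of eq. (35)
    return np.where(smap > s_th, blend, jnd_cu8)       # adjust only person regions
```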
In some embodiments, the motion blend calculator 7219 calculates a motion flatness according to the texture confidence in response to the mode selection signal S. For example, if the motion confidence Mmap is greater than a motion confidence threshold, the motion blend calculator 7219 adjusts the regions in which the motion confidence Mmap is greater than the motion confidence threshold. Associated details are as given in the description below. First of all, the selector 7213 may transmit the default value JNDCU8 to the motion blend calculator 7219 according to the mode selection signal S, that is, transmitting the texture confidence to the motion blend calculator 7219. Then, the motion blend calculator 7219 calculates the motion flatness according to the texture confidence, as an equation below:
Compared to equation (34), in equation (36), flatRatio2(i, j) is the motion flatness, and the remaining parameters are described in equation (34) and related details are omitted herein.
In some embodiments, after the motion flatness is obtained, the motion blend calculator 7219 performs blend calculation according to the texture confidence, the motion reinforcement parameter and the motion flatness to generate the blend parameter. For example, an equation for the blend parameter above is as below:
Compared to equation (35), in equation (37), mstr is the motion reinforcement parameter, and flatRatio2 is the motion flatness. The motion reinforcement parameter mstr ranges between 0 and 1 and determines an intensity of a motion reinforcement style, wherein the intensity of the motion reinforcement style gets larger as a value of the motion reinforcement parameter mstr increases. The remaining parameters are described in equation (35) and related details are omitted herein.
Referring to
In equation (38), JNDCU32 is the sampling parameter, and JNDCU8_blend is the blend parameter.
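As an illustrative sketch only, the sampling of equation (38) may aggregate the 8×8-macroblock blend parameter into one value per 32×32 coding unit as below. Averaging the 4×4 group of macroblock values is an assumption; the text only states that JNDCU32 is obtained by sampling JNDCU8_blend.

```python
import numpy as np

def sample_to_cu32(jnd_cu8_blend):
    """Illustrative sketch of equation (38): one JNDCU32 value per 32x32 CU."""
    h, w = jnd_cu8_blend.shape
    h, w = (h // 4) * 4, (w // 4) * 4
    blocks = jnd_cu8_blend[:h, :w].reshape(h // 4, 4, w // 4, 4)
    return blocks.mean(axis=(1, 3))       # assumed averaging over the 4x4 macroblock group
```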
Referring to
It should be noted that, the maximum value JNDCUMAX and the minimum value JNDCUMIN used for calculation of the original frame are both obtained from the previous frame of the image.
Referring to
For example, first of all, an adjustment strength pl is defined, which ranges between 1 and 5. To better understand the adjustment strength pl, referring to
In equation (42), bppCU32 is the bits per pixel of each macroblock CU32, tbpp is the target bits per pixel output from the frame-level bit rate control device 110 in
In some embodiments, after the bits per pixel bppCU32 of each macroblock CU32 is obtained, the bit distribution circuit 730 obtains a quantization parameter QPCU32 corresponding to the bits per pixel of the macroblock CU32 according to the quantization parameter QP output from the frame-level bit rate control device 110 in
In equation (44), QPCU32 is the corresponding quantization parameter, LUT is a look-up table operation, and bppCU32 is the bits per pixel of the macroblock CU32. After the quantization parameter QPCU32 corresponding to the bits per pixel of the macroblock CU32 is obtained, the bit rate control device 700 of the present application may accordingly perform the subsequent bit distribution.
In some embodiments, after the quantization parameter QPCU32 is obtained, the bit distribution circuit 730 defines a range of the quantization parameter QPCU32, as an equation below:
In equation (45), QPCU32 is the corresponding quantization parameter, CLIP is described in equation (34) and related details are omitted herein, QPavg is an average quantization parameter, and DQP is a quantization parameter difference.
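As an illustrative sketch only, the look-up of equation (44) followed by the clipping of equation (45) may be implemented as below. The break points and QP values of the table are hypothetical; only the LUT-then-CLIP structure follows the text.

```python
import numpy as np

def qp_for_cu32(bpp_cu32, qp_avg, dqp,
                lut_bpp=(0.02, 0.05, 0.1, 0.2, 0.4),
                lut_qp=(44, 38, 34, 30, 26, 22)):
    """Illustrative sketch of equations (44) and (45): map bits per pixel of a
    CU32 to a quantization parameter, then clip it to [QPavg - DQP, QPavg + DQP]."""
    idx = np.searchsorted(np.asarray(lut_bpp), bpp_cu32)   # LUT operation of eq. (44)
    qp_cu32 = np.asarray(lut_qp)[idx]                      # higher bpp -> lower QP
    return np.clip(qp_cu32, qp_avg - dqp, qp_avg + dqp)    # the CLIP of eq. (45)
```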
Referring to step 1100, a texture confidence, a person confidence and a motion confidence may be calculated by the confidence calculation circuit 210 according to an image.
Referring to step 1200, according to a mode selection signal, it may be determined by the blend calculation circuit 220 whether to select one of the person confidence and the motion confidence for blend calculation with the texture confidence to generate a blend parameter.
Referring to step 1300, a bit distribution parameter may be generated by the bit distribution circuit 230 according to the blend parameter. It should be noted that, operation details of steps 1100 to 1300 of the bit rate control method 1000 are given in the description associated with the embodiments in
It should be noted that, the present application is not limited to the embodiments shown in
In conclusion, the bit rate control device and the bit rate control method of the present application, in response to different application scenarios, are able to selectively determine whether to select one of the person confidence and the motion confidence for blend calculation with the texture confidence to generate a blend parameter and then accordingly determine an encoding parameter such as a bit distribution parameter, thereby improving viewing quality of a person part (for example, a human face) or a motion part.
While the present application has been described by way of example and in terms of the preferred embodiments, it is to be understood that the disclosure is not limited thereto. Various modifications may be made to the technical features of the present application by a person skilled in the art on the basis of the explicit or implicit disclosures of the present application. The scope of the appended claims of the present application therefore should be accorded the broadest interpretation so as to encompass all such modifications.
Number | Date | Country | Kind |
---|---|---|---
202310778620.7 | Jun 2023 | CN | national |