ITU-T H.264/MPEG-4 Part 10 is a recent international video coding standard developed by the Joint Video Team (JVT), formed from experts of the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) Video Coding Experts Group (VCEG) and the International Organization for Standardization (ISO) Moving Picture Experts Group (MPEG). ITU-T H.264/MPEG-4 Part 10 is also referred to as MPEG-4 AVC (Advanced Video Coding). MPEG-4 AVC achieves data compression by utilizing advanced coding tools, such as spatial and temporal prediction, blocks of variable sizes, multiple references, an integer transform blended with the quantization operation, and entropy coding. MPEG-4 AVC supports adaptive frame and field coding at the picture level. MPEG-4 AVC is able to encode pictures at lower bit rates than older standards while maintaining at least the same picture quality.
Rate control is an engine that dynamically adjusts encoding parameters so that the resulting compressed bit rate can meet a target bit rate. Rate control is important to regulate the encoded bit stream to satisfy the channel condition and to enhance the reconstructed video quality. However, in practice, single-pass rate control for an MPEG-4 AVC encoder often results in uneven quality within a picture as well as from picture to picture. For example, there may be serious pulsing problems around instantaneous decoding refresh (IDR) pictures of MPEG-4 AVC with single-pass rate control. Many of the causes of the uneven quality result from the inability to accurately estimate a target bit rate for future pictures that have yet to be encoded in the stream.
Additionally, in instances where a target bit rate is estimated, a further difficulty may arise in controlling the actual bit rate to achieve the target bit rate. The inability to control the bit rate may affect buffers in encoders used to encode the bit stream.
Disclosed herein is a two-pass encoder configured to determine a quantization parameter (QP) value to control an actual number of bits consumed in a second encoding pass, according to an embodiment. The two-pass encoder includes a first encoding module, a rate control module and a second encoding module. The first encoding module includes a circuit configured to perform a first encoding pass to encode input video sequences. The rate control module is configured to determine R. R is a target bit rate for a picture in the second encoding pass. The rate control module may determine Q using an adaptive Q-R model to achieve R. Q is a QP value for the picture in the second encoding pass. In the Q-R model, the rate control module uses a control variable α, which is dependent on a QP value range, a picture type, and complexity. The second encoding module is configured to use Q to encode the picture in the input video sequence in the second encoding pass to form an output bitstream. The rate control module is further configured to update α to encode a next picture in the second encoding pass.
Also disclosed herein is a method of determining a QP value for a macroblock (MB) within a picture to control an actual number of bits consumed in a second encoding pass of a two-pass encoder, according to an embodiment. In the method, R, a target bit rate for the picture in the second encoding pass, is determined. Q, a QP value for the MB of the picture in the second encoding pass, is determined. The QP value may be determined using a virtual buffer model or an adaptive Q-R model. In the virtual buffer model, the virtual buffer may be an actual buffer storing the encoded output bitstream as it is being transmitted on a channel to other devices or the virtual buffer may be an assigned portion of a buffer. α, a control variable dependent on a QP range, a picture type, and complexity is used in the adaptive Q-R model. Q is used to encode the MB of the picture in the second encoding pass to form an output bitstream. Thereafter, the virtual buffer fullness or the α value is updated to encode a next MB of the picture in the second encoding pass.
Further disclosed is a computer readable storage medium on which is embedded one or more computer programs implementing the above-disclosed method of determining a QP value to control an actual number of bits consumed in a second encoding pass of a two-pass encoder, according to an embodiment.
As described above, the embodiments utilize a two-pass encoder, and rate control is achieved by adjusting a QP value so that an actual rate is approximately equal to the target bit rate for encoding the pictures in the second encoding pass. The QP value may be adjusted at a picture level or an MB level. Further, the QP value may be adjusted so that an actual bit rate closely approximates a target bit rate. Alternately, the QP value may be adjusted to constrain a bit rate based on a virtual buffer fullness.
Features of the present invention will become apparent to those skilled in the art from the following description with reference to the figures, in which:
For simplicity and illustrative purposes, the present invention is described by referring mainly to exemplary embodiments thereof. In the following description, numerous specific details are set forth to provide a thorough understanding of the present invention. However, it will be apparent to one of ordinary skill in the art that the present invention may be practiced without limitation to these specific details. In other instances, well known methods and structures have not been described in detail to avoid unnecessarily obscuring the present invention.
1. Functional Diagram of Two-Pass Encoder
The rate control module 105 determines R for a picture in the second encoding pass using the coding statistics from the first encoding pass. R is a target bit rate for the picture in the second encoding pass. The rate control module 105 may determine Q for the picture in the second encoding pass in order to achieve the target rate R. Q is a QP value for the picture in the second encoding pass. The rate control module 105 may determine Q using an adaptive Q-R model, in which a control parameter, α, is dependent on a QP value range, a picture type, and complexity. The second encoding module 103 is configured to use Q to encode the picture in the input video sequence 101 in the second encoding pass to form an output bitstream 110. The rate control module 105 thereafter updates α to encode a next picture in the second encoding pass as further described below.
Alternatively, the rate control module 105 may control the rate of the output bitstream 110 by adjusting the QP value per MB within a picture to achieve the target rate R. The rate control module 105 may adjust the QP value per MB of the picture using a virtual buffer (not shown). The virtual buffer may be an actual buffer storing the encoded output bitstream as it is being transmitted on a channel to other devices or the virtual buffer may be an assigned portion of a buffer. The rate control module 105 may adjust the QP value per MB of the picture using an adaptive Q-R model at MB level.
The two-pass MPEG-4 AVC encoder 100 includes hardware, such as a processor or other circuit for encoding. It should be understood that the two-pass MPEG-4 AVC encoder 100 depicted in
According to an embodiment, the first encoding module 102 and the second encoding module 103 are configured as partial encoders. In the two-pass MPEG-4 AVC encoder 100, motion estimation (ME), which is the most time-consuming task in an MPEG-4 AVC encoder, and code mode selection are not duplicated in the first encoding module 102 and the second encoding module 103. Instead, tasks are shared by the first encoding module 102 and the second encoding module 103. For instance, the first encoding module 102 may perform ME at full-pel resolution to form full-pel motion vectors (MVs) with associated reference indexes (refIdx) and eliminate a large number of possible code modes per MB to form a limited number of candidate code modes. The second encoding module 103 may thereafter refine the full-pel MVs at quarter-pel resolution and select a final code mode from among the limited number of candidate code modes.
The first encoding pass and the second encoding pass are performed approximately in parallel with an offset provided by the delay 107. The coding statistics 104 from the first encoding pass may thereby be used in the second encoding pass. The first encoding pass is ahead of the second encoding pass by an approximately constant number of pictures; for example, the delay 107 may be 30 pictures. The delay 107 may also be measured in time, for instance 1 second. Because the first encoding pass is ahead of the second encoding pass, the first encoding pass may provide the coding statistics 104 for the second encoding pass before the second encoding module 103 starts to process the pictures. The coding statistics 104 are sent to the rate control module 105, which uses them to generate target coding parameters 109 that are thereafter used in the second encoding pass.
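The offset between the two passes can be pictured with a small scheduling sketch. The function name `two_pass_schedule` and the slot-per-picture timing are illustrative assumptions, not part of the described encoder; the sketch only shows which picture each pass would handle in each time slot for a constant delay.

```python
from typing import List, Optional, Tuple


def two_pass_schedule(num_pictures: int, delay: int = 30) -> List[Tuple[Optional[int], Optional[int]]]:
    """Return (first_pass_picture, second_pass_picture) for each time slot.

    Hypothetical model: each pass handles one picture per slot, and the
    second pass trails the first by a constant `delay` pictures. None means
    the pass is idle in that slot.
    """
    if num_pictures < 0 or delay < 0:
        raise ValueError("num_pictures and delay must be non-negative")
    schedule = []
    for slot in range(num_pictures + delay):
        first = slot if slot < num_pictures else None
        second = slot - delay if slot >= delay else None
        schedule.append((first, second))
    return schedule
```

With a delay of 30 pictures, first-pass statistics for pictures 0 through 29 already exist when the second pass starts picture 0 in this model.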
The rate control module 105 receives the coding statistics 104 from the first encoding module 102. The coding statistics 104 include, for instance, QPs, and a number of bits generated for each picture in the first encoding pass. The rate control module 105 is configured to generate the target coding parameters 109 using the coding statistics 104. The target coding parameters 109 include, for instance, Rtwo,Ff,picType(i), a target number of bits for each picture i in the second encoding pass, a target number of bits budgeted for a group of pictures (GoP) in the second encoding pass, and QP(s). The rate control module 105 is configured to control an actual number of bits consumed for picture i to approximate Rtwo,Ff,picType(i). The rate control module 105 may control the actual number of bits by adjusting the QP value at either picture level as shown with respect to
The second encoding module 103 encodes the input video sequence 101 using the target coding parameters 109 and coding information, such as MVs and associated refIdx and candidate code modes per MB, from the first encoding pass 104. The second encoding module 103 then outputs an output bitstream 110. The rate control module 105 also updates α to encode a next picture or next MB in the second encoding pass.
2. Adjusting Quantization Parameters for Rate Control in the Two-Pass MPEG-4 AVC Encoder
Examples of methods in which the two-pass MPEG-4 AVC encoder 100 may be employed to control an actual number of bits consumed in a second encoding pass are now described with respect to the following flow diagrams of the methods 200-250 depicted in
Some or all of the operations set forth in the methods 200-250 may be contained as one or more computer programs stored in any desired computer readable medium and executed by a processor on a computer system. Exemplary computer readable media that may be used to store software operable to implement the present invention include but are not limited to conventional computer system RAM, ROM, EPROM, EEPROM, hard disks, or other data storage devices.
The method 200, as shown in
Q=−α×log2 R+β, Equation (1)
in which α and β are values adjusted for QP value ranges, for picture type, and for complexity of the picture i. α and β are control variables. The picture types may include I, P, Bs, or B pictures, and depending on the picture type α and β may be adjusted differently. Additionally, α and β may be adjusted differently based on a measure of complexity of scene content in a picture, and/or based on different ranges of the QP values.
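As a minimal sketch of Equation (1), the function below evaluates the model for given control variables. The name `qp_from_rate` and the example values of α and β are assumptions chosen only for illustration.

```python
import math


def qp_from_rate(target_bits: float, alpha: float, beta: float) -> float:
    """Evaluate Equation (1): Q = -alpha * log2(R) + beta.

    The result is left unrounded and unclipped; how Q is quantized to a
    legal QP is not specified at this point in the text.
    """
    if target_bits <= 0:
        raise ValueError("target_bits must be positive")
    return -alpha * math.log2(target_bits) + beta
```

In this model, doubling R lowers Q by exactly α, so α sets how strongly the QP reacts to a change in the bit budget.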
At step 201, as shown in
At step 202, the rate control module 105 determines R, a target bit rate for encoding a picture i from the input video sequence 101 in the second encoding pass. The rate control module 105 may determine R in the method 200 as Rtwo,Ff,picType(i), which is the target number of bits for the picture i. For instance, the rate control module 105 may determine the target number of bits for the picture i based upon whether the picture is an I, P, Bs, or B picture, and whether the picture is coded as a frame or a field (picture i of picType ∈ {I, P, Bs, B} in Ff ∈ {frame, field}) in the second encoding pass. The target number of bits may also be determined based upon a complexity of the picture.
At step 203, the rate control module 105 determines Q based on Rtwo,Ff,picType(i) and α. α is a control variable dependent on a QP value range, a picture type, and complexity. The complexity is determined for the picture i. Q is a QP value for the picture i in the second encoding pass.
According to an embodiment, the rate control module 105 applies an adaptive Q-R model to determine a QP value for the picture. A Q-R model is a state space representation of possible behaviors that may occur over time starting from an initial scenario. For instance, the rate control module 105 may use an adaptive Q-R model as determined by Equation (1) hereinabove. The application of the adaptive Q-R model to the QP in the second encoding pass may be determined as follows.
The rate control module 105 may determine a complexity of the picture using an equation
in which Q is the QP value applied to the picture and R is the corresponding bits generated, and c is a constant that takes one of six values, depending upon the QP value.
Equation (2) is converted into a Q-R model using an equation
Q=−6×log2 R+β. Equation (3)
The Q-R model may be used in both the first encoding pass and the second encoding pass as shown in the following equations
Qone,Ff,picType(i)=−6×log2 Rone,Ff,picType(i)+β, and Equation (4)
Qtwo,Ff,picType(i)=−6×log2 Rtwo,Ff,picType(i)+β. Equation (5)
in which Qone,Ff,picType(i) and Qtwo,Ff,picType(i) are respectively the QP value for the picture i in the first encoding pass and the second encoding pass, and Rone,Ff,picType(i) and Rtwo,Ff,picType(i) are respectively the number of bits for the picture i in the first encoding pass and the second encoding pass using Qone,Ff,picType(i) and Qtwo,Ff,picType(i).
Further using Equations (4) and (5), for the picture i of picType ∈ {I, P, Bs, B} in Ff ∈ {frame, field} in the second encoding pass, given a target number of bits, Rtwo,Ff,picType(i), the QP value may be determined using an equation
Equation (6) gives a global (or average) QP for the picture i of picType ∈ {I, P, Bs, B} in Ff ∈ {frame, field}, with which the number of bits generated for the picture i by the second encoding module 103 may approximate the target number of bits for the picture i, Rtwo,Ff,picType(i).
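The expression for Equation (6) is not reproduced in this text. Subtracting Equation (4) from Equation (5) cancels β and yields the relation below, which is offered as a reconstruction from those two equations rather than as the original's exact notation:

```latex
\begin{align*}
Q_{two,Ff,picType}(i) - Q_{one,Ff,picType}(i)
  &= -6\log_2 R_{two,Ff,picType}(i) + 6\log_2 R_{one,Ff,picType}(i) \\
Q_{two,Ff,picType}(i)
  &= Q_{one,Ff,picType}(i) - 6\log_2\frac{R_{two,Ff,picType}(i)}{R_{one,Ff,picType}(i)}
\end{align*}
```

Under this form, halving the bit budget relative to the first pass raises the QP by 6, and doubling it lowers the QP by 6.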
Equation (3) is an approximate Q-R model per picture. In test cases using the QP value determined from Equation (6), which is based upon Equation (3), the actual number of bits generated in the second encoding pass has been observed to diverge from the target number of bits, Rtwo,Ff,picType(i). A fixed Q-R model, such as Equation (3), is not able to cover entire QP ranges, all picture types, and all pictures of varying complexity to an acceptable approximation.
However, a relationship between Q and R may be modeled to an adaptive approximation using Equation (1) as shown hereinabove. With the Q-R model of Equation (1), the relationship between Q and R for the first encoding pass and the second encoding pass may be determined using equations,
Qone,Ff,picType(i)=−αone,Ff,picType(i)×log2 Rone,Ff,picType(i)+βone,Ff,picType(i), and Equation (7)
Qtwo,Ff,picType(i)=−αtwo,Ff,picType(i)×log2 Rtwo,Ff,picType(i)+βtwo,Ff,picType(i). Equation (8)
If the first encoding pass and the second encoding pass maintain the same picture type (I, P, or B) and the same picture structure (frame or field), the α and β values in the first encoding pass and the second encoding pass may be approximately equal. Therefore, using Equations (7) and (8), the QP value for the picture i of picType ∈ {I, P, Bs, B} in Ff ∈ {frame, field} in the second encoding pass may be determined using an equation,
Equation (6) differs from Equation (9) in using a constant value of 6 for Equation (6) versus an adjustable α value of αFf,picType(i) for Equation (9), which is a function of the picture i of picType ∈ {I, P, Bs, B} in Ff ∈ {frame, field}. Because the α value in the Q-R model per picture may vary from scene to scene, the rate control module 105 may adaptively correct αFf,picType(i) over time. The adaptation may be performed at a picture level, at a scene level, or at instances where an adjustment is required.
According to an embodiment, given a current picture i of picType ∈ {I, P, Bs, B} in Ff ∈ {frame, field}, a global QP value may be determined using Equation (9).
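A hedged sketch of the picture-level QP computation follows. It assumes the form obtained by subtracting Equation (7) from Equation (8) with equal α and β in both passes, namely Q_two = Q_one − α × log2(R_two / R_one); the function name, the rounding to an integer, and the clipping to [0, 51] are illustrative assumptions.

```python
import math


def second_pass_qp(q_one: float, r_one: float, r_two_target: float,
                   alpha: float = 6.0) -> int:
    """Estimate the second-pass picture QP from first-pass statistics.

    Assumed form: Q_two = Q_one - alpha * log2(R_two / R_one).
    With alpha = 6 this reduces to the fixed model; an adapted alpha
    gives the adaptive model.
    """
    if r_one <= 0 or r_two_target <= 0:
        raise ValueError("bit counts must be positive")
    q_two = q_one - alpha * math.log2(r_two_target / r_one)
    # Rounding and clipping to the H.264 QP range are assumptions here.
    return min(51, max(0, round(q_two)))
```

For example, a picture coded with QP 30 and 200,000 bits in the first pass and given a 100,000-bit target would be assigned QP 36 when α is 6, and QP 38 when α has adapted to 8.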
At step 204, the second encoding module 103 encodes the input picture sequence using Q. For each picture in the second encoding pass Q is determined as Qtwo,Ff,picType(i) in Equation (9) hereinabove. The second encoding module 103 uses Qtwo,Ff,picType(i) to encode the picture, resulting in an output bitstream 110 for the picture with a rate of
At step 205, the rate control module 105 updates α to encode a next picture in the second encoding pass. The rate control module 105 may update the α value using an equation
The updated α is used to encode a next picture in the input video sequence 101. The initial α value may be set to 6, as shown in equation (6), or may be initialized to another value.
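The expression for Equation (10) is not reproduced in this text. One plausible way to re-estimate α, shown here purely as an assumption, is to solve the adaptive model for α once the actual second-pass bit count is known: α = (Q_one − Q_two) / log2(R_actual / R_one). The function name `update_alpha` and the fallback behavior are hypothetical.

```python
import math


def update_alpha(q_one: float, q_two: float, r_one: float,
                 r_two_actual: float, prev_alpha: float = 6.0) -> float:
    """Re-estimate alpha from one picture's first- and second-pass results.

    Assumed relation: Q_two = Q_one - alpha * log2(R_actual / R_one).
    Falls back to prev_alpha when the estimate is undefined or non-positive.
    """
    if r_one <= 0 or r_two_actual <= 0:
        raise ValueError("bit counts must be positive")
    log_ratio = math.log2(r_two_actual / r_one)
    if abs(log_ratio) < 1e-9:
        # Same bit count in both passes: no information about the slope.
        return prev_alpha
    alpha = (q_one - q_two) / log_ratio
    return alpha if alpha > 0 else prev_alpha
```

If raising the QP from 30 to 36 only cut the bits from 200,000 to 50,000 rather than to 100,000, this estimate would move α from 6 to 3 for the next picture of the same type.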
The method 210, as shown in
At step 211, as shown in
At step 213, if the rate control module 105 determines that additional processing cycles are available, the rate control module 105 may perform the iterative process in updating α for the current picture i in the second encoding pass. According to an embodiment, for the current picture i with a given target rate, Rtwo,Ff,picType(i), to begin the iterative process, the rate control module 105 is configured to set an initial iteration index j=0 along with an initial α value α2,j=αFf,picType(i) derived from a picture immediately preceding the current picture of the same type, an initial QP value Q2,j=Qtwo,Ff,picType(i) calculated using the initial value α2,j=αFf,picType(i), and an initial output rate R2,j=
At step 214, the rate control module 105 determines whether the output bit rate R2,j is equal to Rtwo,Ff,picType(i), the target bit rate for a current picture i. The rate control module 105 may use an equation
Δ2,j=R2,j−Rtwo,Ff,picType(i), Equation (11)
in which Δ2,j is a difference between R2,j and Rtwo,Ff,picType(i). The rate control module 105 determines whether Δ2,j=0. If Δ2,j=0, the rate control module 105 ends the iterative process at step 215. The initial α value is the α value for the next picture. If Δ2,j is not equal to 0, then at step 216, the rate control module 105 sets Q2,j+1, in which Q2,j+1 is a new QP value for the picture i, using an equation
At step 217, the second encoding module 103 encodes the picture i using Q2,j+1 to form an output bitstream 110 with a rate of R2,j+1. Thereafter, at step 218, the rate control module 105 determines whether the output bit rate R2,j+1 is equal to Rtwo,Ff,picType(i). The rate control module 105 may calculate a difference between R2,j+1 and Rtwo,Ff,picType(i) using an equation
Δ2,j+1=R2,j+1−Rtwo,Ff,picType(i). Equation (13)
At step 219, if Δ2,j+1=0, the rate control module 105 ends the iterative process. At step 220, if Δ2,j+1 is not equal to 0, the rate control module 105 updates α using an equation
At step 221, the rate control module 105 determines if Δ2,j+1 and Δ2,j have different signs. If Δ2,j+1 and Δ2,j have different signs, for instance Δ2,j+1 is negative and Δ2,j is positive, at step 222, the rate control module 105 sets the α value for the next picture i+1 using an equation
The rate control module 105 then ends the iterative process.
At step 223, however, if Δ2,j+1 and Δ2,j have a same sign, the rate control module 105 sets j=j+1. The rate control module 105 calculates a new QP value using a linear model to be
or using a non-linear model to be
and the second encoding module 103 uses the new QP value to encode the picture i at step 217 again, resulting in an output bitstream 110 with a rate of R2,j+1.
Among all the QP values that have been used, the one with an output rate closest to the target rate is the final QP for the current picture. The α value may be updated only once per picture in one embodiment.
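The control flow of the iterative process can be sketched as follows. The equations for the QP step and the α update are not reproduced in this text, so the sketch substitutes assumed forms derived from the Q-R model: Q(j+1) = Q(j) + α × log2(R(j) / R_target) for the step, and α = (Q(j+1) − Q(j)) / log2(R(j) / R(j+1)) for the update. The `encode` callback, `max_iters`, and all names are hypothetical.

```python
import math
from typing import Callable, Tuple


def iterate_qp(encode: Callable[[int], float], target_bits: float,
               q_start: float, alpha: float, max_iters: int = 4) -> Tuple[int, float]:
    """Search for a picture QP whose output rate is closest to the target.

    `encode(qp)` is a hypothetical callback returning the (positive) number
    of bits produced at that QP. Returns (best_qp, updated_alpha).
    """
    def clamp(q: float) -> int:
        return min(51, max(0, round(q)))

    q = clamp(q_start)
    bits = encode(q)
    tried = [(q, bits)]
    delta = bits - target_bits
    for _ in range(max_iters):
        if delta == 0:
            break
        # Assumed QP step from the Q-R model.
        q_next = clamp(q + alpha * math.log2(bits / target_bits))
        if q_next == q:
            break  # no further progress possible at integer QP
        bits_next = encode(q_next)
        tried.append((q_next, bits_next))
        delta_next = bits_next - target_bits
        if bits_next != bits:
            # Assumed alpha update from the two most recent operating points.
            estimate = (q_next - q) / math.log2(bits / bits_next)
            if estimate > 0:
                alpha = estimate
        sign_change = (delta > 0) != (delta_next > 0)
        q, bits, delta = q_next, bits_next, delta_next
        if sign_change:
            break  # the target has been bracketed; stop iterating
    # The final QP is the tried value whose rate is closest to the target.
    best_q, _ = min(tried, key=lambda item: abs(item[1] - target_bits))
    return best_q, alpha
```

The loop bound stands in for the check that additional processing cycles are available; a real encoder would tie it to its timing budget.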
The method 230, as shown in
At step 231, as shown in
At step 232, the rate control module 105 sets initial target coding parameters for a current picture i in the second encoding pass. These may be predetermined parameters. For instance, the rate control module 105 may set a target number of bits, Rtwo,Ff,picType(i), for the current picture i. The rate control module 105 may also set a QP value for a first MB with index 0 in the current picture. The rate control module 105 may set the QP value to the picture-level QP calculated for the same current picture, using an equation
Q(0)=Qtwo,Ff,picType(i). Equation (19)
Additionally, the rate control module 105 may set an initial virtual buffer fullness at the beginning of the current picture using an equation
in which bit_rate is the bit rate in bits per second, pic_rate is the picture rate in pictures per second and c is a constant equal to 4.
At step 233, the second encoding module 103 uses the QP to encode an MB in the second encoding pass. For instance, the second encoding module 103 may encode an MB with index j−1 of the current picture i. Thereafter, at step 234, the rate control module 105 updates the virtual buffer fullness using an equation
in which Bone(j) and Btwo(j) are, respectively, the number of bits generated from coding the current picture up to MB j in the first encoding pass and the second encoding pass.
At step 235, the rate control module 105 then sets the QP value for MB j of the current picture i in the second encoding pass proportional to the fullness of virtual buffer using an equation
Q(j)=[6×log2(51×(pic_rate/bit_rate)×d(j))+c]. Equation (22)
The QP value is adjusted based upon the virtual buffer fullness. Therefore, it is possible that identical MBs may be assigned different QP values, resulting in non-uniform picture quality. The method 230 is repeated to encode each MB in the picture and for subsequent pictures.
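Equation (22) can be sketched directly, taking the virtual buffer fullness d(j) as an input because the expressions that initialize and update it are not reproduced in this text. Reading the square brackets as rounding, clipping to [0, 51], and returning the minimum QP for a non-positive fullness are assumptions, as is the function name.

```python
import math


def mb_qp_from_buffer(fullness_bits: float, bit_rate: float,
                      pic_rate: float, c: float = 4) -> int:
    """Evaluate Q(j) = [6 * log2(51 * (pic_rate / bit_rate) * d(j)) + c].

    fullness_bits is d(j); bit_rate is in bits per second and pic_rate in
    pictures per second. c defaults to 4 as stated for the constant c.
    """
    if bit_rate <= 0 or pic_rate <= 0:
        raise ValueError("bit_rate and pic_rate must be positive")
    if fullness_bits <= 0:
        return 0  # assumption: an empty buffer maps to the lowest QP
    q = 6 * math.log2(51 * (pic_rate / bit_rate) * fullness_bits) + c
    return min(51, max(0, round(q)))
```

At 3 Mbit/s and 30 pictures per second, a fullness equal to one picture's average budget (100,000 bits) gives QP 38 in this sketch, and the QP rises by 6 each time the fullness doubles.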
The method 240, as shown in
At step 241, as shown in
At step 242, the rate control module 105 calculates a QP value for MB j in the second encoding pass using an equation
At step 243, the second encoding module 103 uses qtwo,Ff,picType(i) to encode MB j in the second encoding pass. The second encoding module 103 outputs the output bitstream 110 at a rate of rtwo,Ff,picType(j). Given a current MB j of the picture i of picType ∈ {I, P, Bs, B} in Ff ∈ {frame, field}, the QP value and the new α value are calculated as follows.
At step 244, the rate control module 105 updates cumulative output bits for both the first encoding pass and the second encoding pass using an equation
in which r̃one,Ff,picType(j) and r̃two,Ff,picType(j) are respectively the cumulative coded bits up to MB j for the first encoding pass and the second encoding pass.
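The cumulative bit counts described here amount to running sums of the per-MB bits in each pass. A minimal sketch, with the function name and the list-based interface as assumptions:

```python
from itertools import accumulate
from typing import Iterable, List


def cumulative_bits(per_mb_bits: Iterable[int]) -> List[int]:
    """Return the running total of coded bits up to and including each MB."""
    return list(accumulate(per_mb_bits))
```

The same helper would be applied separately to the first-pass and second-pass per-MB bit counts so the two running totals can be compared at MB j.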
At step 245, the rate control module 105 updates average QP values for both the first encoding pass and the second encoding pass using an equation
in which
At step 246, the rate control module 105 updates the α value in the Q-R model using an equation
The initial α value can be set to 6, as shown in Equation (6), or any other reasonable value. In the case of macroblock-adaptive frame-field (MBAFF) coding, αFf,picType(i) is updated per MB pair. r̃one,Ff,picType(j), r̃two,Ff,picType(j),
The method 250, as shown in
At step 251, the rate control module 105 determines ΔQPj(i). ΔQPj(i) is a normalized local activity measure for MB j in the picture i. The rate control module 105 also determines the total contribution of the normalized local activities for all MBs in the picture i to be equal to zero. The rate control module 105 may determine this condition using an equation,
According to an embodiment, the rate control module 105 determines actj(i), avg_act(i), and NMB. actj(i) is a spatial local activity measure for MB j of the picture i, avg_act(i) is an average spatial local activity of the picture i, and NMB is the total number of MBs for the picture i. avg_act(i) may be defined by equations,
Thereafter, the rate control module 105 determines ΔQPj(i) using an equation
in which β is a variable controlling the range of the local activity. β may be, for example, set to a value of 2.
At step 252, the rate control module 105 modulates QPj(i) by the normalized local activity measure ΔQPj(i). QPj(i) is the QP value for MB j of the picture i. The rate control module 105 may determine a final QP value for MB j by modulating QPj(i) by the normalized local activity as
QPj(i)=QPj(i)+ΔQPj(i). Equation (30)
The final QPj(i) may need to be further clipped into an allowable QP value range of [0, 51].
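Equation (30) and the subsequent clipping can be sketched as below. The activity-based offset ΔQPj(i) is taken as an input because its defining equation is not reproduced in this text; the function name is an assumption.

```python
def modulate_qp(base_qp: int, delta_qp: int,
                qp_min: int = 0, qp_max: int = 51) -> int:
    """Apply QPj(i) = QPj(i) + dQPj(i), then clip into [qp_min, qp_max]."""
    return min(qp_max, max(qp_min, base_qp + delta_qp))
```

For example, `modulate_qp(50, 4)` returns 51 because the modulated value of 54 exceeds the allowable range.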
6. Architecture of Encoding Modules in a Two-Pass MPEG-4 AVC Encoder
Both the first MPEG-4 AVC encoding module 310 and the second MPEG-4 AVC encoding module 320 include a circuit, for instance a processor executing computer code stored on a computer readable storage device, a memory, or an application specific integrated circuit (ASIC) configured to implement or execute one or more of the processes required to encode an input video sequence to generate an MPEG-4 AVC stream depicted in
The first MPEG-4 AVC encoding module 310 and the second MPEG-4 AVC encoding module 320 may comprise MPEG-4 AVC encoders. The first MPEG-4 AVC encoding module 310, and similarly the second MPEG-4 AVC encoding module 320, includes components that may be used to generate an MPEG-4 AVC stream. For instance, the first MPEG-4 AVC encoding module 310 may include a transformer 311, a quantizer 312, an entropy coder 313, a full-pel ME 314, and an org picture buffer 315.
By way of example, as shown in
By way of example, as shown in
As described above, the embodiments utilize a two-pass encoder, and rate control is improved by adjusting a QP value to meet the target bit rate for encoding the pictures in the second encoding pass. The QP value may be adjusted at a picture level or a macroblock level. Further, the QP value may be adjusted so that an actual bit rate closely approximates a target bit rate. Alternately, the QP value may be adjusted to constrain a bit rate within a virtual buffer based on a virtual buffer fullness.
Although described specifically throughout the entirety of the instant disclosure, representative embodiments of the present invention have utility over a wide range of applications, and the above discussion is not intended and should not be construed to be limiting, but is offered as an illustrative discussion of aspects of the invention. Also, the methods and system described herein are described with respect to encoding video sequences using MPEG-4 AVC by way of example. The methods and systems may be used to encode video sequences using other types of MPEG standards or standards that are not MPEG.
What has been described and illustrated herein are embodiments of the invention along with some of their variations. The terms, descriptions and figures used herein are set forth by way of illustration only and are not meant as limitations. Those skilled in the art will recognize that many variations are possible within the spirit and scope of the embodiments of the invention.
Number | Date | Country | |
---|---|---|---|
20110150076 A1 | Jun 2011 | US |