The present disclosure relates to a video processing device. More particularly, the present disclosure relates to a video processing device that codes coding tree unit levels of a video and a bit rate control method thereof.
By adjusting coding parameters of a video coder to control the bit rate of the video, the outputted video stream is able to meet the finite and time-varying bandwidth limit. Recently, the high efficiency video coding (HEVC) video compression standard was ratified to improve video quality and data compression rate. However, in current approaches, every smallest block that can be coded must be processed in order to adjust the coding parameters accordingly. As a result, the computational complexity is too high, which makes hardware implementation difficult. Moreover, as wire delays may exist in hardware, the actual coding process may require a longer processing time.
In some embodiments, a bit rate control method includes the following operations: receiving a first target bit of a video to be coded; determining a second target bit for a plurality of first coding tree units in a plurality of coding tree units of the video according to the first target bit; determining a fourth target bit of at least one fourth coding tree unit in the plurality of coding tree units according to an actual bit of at least one second coding tree unit in the plurality of coding tree units and a third target bit of at least one third coding tree unit in the plurality of coding tree units, wherein the at least one second coding tree unit is completely coded, the at least one third coding tree unit is not completely coded, and a coding of the at least one fourth coding tree unit is not started; and sequentially adjusting at least one coding parameter for coding the video according to the second target bit, the third target bit, and the fourth target bit.
In some embodiments, a video processing device includes a memory circuit and a processor circuit. The memory circuit is configured to store at least one program code. The processor circuit is configured to execute the at least one program code, in order to: receive a first target bit of a video to be coded; determine a second target bit for a plurality of first coding tree units in a plurality of coding tree units of the video according to the first target bit; determine a fourth target bit of at least one fourth coding tree unit in the plurality of coding tree units according to an actual bit of at least one second coding tree unit in the plurality of coding tree units and a third target bit of at least one third coding tree unit in the plurality of coding tree units, wherein the at least one second coding tree unit is completely coded, the at least one third coding tree unit is not completely coded, and a coding of the at least one fourth coding tree unit is not started; and sequentially adjust at least one coding parameter for coding the video according to the second target bit, the third target bit, and the fourth target bit.
These and other objectives of the present disclosure will no doubt become obvious to those of ordinary skill in the art after reading the following detailed description of the preferred embodiments that are illustrated in the various figures and drawings.
The terms used in this specification generally have their ordinary meanings in the art and in the specific context where each term is used. The use of examples in this specification, including examples of any terms discussed herein, is illustrative only, and in no way limits the scope and meaning of the disclosure or of any exemplified term. Likewise, the present disclosure is not limited to various embodiments given in this specification.
In this document, the term “coupled” may also be termed as “electrically coupled,” and the term “connected” may be termed as “electrically connected.” “Coupled” and “connected” may mean “directly coupled” and “directly connected” respectively, or “indirectly coupled” and “indirectly connected” respectively. “Coupled” and “connected” may also be used to indicate that two or more elements cooperate or interact with each other. In this document, the term “circuit” may indicate an object, which is formed with one or more transistors and/or one or more active/passive elements based on a specific arrangement, for processing signals.
As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items. Although the terms “first,” “second,” etc., may be used herein to describe various elements, these elements should not be limited by these terms. These terms are used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of the embodiments. For ease of understanding, like elements in various figures are designated with the same reference number.
The following descriptions include various technical terms relating to video coding. A person skilled in the art should understand the relevance of these technical terms to video coding and/or related configurations.
In the R-λ model, it is assumed that a bit rate R of a video to be coded and a Lagrange multiplier λ satisfy the following equation:
R=α·λ^β (1)
where α and β are parameters associated with video coding.
Based on the equation (1), the bit rate control method 100 includes multiple operations. In operation S110, bits are allocated to levels of the video to be coded. In some embodiments, the levels, in a top-to-bottom order, may be a group of pictures (GOP) level, a picture level, and a coding tree unit (CTU) level. The GOP level may be a group of successive pictures of the video. The picture level may be one of the successive pictures. The CTU level is a block processed by a processing unit in one picture. In some embodiments, the CTU level is a minimum block of a single picture that is able to be processed by a processing unit. The bit rate control of the levels is described in the following paragraphs.
Assuming that a target bit (or a target bit rate) of the video is Rtar and a frame rate of the video is f, an average bit rate RPicAvg per frame of the video can be derived with the following equation (2):
RPicAvg=Rtar/f (2)
If a number of frame(s) that are coded is Ncoded and the bit cost on those frames is Rcoded, a target bit TGOP allocated in the current GOP level is determined as follows:
SW is a size of a smoothing window, which makes the bit rate change smoother. TAvgPic is a number of bits per picture within the smoothing window. NGOP is a number of pictures in the current GOP level.
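Equations (3) and (4) are not reproduced in this text. As a rough illustration only, the following sketch assumes that the GOP-level allocation takes the form used in the cited JCTVC-K0103 proposal, namely TAvgPic = (RPicAvg·(Ncoded + SW) − Rcoded)/SW and TGOP = TAvgPic·NGOP. The function names and the sample values are hypothetical and are not taken from the disclosure.

```python
def average_bits_per_picture(r_tar: float, frame_rate: float) -> float:
    """Equation (2): RPicAvg = Rtar / f."""
    return r_tar / frame_rate


def gop_target_bits(r_pic_avg: float, n_coded: int, r_coded: float,
                    sw: int, n_gop: int) -> float:
    """GOP-level target TGOP (assumed K0103 form, see lead-in)."""
    # Assumed equation (3): bits per picture within the smoothing window.
    t_avg_pic = (r_pic_avg * (n_coded + sw) - r_coded) / sw
    # Assumed equation (4): scale by the number of pictures in the GOP.
    return t_avg_pic * n_gop


# Hypothetical numbers: 3 Mbit/s at 30 frames per second.
r_pic_avg = average_bits_per_picture(r_tar=3_000_000, frame_rate=30)
# 8 pictures already coded, costing 900,000 bits (100,000 over budget).
t_gop = gop_target_bits(r_pic_avg, n_coded=8, r_coded=900_000, sw=40, n_gop=4)
print(r_pic_avg, t_gop)
```

In this sketch, the overshoot on the coded pictures is spread across the SW pictures of the smoothing window, so the target of the current GOP is lowered only slightly.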
Assuming that a number of the coded bit(s) (i.e., bit cost) in the current GOP level is CodedGOP and that ωCurrPic is the weight of each picture in the current GOP level, a target bit TCurrPic of the current coded picture can be determined as follows:
The denominator of the equation (5) is a sum of weights of pictures that are not coded, in which the determination of the weight ωCurrPic can be understood with reference to the above related document.
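Equation (5) is not reproduced in this text. Based on the description of its denominator, the sketch below assumes that the remaining GOP budget (TGOP minus CodedGOP) is divided among the pictures that are not yet coded in proportion to their weights, with the current picture counted among them. The function name and the weights are hypothetical.

```python
def picture_target_bits(t_gop: float, coded_gop: float,
                        weights: list[float], current_index: int) -> float:
    """Target bits TCurrPic of the picture at current_index.

    weights lists the weight of every picture in the GOP in coding order;
    pictures before current_index are treated as already coded.
    """
    # Sum of weights of the pictures that are not coded (assumed to
    # include the current picture).
    remaining_weight = sum(weights[current_index:])
    return (t_gop - coded_gop) * weights[current_index] / remaining_weight


# Hypothetical GOP of four pictures with weights 4, 2, 1, and 1.
weights = [4.0, 2.0, 1.0, 1.0]
first = picture_target_bits(390_000, 0, weights, 0)
second = picture_target_bits(390_000, 200_000, weights, 1)
print(first, second)
```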
When the control of the bit rate of the CTU level is enabled, a target bit TCurrCTU of each CTU is determined as follows:
A denominator of the equation (6) is a sum of weights of CTUs that are not coded, and Bitheader is a number of estimated bits of all headers in the pictures. The above headers may include a slice header, a video parameter set (VPS), a sequence parameter set (SPS), a picture parameter set (PPS), etc. Bitheader may be estimated according to actual bits of the headers in the coded pictures. CodedPic is a number of coded bits (i.e., bit cost) in the video (or picture). The determination of a weight ωCurrCTU of the CTU can be understood with reference to the above related document. In some embodiments, the weight ωCurrCTU of the CTU may be estimated according to prediction error(s) in the previously coded picture(s). The prediction error(s) may be determined with a calculation of mean absolute difference (MAD). In some embodiments, when an image texture or a color corresponding to the CTU is more complicated, the weight ωCurrCTU of the CTU is higher.
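Equation (6) is not reproduced in this text. The following sketch assumes that the picture budget remaining after the estimated header bits and the already coded bits is shared among the CTUs that are not yet coded in proportion to their weights. It also shows a plain MAD calculation of the kind mentioned above. All function names and sample values are hypothetical, and the mapping from MAD to a weight is left open because the disclosure refers to the related document for it.

```python
def mean_absolute_difference(original: list[int], predicted: list[int]) -> float:
    """MAD between the original samples and the predicted samples of a block."""
    return sum(abs(o - p) for o, p in zip(original, predicted)) / len(original)


def ctu_target_bits(t_curr_pic: float, bit_header: float, coded_pic: float,
                    weights: list[float], current_index: int) -> float:
    """Target bits TCurrCTU of the CTU at current_index (assumed form).

    weights lists the weight of every CTU in the picture in coding order;
    CTUs before current_index are treated as already coded.
    """
    remaining_weight = sum(weights[current_index:])
    remaining_bits = t_curr_pic - bit_header - coded_pic
    return remaining_bits * weights[current_index] / remaining_weight


mad = mean_absolute_difference([10, 20, 30], [12, 18, 33])
# Hypothetical picture with four CTUs and a 100,000-bit picture target.
weights = [1.0, 3.0, 2.0, 2.0]
t_ctu0 = ctu_target_bits(100_000, 2_000, 0, weights, 0)
t_ctu1 = ctu_target_bits(100_000, 2_000, 12_000, weights, 1)
print(mad, t_ctu0, t_ctu1)
```

Because CodedPic is updated after each CTU, a CTU that overshoots or undershoots its target changes the budget left for the CTUs that follow.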
In operation S120, coding parameters for coding the video are determined. For example, in order to reach a bit rate R of the video, it is required to determine a corresponding Lagrange multiplier λ. On condition that the bit rate R is known to be the target bit Rtar, an estimated Lagrange multiplier λpred can be determined with the equation (1) as follows:
T is a target bit TCurrPic of the picture (or a target bit TCurrCTU of the CTU), and w and h are the width and the height of the current picture (or the current CTU), respectively. In some embodiments, through a certain number of tests, it can be derived that a quantization parameter QP can be expressed as the following equation (8):
QP=4.2005 ln λ+13.7122 (8)
The quantization parameter QP is a coding parameter for coding the video and indicates the compression level of the video. The lower the value of the quantization parameter QP, the higher the bit rate of the video and the better the quality of the pictures.
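Equation (7) is not reproduced in this text. As a minimal sketch, the code below inverts the equation (1) directly, taking the bit rate R as the target bits per pixel T/(w·h), and then applies the equation (8). This inversion is an assumption; the original equation (7) may be parameterized differently. The function names and the sample picture size are hypothetical, and the values of α and β are the illustrative initial values mentioned in this disclosure.

```python
import math


def estimate_lambda(target_bits: float, width: int, height: int,
                    alpha: float, beta: float) -> float:
    """Solve R = alpha * lambda**beta for lambda, with R in bits per pixel."""
    bits_per_pixel = target_bits / (width * height)
    return (bits_per_pixel / alpha) ** (1.0 / beta)


def qp_from_lambda(lam: float) -> float:
    """Equation (8): QP = 4.2005 ln(lambda) + 13.7122."""
    return 4.2005 * math.log(lam) + 13.7122


# Hypothetical 1920x1080 picture with a 100,000-bit target.
lam_pred = estimate_lambda(100_000, 1920, 1080, alpha=3.2003, beta=-1.367)
qp = qp_from_lambda(lam_pred)
print(lam_pred, qp)
```

With a negative β, a smaller target yields a larger λ and therefore a larger QP, which matches the relationship between QP and bit rate described above.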
The values in the above equation (8) are given for illustrative purposes, and the present disclosure is not limited thereto. In some embodiments, when a bit rate control method 200 in
QP=4.8005 ln λ+13.7122
In operation S130, parameters of the R-λ model are updated. After one CTU is coded, a parameter α and a parameter β of the R-λ model can be updated according to actual coded bits and the Lagrange multiplier λ, in order to code the remaining video. For example, when actual bits Tactual of the coded picture (or the coded CTU) are obtained, the updated Lagrange multiplier λ, the updated parameter αupdate, and the updated parameter βupdate can be determined with the following equations:
In some embodiments, an initial value of the parameter α may be 3.2003, and a range of the parameter α may be between 0.05 and 20. In some embodiments, an initial value of the parameter β may be −1.367, and a range of the parameter β may be between −3 and −0.1. In some embodiments, the parameter δα and the parameter δβ may be fixed values associated with the target bit Rtar. For example, the parameter δα and the parameter δβ may be set according to a target bit per pixel in the picture of the video. If the target bit of each pixel is less than 0.03, the parameter δα may be about 0.01 and the parameter δβ may be about 0.005. If the target bit of each pixel is greater than or equal to 0.03 and less than 0.08, the parameter δα may be about 0.05 and the parameter δβ may be about 0.025. If the target bit of each pixel is greater than or equal to 0.08 and less than 0.2, the parameter δα may be about 0.1 and the parameter δβ may be about 0.05. If the target bit of each pixel is greater than or equal to 0.2 and less than 0.5, the parameter δα may be about 0.2 and the parameter δβ may be about 0.1. If the target bit of each pixel is greater than or equal to 0.5, the parameter δα may be about 0.4 and the parameter δβ may be about 0.2. The above values are given for illustrative purposes, and the present disclosure is not limited thereto.
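The update equations of operation S130 are not reproduced in this text. The sketch below is therefore an assumption: it reads the bits-per-pixel thresholds above as contiguous ranges (0.03, 0.08, 0.2, and 0.5), and it borrows the update form of the cited JCTVC-K0103 proposal, in which the model is written as λ = α·bpp^β. The function names are hypothetical, and the disclosure's own equations may differ.

```python
import math

ALPHA_RANGE = (0.05, 20.0)   # illustrative range of alpha from the text
BETA_RANGE = (-3.0, -0.1)    # illustrative range of beta from the text


def select_deltas(bits_per_pixel: float) -> tuple[float, float]:
    """Return (delta_alpha, delta_beta) for a target bits-per-pixel value."""
    if bits_per_pixel < 0.03:
        return 0.01, 0.005
    if bits_per_pixel < 0.08:
        return 0.05, 0.025
    if bits_per_pixel < 0.2:
        return 0.1, 0.05
    if bits_per_pixel < 0.5:
        return 0.2, 0.1
    return 0.4, 0.2


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def update_model(alpha: float, beta: float, lam_used: float,
                 bpp_actual: float, delta_alpha: float,
                 delta_beta: float) -> tuple[float, float]:
    """One update step in the assumed K0103 form (lambda = alpha * bpp**beta)."""
    # Lambda that the current model would predict for the bits actually spent.
    lam_computed = alpha * bpp_actual ** beta
    error = math.log(lam_used) - math.log(lam_computed)
    new_alpha = alpha + delta_alpha * error * alpha
    new_beta = beta + delta_beta * error * math.log(bpp_actual)
    return clamp(new_alpha, *ALPHA_RANGE), clamp(new_beta, *BETA_RANGE)


d_alpha, d_beta = select_deltas(0.05)
alpha, beta = update_model(3.2003, -1.367, lam_used=250.0, bpp_actual=0.05,
                           delta_alpha=d_alpha, delta_beta=d_beta)
print(alpha, beta)
```

When the λ that was used equals the λ predicted by the current model, the error term is zero and both parameters stay unchanged.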
In operation S210, a target bit of the video to be coded is received. In operation S220, a target bit shared by a number of the CTUs in the video is determined according to the received target bit. Partial details of operations S210 and S220 can be understood with reference to operation S110 in
In order to understand operation S220, reference is now made to
Compared with
The target bit TmCTU of the picture block M1 can be considered as the target bit for each of the CTUs C0 and C1. As a result, a number of calculations of target bits for the CTUs can be reduced. In some embodiments, the weight ωmCTU is a sum of weights of the CTUs included in the picture block M1. For example, in view of the picture block M1, the weight ωmCTU is a sum of the weight ωCurrCTU of the CTU C0 and the weight ωCurrCTU of the CTU C1. By this analogy, a target bit of each of the picture blocks M1 can be sequentially determined with the equation (12) (i.e., steps S3-9, S3-11, and S3-13; which correspond to operation S220 in
The above examples are described with the picture block M1 including two CTUs for illustrative purposes, but the present disclosure is not limited thereto. In some embodiments, a number of CTUs included in the picture block M1 can be set within a predetermined range. In other words, the number of the CTUs included in the picture block M1 is variable.
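Equation (12) is not reproduced in this text. The sketch below assumes that it has the same form as the CTU-level allocation, with the per-CTU weight replaced by the block weight ωmCTU, so that one target is computed per picture block instead of one per CTU. The function names, the block size, and the sample weights are hypothetical.

```python
def block_weights(ctu_weights: list[float], ctus_per_block: int) -> list[float]:
    """Weight of each picture block: the sum of the weights of its CTUs."""
    return [sum(ctu_weights[i:i + ctus_per_block])
            for i in range(0, len(ctu_weights), ctus_per_block)]


def block_target_bits(t_curr_pic: float, bit_header: float, coded_pic: float,
                      weights: list[float], block_index: int) -> float:
    """Target bits TmCTU of the block at block_index (assumed form).

    Blocks before block_index are treated as already coded.
    """
    remaining_weight = sum(weights[block_index:])
    remaining_bits = t_curr_pic - bit_header - coded_pic
    return remaining_bits * weights[block_index] / remaining_weight


# Six CTUs grouped two per block, as in the C0/C1 example above.
ctu_weights = [1.0, 3.0, 2.0, 2.0, 1.0, 1.0]
m_weights = block_weights(ctu_weights, ctus_per_block=2)
t_block0 = block_target_bits(100_000, 2_000, 0, m_weights, 0)
print(m_weights, t_block0)
```

With two CTUs per block, the allocation runs three times for these six CTUs instead of six times, which is the reduction in calculations described above.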
With continued reference to
As mentioned above, in the equation (12) (or the equation (6)), CodedPic is the number of coded actual bits (i.e., bit cost) in the video. Generally, CodedPic is a sum of actual bits of coded CTUs in the video. On condition that wire delay exists in the hardware, certain CTUs are not completely coded, and thus the actual bits of these CTUs cannot be known. As a result, the number of coded bits CodedPic cannot be updated promptly, and thus a target bit of a CTU that has not started being coded cannot be determined. Under this condition, higher hardware specifications or more processing time for coding the video is required.
In order to understand operation S230, reference is made to
CodedPic=Tactual(C0-C7)+TmCTU(C8,C9) (13)
In the equation (13), the target bit TmCTU(C8, C9) of the picture block M1 (which includes the CTUs C8 and C9) that is not completely coded can be determined by using the equation (12) in advance (i.e., step S4-1). By utilizing the estimated target bit to replace the actual bits of the CTUs that are not completely coded, the number of the coded bits in the video (i.e., the value of CodedPic) can be instantly updated (or predicted), in order to determine the target bit of the picture block M1 that has not started being coded (i.e., step S4-2). As a result, impacts from the wire (or pipeline) delay can be reduced.
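The substitution in the equation (13) can be sketched as follows: the actual bits of the completely coded CTUs are added to the target bits of the blocks that are still being coded, and the result is used as CodedPic when allocating bits to the next block. The allocation step assumes the same weighted form as the earlier levels, since the equation (12) is not reproduced in this text. All names and numbers are hypothetical.

```python
def estimated_coded_pic(actual_bits: list[float],
                        in_flight_targets: list[float]) -> float:
    """Equation (13): actual bits of coded CTUs plus targets of in-flight blocks."""
    return sum(actual_bits) + sum(in_flight_targets)


def next_block_target(t_curr_pic: float, bit_header: float, coded_pic: float,
                      pending_weights: list[float]) -> float:
    """Target of the first block that has not started coding (assumed form).

    pending_weights holds the weights of all blocks that have not started,
    beginning with the block being allocated.
    """
    remaining_bits = t_curr_pic - bit_header - coded_pic
    return remaining_bits * pending_weights[0] / sum(pending_weights)


# C0-C7 are completely coded; the block (C8, C9) is still in the pipeline.
actual_c0_c7 = [1000, 1200, 900, 1100, 1050, 950, 1000, 800]
target_c8_c9 = 2100
coded_pic = estimated_coded_pic(actual_c0_c7, [target_c8_c9])
# The block (C10, C11) can be allocated without waiting for C8 and C9.
t_next = next_block_target(30_000, 500, coded_pic, [2.0, 3.0, 5.0])
print(coded_pic, t_next)
```

Once the actual bits of C8 and C9 become available, they can replace the substituted target in the running total, so the estimate does not drift over the picture.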
With continued reference to
The above description of the bit rate control method 200 includes exemplary operations, but the operations of the bit rate control method 200 are not necessarily performed in the order described above. Operations of the bit rate control method 200 can be added, replaced, reordered, and/or eliminated, or the operations of the bit rate control method 200 can be executed simultaneously or partially simultaneously as appropriate, in accordance with the spirit and scope of various embodiments of the present disclosure.
The video processing device 600 includes a processor circuit 610, a memory circuit 620, and one or more input/output (I/O) interfaces 630. The processor circuit 610 is coupled to the memory circuit 620 and the I/O interfaces 630. In various embodiments, the processor circuit 610 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), a multi-processor, a pipeline processor, a distributed processing system, and/or a picture processor circuit. Various circuits or units to implement the processor circuit 610 are within the contemplated scope of the present disclosure. In some embodiments, the processor circuit 610 may operate as one or more video coders, and may analyze the video SV to acquire multiple parameters or values of the R-λ model, in order to control the bit rate of the video SV.
In some embodiments, the memory circuit 620 may operate as a data buffer, in order to temporarily store various data generated while the processor circuit 610 executes the bit rate control method 200. In some embodiments, the memory circuit 620 stores one or more program codes which are for coding the video SV. For example, the program codes may be encoded with multiple instruction sets that are for coding the video SV and for controlling the bit rate of the coded video SV. The processor circuit 610 may execute the program codes stored in the memory circuit 620, in order to perform operations for controlling the bit rate of the video SV (e.g., operations in
In some embodiments, the memory circuit 620 may be a non-transitory computer readable storage medium that is configured to store multiple instruction sets for controlling the bit rate of the video SV. For example, the memory circuit 620 stores multiple executable instructions for performing operations in
The I/O interfaces 630 receive inputs or commands from various control devices, which, for example, are operated by a user. Accordingly, the processor circuit 610 is able to be manipulated with the inputs or commands received by the I/O interfaces 630. For example, the user is able to input information about the target bit Rtar of the video SV via the I/O interfaces 630, in order to provide information to the processor circuit 610 for performing the bit rate control method 200. In some embodiments, the I/O interfaces 630 are able to receive the video SV, and to transmit the same to the processor circuit 610 for subsequent processing.
In some embodiments, the I/O interfaces 630 include a display configured to display the coded content of the video SV. In some embodiments, the I/O interfaces 630 include a graphical user interface (GUI). In some embodiments, the I/O interfaces 630 include a keyboard, keypad, mouse, trackball, track-pad, touch screen, cursor direction keys, or a combination thereof, for communicating information and commands to the processor circuit 610.
As described above, the bit rate control method and the video processing device provided in some embodiments of the present disclosure are able to determine a target bit for multiple CTUs, and to utilize the target bit to estimate the coded actual bits, in order to reduce the hardware complexity. Such a bit rate control mechanism helps to lower the required hardware specification, and thus eases hardware implementation.
Various functional components or blocks have been described herein. As will be appreciated by persons skilled in the art, in some embodiments, the functional blocks will preferably be implemented through circuits (either dedicated circuits, or general purpose circuits, which operate under the control of one or more processors and coded instructions), which will typically comprise transistors or other circuit elements that are configured in such a way as to control the operation of the circuitry in accordance with the functions and operations described herein. As will be further appreciated, the specific structure or interconnections of the circuit elements will typically be determined by a compiler, such as a register transfer language (RTL) compiler. RTL compilers operate upon scripts that closely resemble assembly language code, to compile the script into a form that is used for the layout or fabrication of the ultimate circuitry. Indeed, RTL is well known for its role and use in the facilitation of the design process of electronic and digital systems.
The aforementioned descriptions represent merely some embodiments of the present disclosure, without any intention to limit the scope of the present disclosure thereto. Various equivalent changes, alterations, or modifications based on the claims of present disclosure are all consequently viewed as being embraced by the scope of the present disclosure.
Number | Date | Country | Kind |
---|---|---|---|
202010224026.X | Mar 2020 | CN | national |
Number | Name | Date | Kind |
---|---|---|---|
7965768 | Ito | Jun 2011 | B2 |
20050169370 | Lee | Aug 2005 | A1 |
20110069754 | Wang | Mar 2011 | A1 |
20170295368 | Teng | Oct 2017 | A1 |
20200128253 | Zhou | Apr 2020 | A1 |
Number | Date | Country |
---|---|---|
201737707 | Oct 2017 | TW |
Entry |
---|
B. Li, H. Li, L. Li, and J. Zhang, “Rate control by R-lambda model for HEVC,” document JCTVC-K0103, Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC, 11th Meeting: Shanghai, China, Oct. 10-19, 2012 (Year: 2012). |
OA letter of the counterpart TW application (appl. No. 109111998) dated Apr. 16, 2021. Summary of the OA letter: 1. Claims 1, 5, and 7 are rejected as allegedly being unpatentable over the first cited reference (TW 201737707 A, also published as US 2017/0295368 A1) in view of the second cited reference (US 2011/0069754 A1). 2. Claims 2-4, 6, and 8-10 are allowable. |
Number | Date | Country | |
---|---|---|---|
20210306677 A1 | Sep 2021 | US |