The present invention relates to bit allocation and rate control in a video encoding system. Particularly, the invention relates to a method of bit allocation following a scene change in a video data sequence.
To minimise implementation complexity, the bit allocation scheme in the MPEG-2 Test Model-5 (TM-5) makes use of a complexity measure of a present picture of a certain type (I, P, or B) to estimate the target bit allocation of the next picture of the same type. After a picture of a certain type (I, P, or B) is encoded, the respective “global complexity measure” (Xi, Xp, or Xb) is updated as:
Xi=SiQi (1)
Xp=SpQp (2)
Xb=SbQb (3)
where Si, Sp, Sb are the number of bits generated by encoding this picture and Qi, Qp, and Qb are the average quantization parameters computed by averaging the actual quantization values used during the encoding of all the macroblocks, including skipped macroblocks.
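The update in equations (1) to (3) can be illustrated with a minimal Python sketch. The function name `global_complexity` and the sample numbers are hypothetical and are not taken from TM-5; the sketch only assumes that the encoder can report the bits generated for the picture and the quantization value used for each macroblock.

```python
from statistics import fmean
from typing import Sequence


def global_complexity(bits_generated: int, mb_quantizers: Sequence[float]) -> float:
    """Return X = S * Q for one encoded picture, as in equations (1)-(3).

    bits_generated: S, the number of bits produced by encoding the picture.
    mb_quantizers:  the quantization value used for each macroblock
                    (skipped macroblocks included); Q is their average.
    """
    avg_quantizer = fmean(mb_quantizers)
    return bits_generated * avg_quantizer


# Hypothetical P-picture: 120,000 bits, four macroblocks for brevity.
xp = global_complexity(120_000, [8, 10, 12, 10])
```

The same function serves all three picture types; only the stored result (Xi, Xp, or Xb) differs.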
The target number of bits (Ti, Tp, or Tb) for the next picture in a group of pictures (GOP) is computed as:
Ti=max{R/(1+NpXp/(XiKp)+NbXb/(XiKb)), bit_rate/(8*picture_rate)} (4)
Tp=max{R/(Np+NbKpXb/(KbXp)), bit_rate/(8*picture_rate)} (5)
Tb=max{R/(Nb+NpKbXp/(KpXb)), bit_rate/(8*picture_rate)} (6)
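where, as defined in TM-5, R is the number of bits remaining for the current GOP, Np and Nb are the numbers of P-pictures and B-pictures remaining in the current GOP, Kp and Kb are constants dependent on the quantization matrices, and bit_rate and picture_rate are the bit rate and picture rate of the coded sequence.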
The TM-5 bit allocation method is described fully in “International Organisation for Standardisation ISO-IEC/JTC1/SC29/WG11: Coded Representation of Picture and Audio Information, Test Model 5”.
In the TM-5 bit allocation method, using the current picture complexity to estimate the next picture bit allocation gives rise to inaccuracies when a scene change occurs in the next picture.
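The size of the inaccuracy can be seen with a small numeric sketch of the P-picture target of equation (5). All numbers are hypothetical, the lower bound of equation (5) is omitted for brevity, and Kp = 1.0 and Kb = 1.4 are assumed TM-5 default values rather than values stated here.

```python
def p_target(R: float, Np: int, Nb: int, Xp: float, Xb: float,
             Kp: float = 1.0, Kb: float = 1.4) -> float:
    """P-picture target from equation (5), without the lower bound."""
    return R / (Np + Nb * Kp * Xb / (Kb * Xp))


R, Np, Nb, Xb = 1_000_000, 4, 10, 140.0

# Complexity measured on the P-picture before the scene change (hypothetical).
xp_before = 100.0
# Complexity the first P-picture after the scene change turns out to have
# (hypothetical; only known after that picture has been encoded).
xp_after = 400.0

stale_target = p_target(R, Np, Nb, xp_before, Xb)     # what TM-5 would allocate
informed_target = p_target(R, Np, Nb, xp_after, Xb)   # what the picture warrants
shortfall = informed_target - stale_target
```

In this example the stale complexity value gives the first P-picture after the scene change less than half of the bits that its own complexity would justify.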
The bit allocation method used in TM-5 to calculate the target number of bits for the next picture is based on the global complexity measure (Xi, Xp, and Xb) of the current picture, which can result in an inaccurate prediction of the target number of bits when a scene change occurs. Referring to
But since the target number of bits for the P-picture after the scene change is based on the complexity measure of the P-picture before the scene change, the target bits for P1, the first P-picture after the scene change, may be inaccurately predicted. Likewise, for the P-picture following P1, the target will be inaccurately predicted as being much higher because the high complexity measure of P1 is used. As a result, the encoding error in the first P-picture propagates and degrades picture quality in the several pictures that follow the scene change.
To address the problems associated with the scene change situation, an intra-encoding mode can be used to encode a picture which includes a new scene, such as described in U.S. Pat. No. 5,532,746. When a scene change is detected in a P-picture, the picture is allocated bits corresponding to an intra-coded picture and is coded in intra-mode. In U.S. Pat. No. 5,832,121, the start of a GOP is determined by detecting a new scene so that the new scene is intra-frame coded. Methods of this kind do not keep a fixed number of pictures in a GOP during encoding of a sequence, and are therefore not suitable for an encoder system which requires a fixed number of pictures in a GOP.
Some methods avoid picture degradation in the scene change situation by allocating more bits to encode the new scene picture. In U.S. Pat. No. 5,731,835, when a scene change occurs, extra bits are allocated to encode the picture, where the number of bits depends on 1) the distance of the P-picture that is about to be coded from the end of the group of pictures; 2) the number of I-mode macroblocks in the P-picture; 3) the ease of coding the I-mode macroblocks in the P-picture; and 4) the number of bits presently in the video buffering verifier (VBV). With this method, multiple scene changes occurring close together may require a significant number of extra bits, and because the number of extra bits is limited only by the potential for VBV underflow, the chance of panic-mode encoding could increase. In panic-mode encoding, only the minimum amount of data required to maintain the integrity of the bitstream is transmitted, which significantly degrades the picture encoding quality.
In U.S. Pat. No. 5,617,150, SUBGOPs are defined as sets of 2-4 frames in a GOP. If a scene change is found in a subsequent SUBGOP, the bit allocation is adjusted to save bits in the current SUBGOP, and when a scene change is found in the current SUBGOP, extra bits are assigned to the current SUBGOP. As this method requires detection of a scene change in a subsequent SUBGOP before bit assignment, there is an undesirable encoding latency. Also, as bits are saved in the current SUBGOP for use in the next SUBGOP, the method is only suitable for constant bit-rate applications, since using bits saved from previous pictures is not necessary in variable bit-rate applications.
When a scene change occurs, the first P-picture after the scene change will generally be difficult to encode and will need more bits because of the intra-mode coding required. B-pictures are not affected as much because they normally use bi-directional motion estimation. In addition, because the error in the P-picture will propagate, it is important to minimise the degradation in quality of the first P-picture after a scene change.
It is an object of this invention to address the above inaccurate bit allocation estimation problems of TM-5 by correcting the picture complexity value that is used to estimate the bit allocation for the next picture, or at least to provide a useful alternative. The inaccuracy problem applies to the two P-pictures that follow the scene change, as both of these P-pictures have inaccurate picture complexity values used for their bit allocation. By adjusting the picture complexity value following scene change detection to give a more accurate bit allocation to the two P-pictures after the scene change, this method can be used for encoder bit-rate control in both constant bit-rate and variable bit-rate schemes. It should be noted that reducing the bit allocation for other pictures to compensate for the increase in bit allocation at a scene change is not necessary for variable bit-rate applications.
It is noted that for multiple scene changes close together, for example scene changes which occur every 2 to 3 frames, the increased target bit allocation for every scene change is not realistic in constant bit-rate applications. Also, human visual acuity is generally not able to sensibly interpret such an apparently ‘continuous’ scene change.
The present invention provides a method of bit allocation for use in a video encoding system adapted to encode video data representing a group of pictures (GOP), the method including the steps of:
The present invention further provides a video encoding system adapted to encode video data including a group of pictures (GOP), the system including:
Preferably, operation of the picture complexity measurements and the bit allocation estimation is similar to the MPEG-2 TM-5 when there is no scene change detected, yet when a scene change is detected, the picture complexity that is to be used to compute the bit allocation of the subsequent two P-pictures after the scene change is adjusted so that their bit allocation is more accurately estimated.
It is considered that an increased target bit allocation for every scene change in a close multiple scene change situation is unnecessary and inefficient for constant bit-rate application, so preferably the above picture complexity adjustment is only applied to scene changes that are at least a certain period of time apart. For example, if the number of frames in between two scene changes is less than a pre-determined value, then the picture complexity adjustment is not done.
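The distance test described above might be sketched as follows. The function name and the sample frame numbers are hypothetical. The sketch assumes that distance is measured as the difference in frame index from the previously detected scene change, and that the first scene change in a sequence always qualifies; neither detail is fixed by the text.

```python
from typing import List, Sequence


def scene_changes_to_adjust(scene_change_frames: Sequence[int],
                            min_distance: int) -> List[int]:
    """Return the scene changes far enough from the previous one to be adjusted.

    scene_change_frames: frame indices, in increasing order, at which a
                         scene change was detected.
    min_distance:        the pre-determined minimum spacing in frames.
    """
    selected = []
    previous = None
    for frame in scene_change_frames:
        if previous is None or frame - previous >= min_distance:
            selected.append(frame)
        # Every detected scene change restarts the distance count,
        # whether or not it was adjusted.
        previous = frame
    return selected


# Hypothetical sequence: a burst of cuts at frames 30, 32, 35, then one at 90.
adjusted = scene_changes_to_adjust([30, 32, 35, 90], min_distance=10)
```

Here only the scene changes at frames 30 and 90 receive the complexity adjustment; the cuts at frames 32 and 35 fall inside the minimum spacing and are left to the ordinary allocation.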
A video encoder system, as shown in
Referring to
If the scene change flag is not set but the previous scene change flag is set (step 309), then if the picture is a P-picture (step 310), the complexity measure for that picture is reduced by the factor M or K at step 311 and the previous scene change flag is reset at step 312.
The values of M and K can be experimentally determined fixed values, for example 2 and 1.5. The distance threshold value D can be a predetermined value such as 10 or 15.
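One way the flag handling and the factors M, K, and D could fit together is sketched below in Python. This is a sketch under stated assumptions, not the claimed implementation. It assumes that the scene change flag is reported on the first P-picture after the change, that the complexity for that picture is scaled up by M when the scene change is at least D pictures after the previous one, and that the complexity for the following P-picture is scaled down by K. The text does not say which of M or K applies at which step, and the class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SceneChangeAdjuster:
    M: float = 2.0          # assumed scale-up factor for the first P-picture
    K: float = 1.5          # assumed scale-down factor for the second P-picture
    D: int = 10             # distance threshold in pictures
    prev_scene_change: bool = False
    pictures_since_change: int = 10**9   # large, so the first change qualifies

    def adjusted_xp(self, picture_type: str, scene_change: bool, xp: float) -> float:
        """Return the P-picture complexity to use for this picture's target."""
        distance = self.pictures_since_change
        self.pictures_since_change = 0 if scene_change else distance + 1
        if picture_type != "P":
            return xp
        if scene_change:
            if distance >= self.D:
                # Far enough from the last scene change: raise the complexity
                # and remember that the next P-picture needs correcting too.
                self.prev_scene_change = True
                return xp * self.M
            return xp            # close scene changes are left unadjusted
        if self.prev_scene_change:
            # Second P-picture after the scene change: the stored complexity
            # comes from the costly first P-picture, so reduce it once.
            self.prev_scene_change = False
            return xp / self.K
        return xp


adj = SceneChangeAdjuster()
first = adj.adjusted_xp("P", True, 100.0)    # scene change on this P-picture
adj.adjusted_xp("B", False, 100.0)           # B-pictures pass through unchanged
second = adj.adjusted_xp("P", False, 600.0)  # next P-picture, stale high Xp
third = adj.adjusted_xp("P", False, 300.0)   # normal TM-5 behaviour resumes
```

The value returned would replace Xp in the target bit computation for that picture only; the stored complexity is still updated from the encoded picture as in TM-5.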
Advantageously, the present invention is able to correct the inaccurate bit allocation of the TM-5 model when a scene change occurs, such that the corrected bit allocation for the two subsequent P-pictures results in a better picture quality. The invention can be applied to both constant bit-rate and variable bit-rate control and there is no latency as it does not require checking future pictures to determine whether a scene change occurs.
Advantageously, the invention avoids the situation of unnecessarily allocating extra bits because of multiple scene changes that are close together and the chance of panic-mode encoding due to potential VBV underflow is reduced.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/SG00/00175 | 10/6/2000 | WO | 00 | 9/4/2003 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO02/30126 | 4/11/2002 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
5532746 | Chang | Jul 1996 | A |
5617150 | Nam et al. | Apr 1997 | A |
5731835 | Kuchibhotla | Mar 1998 | A |
5832121 | Ando | Nov 1998 | A |
6100931 | Mihara | Aug 2000 | A |
6173012 | Katta et al. | Jan 2001 | B1 |
6535251 | Ribas-Corbera | Mar 2003 | B1 |
6621866 | Florencio et al. | Sep 2003 | B1 |
7277483 | Eckart | Oct 2007 | B1 |
Entry |
---|
Luo, L. et al., “A New Algorithm on MPEG-2 Target Bit-Number Allocation at Scene Changes,” IEEE Trans. on Circuits and Systems for Video Tech., 7(5):815-819, Oct. 1997. |
Nam, J. et al., “An Adaptive Rate Control Scheme with Scene Change Detection,” in Proceedings of The International Workshop on HDTV, 1995, pp. 99-107, XP-000965384. |
Yu, Y. et al., “A Fast Effective Scene Change Detection and Adaptive Rate Control Algorithm,” in Proceedings of IPCIP '98 Int'l Conf. on Image Processing, Chicago, IL, Oct. 4-7, 1998, pp. 379-382. |
“International Organisation for Standardisation Organisation Internationale de Normalisation,” Telecommunication Standardization Sector, Document AVC-491b, Version 2, Apr. 1993, 118 pp. |