The invention is related to video encoding and more particularly to I-frame flicker artifact removal where video is coded into Groups-of-Pictures (GOPs).
When a GOP-coded video is played back, an annoying pulsing, the so-called flickering artifact, is usually visible at the periodic I-frames of GOPs within the same scene. This I-frame flickering is especially obvious in low or medium bit rate video coding, and it greatly compromises the overall perceptual quality of the coded video.
Original video signals have naturally smooth optical flow. After poor-quality video encoding, however, this natural optical flow is distorted in the coded video signal, and the resulting temporal inconsistency across coded frames is perceived as the flickering artifact. In practice, flickering is most often perceived in static or low-motion areas of a coded video. For example, several consecutive frames may share the same static background, so all the collocated pixels in that background have the same or similar values in the original input video. During encoding, however, the collocated pixels may be predicted from different reference pixels in different frames and, after quantization of the residue, yield different reconstruction values. Visually, the increased inter-frame differences across these frames are perceived as flickering during playback of the coded video.
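As a toy numerical illustration of this mechanism, the following sketch shows how one static pixel can be reconstructed differently in two frames. The pixel value, the two predictions, and the uniform quantizer step are made-up numbers for the example and are not taken from any particular codec.

```python
def reconstruct(original, prediction, qstep):
    """Reconstruct a pixel as its prediction plus a uniformly quantized residue."""
    residue = original - prediction
    # Toy model of coarse quantization: round the residue to a multiple of qstep.
    level = int(round(residue / qstep))
    return prediction + level * qstep


# The same static background pixel appears in two consecutive frames.
original = 100

# Hypothetical predictions obtained from different reference pixels.
recon_a = reconstruct(original, prediction=97, qstep=8)
recon_b = reconstruct(original, prediction=103, qstep=8)

# The source pixels are identical, yet the reconstructions are not.
flicker = abs(recon_a - recon_b)
```

With a coarse step, both small residues quantize to zero, so each reconstruction simply inherits its own prediction and the two frames disagree. With a fine step (for example `qstep=1`), both reconstructions return to the original value.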
As such, the flickering artifact is more intense in low or medium bit rate coding because of coarse quantization. It is also more noticeable on I-frames than on P- or B-frames. This is mainly because, for the same static areas, the prediction residue resulting from inter-frame prediction in P- or B-frames is usually much smaller than that resulting from intra-frame prediction or no prediction in I-frames. Thus, with coarse quantization, the reconstructed static areas in an I-frame differ more noticeably from the collocated areas in the preceding P- or B-frames, producing a more noticeable flickering artifact. Eliminating I-frame flickering is therefore a critical issue that greatly affects the overall perceptual quality of coded video.
Most of the existing encoder-based I-frame deflicker schemes are designed for the GOP-sequential single-thread video coding case, where when coding an I-frame, its immediate previous frame has already been coded. Hence, one can readily use the reconstructed previous frame to derive the no flicker reference for the current frame, which can then be used for deflickering of the current I-frame.
Using multiple encoding threads instead of a single thread is a common and effective parallelization strategy for accelerating the computationally intensive video coding process in real-time video coding systems. While multiple threads may be exploited in many ways in practice, one straightforward and hence commonly adopted approach is to let multiple threads encode multiple GOPs simultaneously, one GOP per thread. This is the GOP-parallel coding scenario. Note that throughout this description, the terms “GOP-parallel” and “multi-thread” are used interchangeably, as are “GOP-sequential” and “single-thread”.
Multi-thread coding renders I-frame flicker removal a much more challenging task than that in the case of GOP-sequential single-thread coding. In single-thread coding, when coding an I-frame, the frame immediately before it has already been coded, whose reconstruction can be readily exploited to derive a good no flicker reference for deflicker coding of the current I-frame (for example, via exhaustive or simplified P-frame coding for the first coding pass). However, in the GOP-parallel multi-thread coding case, it is most likely that when coding an I-frame, its immediate previous frame might not be coded yet, as the two frames may belong to two different GOPs which are coded by two different coding threads. In this case, one solution is to use the coded frame in the previous GOP that is closest to the current I-frame to generate its no flicker reference for deflickering. However, if that frame is too far away from the current frame, such that the two frames are not well correlated, a good no flicker reference might not be derived from that frame, and hence, adequate flicker removal might not be achieved.
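The choice of the closest coded frame in the preceding GOP can be sketched as follows. This is only an illustrative sketch: the `CodedFrame` record, the function name, and the `max_distance` cutoff are hypothetical and do not appear in the description, which does not state how far is "too far".

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class CodedFrame:
    display_order: int
    frame_type: str  # "I", "P" or "B"


def select_closest_coded_frame(
    coded_frames: Sequence[CodedFrame],
    i_frame_order: int,
    max_distance: int,
) -> Optional[CodedFrame]:
    """Pick the already-coded, non-I frame closest before the current I-frame.

    Returns None when nothing usable is coded yet, or when the closest frame
    is farther away than max_distance (a hypothetical cutoff standing in for
    "not well correlated").
    """
    candidates = [
        f for f in coded_frames
        if f.display_order < i_frame_order and f.frame_type != "I"
    ]
    if not candidates:
        return None
    closest = max(candidates, key=lambda f: f.display_order)
    if i_frame_order - closest.display_order > max_distance:
        return None
    return closest
```

For example, if the other thread has coded frames 0 to 2 of a 30-frame GOP, frame 2 is selected only when the caller tolerates a distance of 28 frames; with a tighter cutoff, the caller would fall back to coding without a no flicker reference.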
Generally, I-frame flickering as well as any other coding artifact can be removed or reduced either by properly modifying the encoding process or by adding some effective post-processing at the decoder. However, post-processing based de-flickering is often not a good solution in practical video coding applications, as a coded video bitstream may be decoded by decoders/players from a variety of different manufacturers, some of which may not employ the specific post-processing technique (e.g. in order to reduce the product cost).
A method of encoding video is presented in which multiple groups of pictures (GOPs) are formed and encoded in parallel threads. Each encoded GOP has an initial I-frame followed by a series of P-frames. Each I-frame is deflicker coded with a first no flicker reference derived from the nearest coded frame of a preceding GOP, and the last P-frame in the series of the preceding GOP is deflicker coded with a second no flicker reference derived from the deflicker coded I-frame. Small quantization parameters (QPs) can be employed in coding the I-frame to closely approach the first no flicker reference. Medium QPs can be employed in coding the last P-frame. In the method, the first derived no flicker reference can be generated by a one pass simplified P-frame coding. The simplified P-frame coding can comprise the step of applying a larger motion search range for a low correlation between the I-frame and the nearest coded frame in the preceding GOP. The simplified P-frame coding can also comprise the step of applying a smaller motion search range for a high correlation between the I-frame and the nearest coded frame in the preceding GOP, or can comprise forgoing skip mode checking in mode selection. The correlation can be determined by sum inter-frame complexity. The simplified P-frame coding could also comprise the steps of checking only P16×16 mode, using a smaller motion search range, matching coding distortion between the current frame MB and the prediction reference MB, and modifying the RD cost in RDO-MS, thereby preventing or discouraging skip and intra modes.
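The correlation-dependent choice of motion search range can be sketched as below. The sketch rests on stated assumptions: that a larger summed inter-frame complexity indicates a lower correlation between the two frames, and that a single threshold separates the two cases. The threshold, the two range values, and the function names are hypothetical placeholders, not values from this description.

```python
def sum_inter_frame_complexity(per_frame_complexities):
    """Sum the inter-frame complexities of the frames between the two frames."""
    return float(sum(per_frame_complexities))


def choose_search_range(sum_complexity, threshold=1000.0,
                        small_range=8, large_range=32):
    """Return a motion search range in pixels.

    Assumption: high summed complexity means low correlation, which calls for
    a larger search range; otherwise a smaller range is sufficient.
    """
    low_correlation = sum_complexity > threshold
    return large_range if low_correlation else small_range


# Two quiet intervening frames: high correlation, small search range.
quiet = choose_search_range(sum_inter_frame_complexity([100, 200]))

# Two busy intervening frames: low correlation, large search range.
busy = choose_search_range(sum_inter_frame_complexity([600, 700]))
```

Restricting the search range in the well-correlated case keeps the extra cost of the simplified P-frame coding pass small, which matters for real-time operation.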
The invention will now be described by way of example with reference to the accompanying figures of which:
In the GOP-parallel multi-thread video coding scenario, a GOP starts with an IDR frame and ends with a P-frame. Note that inter-GOP prediction, i.e. prediction across GOP boundaries, may improve coding efficiency to some degree but is difficult to support in this GOP-parallel multi-thread coding architecture. Therefore, the above assumption generally holds true. Without loss of generality, it is assumed that each GOP has only one I-frame, which is also its 1st frame.
In the following description, the focus is on the coding of two consecutive GOPs in the same scene, and hence on deflickering the 1st I-frame of the 2nd GOP. The 1st I-frames of the 1st and 2nd GOPs are denoted as “I_curr” and “I_next”, respectively. The last P-frame in the 1st GOP is denoted as “P_last”. Without loss of generality, it is assumed that the two GOPs are coded separately by two different encoding threads, and that when one thread is about to start coding I_next, the other thread has only partially encoded the preceding GOP. The coded frame in the 1st GOP that has the highest display order is denoted as “P_curr”. Note that P_curr could actually be of any frame type other than I-frame; the name P_curr is used purely for notational convenience. Also note that P_curr is simply the coded frame in the preceding GOP that is closest to I_next. These notations are as illustrated in
Referring to
Besides the new deflicker coding of I_next 18, the 2nd technique in our solution is the proposed deflicker coding of P_last 14. In multi-thread coding, it is highly likely that when a thread is about to code the last frame in the current GOP, i.e. P_last 14, the first I-frame in the next GOP, i.e. I_next 18, has already been coded by another thread. In this case, we propose to conduct deflicker coding for P_last 14 as well. Note that in I_next 18 deflicker coding, many more bits are often allocated to the frame so that I_next 18 can be coded with small quantization parameters (QPs) and hence closely approach its no flicker reference. In the new P_last deflicker coding, however, closely approaching the no flicker reference is no longer desirable. This is because, although P_last 14 and I_next 18 may be highly correlated, P_curr 12 and P_last 14 might not be, and thus temporal incoherence, i.e. flicker artifacts, may exist between the frame preceding P_last 14 and I_next 18. In this case, it is preferable for P_last 14 to strike a balance between its coded preceding frame and the coded I_next 18 for the best overall deflicker performance, rather than closely approach the no flicker reference derived from either one. Therefore, in the proposed deflicker coding scheme of P_last 14, its no flicker reference is still derived from the coded I_next 18 via the same newly proposed simplified P-frame coding as in I_next deflicker coding, but only a moderate number of additional bits is allocated to P_last 14. The resultant reconstructed P_last 14 thus represents a proper mixture of its preceding frame and I_next 18, which renders a smoother transition between them.
The overall proposed deflicker solution and the desired deflicker performance are illustrated in
The implementation of the proposed deflicker scheme is explained in further detail in
In the implementation, deflicker_buffer is an important and useful buffering mechanism that helps all the multiple threads buffer and share their coding results for I_next 18 or P_last 14 deflickering. In our current implementation, deflicker_buffer includes three parts:
Cmp1=
Herein, Cmp1 denotes the complexity of the latter frame.
At step 50, if SaveCurrFrm is true, an I_next 18 or a P_last 14 frame is recorded in deflicker_frm_buffer at step 54 for later deflicker coding of P_last 14 or I_next 18, respectively. Otherwise, if the current coded frame is so far the most useful frame for I_next deflickering, the current frame results are recorded into prev_form_buffer[curr_thread_ID] at steps 52, 53, and are later loaded as P_curr for I_next deflickering. Note that one needs to buffer the current frame results only when all the four conditions in
Also, note that the simplified RDO-MS used in no flicker reference generation for both P_last and I_next MBs involves a modified RD cost for each candidate mode, which is also critical for reliable deflicker performance. Basically, by modifying the RD cost in RDO-MS, Skip and Intra modes are discouraged, while Inter-prediction modes are favored. This proves to be an effective means for better deflicker performance. Specifically, in no flicker reference generation, RD costs of Inter modes are multiplied by 0.7 for increased preference, and for P_last MBs, in both no flicker reference generation and actual coding, RD costs of Intra modes are multiplied by 2.5 for reduced preference.
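The RD cost modification can be sketched as follows, using the 0.7 and 2.5 multipliers stated above. The function names, the mode labels, and the example cost values are hypothetical and serve only to illustrate how the scaling can change the selected mode. No separate multiplier for Skip mode is given in the text, so the sketch leaves Skip costs unchanged.

```python
INTER_SCALE_REFERENCE = 0.7   # Inter modes, during no flicker reference generation
INTRA_SCALE_P_LAST = 2.5      # Intra modes, for P_last MBs (reference generation and coding)


def modified_rd_cost(rd_cost, mode_kind, generating_reference, is_p_last):
    """Scale a candidate mode's RD cost; mode_kind is 'skip', 'inter' or 'intra'."""
    if mode_kind == "inter" and generating_reference:
        return rd_cost * INTER_SCALE_REFERENCE
    if mode_kind == "intra" and is_p_last:
        return rd_cost * INTRA_SCALE_P_LAST
    return rd_cost


def select_mode(candidates, generating_reference, is_p_last):
    """Return the name of the candidate with the lowest modified RD cost.

    candidates maps a mode name to a (mode_kind, raw_rd_cost) pair.
    """
    return min(
        candidates,
        key=lambda name: modified_rd_cost(
            candidates[name][1], candidates[name][0],
            generating_reference, is_p_last,
        ),
    )


# Hypothetical raw RD costs for one macroblock.
candidates = {
    "Skip": ("skip", 100.0),
    "P16x16": ("inter", 130.0),
    "I16x16": ("intra", 95.0),
}
```

With these example costs, the unmodified comparison would pick the Intra mode, whereas during no flicker reference generation the scaled Inter cost becomes the lowest and the Inter mode is selected instead.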
Last but not least, as mentioned earlier, rate control has to coordinate well with the deflicker coding of I_next 18 and P_last 14. Basically, in frame-level rate control, many more bits need to be allocated for I_next deflickering, while a moderate number of additional bits needs to be allocated for P_last deflickering. This can usually be achieved by assigning proper QP offsets to a frame when conducting frame-level bit allocation. In our current implementation, we assign −6 and −2 as the I_next and P_last QP offsets, respectively.
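A minimal sketch of applying these QP offsets at frame level is given below. The offsets of −6 and −2 come from the text. The role labels, the function name, and the clamping to the 0 to 51 range (the QP range of 8-bit H.264/AVC) are assumptions made for the example.

```python
# QP offsets from the description; any other frame gets no offset.
QP_OFFSETS = {"I_next": -6, "P_last": -2}


def frame_qp(base_qp, role, qp_min=0, qp_max=51):
    """Apply the deflicker QP offset for a frame role and clamp to the valid range.

    base_qp is the QP that frame-level rate control would otherwise assign.
    """
    qp = base_qp + QP_OFFSETS.get(role, 0)
    return max(qp_min, min(qp_max, qp))


# Example with a hypothetical base QP of 30 from rate control.
qp_i_next = frame_qp(30, "I_next")   # smaller QP, many more bits
qp_p_last = frame_qp(30, "P_last")   # slightly smaller QP, moderately more bits
qp_other = frame_qp(30, "P")         # unchanged
```

A lower QP means finer quantization, so the larger negative offset lets I_next approach its no flicker reference closely, while the smaller offset leaves P_last between its preceding frame and I_next.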
Experiments have been conducted to evaluate the performance of the proposed GOP-parallel multi-thread deflicker solution. Results show that the proposed scheme effectively reduces I-frame flickering artifacts in the multi-thread coding case, while the additional computational complexity incurred does not pose a serious challenge to real-time coding. In particular, we found that shorter GOP lengths (e.g. <60) are more desirable for deflicker performance than longer GOP lengths (e.g. >90), because with shorter GOP lengths the distance between P_curr 12 and I_next 18 is more likely to be short as well, which is highly favorable for good deflickering.
Herein, provided are one or more implementations having particular features and aspects. However, features and aspects of described implementations may also be adapted for other implementations. For example, implementations may be performed using one, two, or more passes, even if described herein with reference to particular number of passes. Additionally, the QP may vary for a given picture or frame, such as, for example, varying based on the characteristics of the MB. Although implementations described herein may be described in a particular context, such descriptions should in no way be taken as limiting the features and concepts to such implementations or contexts.
The implementations described herein may be implemented in, for example, a method or process, an apparatus, or a software program. Even if only discussed in the context of a single form of implementation (for example, discussed only as a method), the implementation or features discussed may also be implemented in other forms (for example, an apparatus or program). An apparatus may be implemented in, for example, appropriate hardware, software, and firmware. The methods may be implemented in, for example, an apparatus such as, for example, a computer or other processing device. Additionally, the methods may be implemented by instructions being performed by a processing device or other apparatus, and such instructions may be stored on a computer readable medium such as, for example, a CD, or other computer readable storage device, or an integrated circuit. Further, a computer readable medium may store the data values produced by an implementation.
As should be evident to one of skill in the art, implementations may also produce a signal formatted to carry information that may be, for example, stored or transmitted. The information may include, for example, instructions for performing a method, or data produced by one of the described implementations.
Additionally, many implementations may be implemented in one or more of an encoder, a pre-processor for an encoder, a decoder, or a post-processor for a decoder.
Further, other implementations are contemplated by this disclosure. For example, additional implementations may be created by combining, deleting, modifying, or supplementing various features of the disclosed implementations.
The following list provides a short list of various implementations. The list is not intended to be exhaustive but merely to provide a short description of a small number of the many possible implementations as follows:
The embodiments described present an effective I-frame deflicker scheme for GOP-parallel multi-thread video encoding. The proposed scheme can reduce the impact of the unavailability of the reconstructed immediate previous frame on the current I-frame deflickering. The scheme is also efficient, as it incurs marginal additional computation and memory cost, and thus, fits very well in a real-time video coding system.
In sum, presented herein is a means of properly changing an encoder and its method of encoding in a more direct and general way to solve the various artifact removal problems discussed above.
While some schemes address the deflicker problem for all Intra-frame coded video, either with the Motion JPEG2000 standard or with the H.264/AVC standard, at least one implementation in this disclosure provides a deflicker solution that is compatible with the mainstream video coding standards, i.e. the well-known hybrid coding paradigm with motion compensation and transform coding. Moreover, this application is concerned with GOP coded video, where each GOP starts with an I-frame.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US09/06056 | 11/10/2009 | WO | 00 | 5/12/2011 |
Number | Date | Country | |
---|---|---|---|
Parent | 61199028 | Nov 2008 | US |
Child | 12998643 | US |