Embodiments according to the invention relate to apparatuses and methods for improving Rate Control (RC) Algorithms, e.g., in a Video Encoder and/or Video Decoder, using a Low-Complexity Model of the Human Visual System.
Rate control (RC) methods play an important role in the production and distribution of compressed video content. RC solutions, which are typically implemented as single-pass or two-pass algorithms, ensure that a given input video sequence—the video material to be compressed—ends up consuming a specific number of bits (or a bit count within, say, ±1% of a specified value) after compression into a bitstream. This behavior is essential since the number of bits required for compression to a certain level of objective fidelity (e.g., average PSNR value) or subjective quality (e.g., mean opinion score, MOS) varies with the statistics of the input video signal's samples. In other words, RC methods, operating input-adaptively, turn the inherently input-dependent compression process into an input-independent process in terms of the resulting mean bitrate.
A number of RC algorithms have been proposed in recent years. A good overview is, e.g., provided in [1]. Moreover, a modern implementation of a two-pass RC algorithm, augmenting VVenC, an open encoder for a novel and state-of-the-art video standard called Versatile Video Coding (VVC) [2], is published at [3]. The latter approach, in particular, encodes at a constant “average” quantization parameter (QP) of QPbase=32 in the first RC pass and, using some frame-wise statistics collected in this first pass and a user-specified target bitrate, allocates the final number of target bits for coding each frame in the second RC pass (along with estimates of the QP for each frame, derived from each frame's target bits). Any deviations from the allocated number of bits in the second pass (caused by coding with the estimated QPs and varying frame statistics) are balanced over time by continuously updating the RC's statistical model such that the RC model becomes more “reliable” over time. Other recent rate control related publications can be found at [4]-[6].
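The bit allocation of the second pass described above can be illustrated with a minimal sketch. The function below is a hypothetical simplification, not the actual VVenC model: it merely distributes the overall target budget over the frames in proportion to the per-frame bit counts collected in the first pass at QPbase.

```python
def allocate_second_pass_bits(first_pass_bits, target_bps, frame_rate):
    """Distribute the target bit budget over frames in proportion to the
    per-frame bit counts collected in the first pass (coded at QPbase)."""
    total_first_pass = sum(first_pass_bits)
    # Overall budget implied by the user-specified target bitrate.
    budget = target_bps * len(first_pass_bits) / frame_rate
    # Frames that were expensive in pass one get a larger share in pass two.
    return [budget * b / total_first_pass for b in first_pass_bits]
```

In an actual encoder, deviations between these allocations and the bits actually spent would additionally be fed back into the statistical model, as noted above.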
A simple way to assess the performance of an RC algorithm in a video encoder is to specify a target bitrate that can also be obtained by encoding without RC and a certain predefined QPbase (see “Known Technology” section). More specifically, encodings of several video sequences with, e.g., QPbase=22, 27, 32, 37 may be prepared, the resulting bitrates may be collected, and comparative RC encodings may then be requested with the collected sequence-wise “fixed-QP” bitrates employed as “variable-QP” RC target rates. The closer the RC encoding results match the above-noted corresponding fixed-QPbase encodings in terms of objective metrics or subjective MOS data (see also the “Introduction” section), the better. It was observed, however, that in the known technology of [1] and [3]-[6], the RC encodings, although matching the fixed-QPbase encodings quite well in terms of bitrate, are inferior in PSNR, XPSNR [7], SSIM or MOS, thus indicating room for improvements.
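The bitrate-matching criterion used in this assessment can be expressed as a per-operating-point relative deviation. The helper below is an illustrative sketch (names hypothetical); deviations within ±0.01 correspond to the ±1% tolerance mentioned earlier.

```python
FIXED_QPS = (22, 27, 32, 37)  # typical fixed-QPbase operating points

def rate_match_errors(anchor_bitrates, rc_bitrates):
    """Relative bitrate deviation of each RC encoding from the
    corresponding fixed-QP anchor encoding, one value per QPbase."""
    return [abs(r - a) / a for a, r in zip(anchor_bitrates, rc_bitrates)]
```

The quality comparison (PSNR, XPSNR, SSIM, MOS) would then be carried out between encodings whose deviations are all within the tolerance.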
Therefore, it is desired to provide concepts for rendering picture coding and/or video coding more efficient. The objective is to improve rate control algorithms, so that a peak signal-to-noise ratio (PSNR), an extended perceptually weighted peak signal-to-noise ratio (XPSNR), a structural similarity index measure (SSIM) and/or a mean opinion score (MOS) is improved. It is desired to reduce a bit stream size and thus a signalization cost.
An embodiment may have an apparatus for encoding a video having a sequence of frames using rate control, configured to determine a global quantization parameter for the sequence of frames based on a target bit-rate; perform a coding pass, coding the sequence of frames, using the global quantization parameter by determining a frame quantization parameter per frame of the sequence of frames on the basis of the global quantization parameter, and subjecting the sequence of frames to R/D optimizing encoding by using, for each frame, the frame quantization parameter determined for the respective frame so as to obtain an encoded version of an associated coding size for the respective frame.
Another embodiment may have an apparatus for detecting a scene transition in a sequence of frames, configured to determine, for each frame of the sequence of frames, a visual activity measure; and detect the scene transition based on the visual activity measure.
According to another embodiment, a method for encoding a video having a sequence of frames using rate control may have the steps of: determining a global quantization parameter for the sequence of frames based on a target bit-rate; performing a coding pass, coding the sequence of frames, using the global quantization parameter by determining a frame quantization parameter per frame of the sequence of frames on the basis of the global quantization parameter, and subjecting the sequence of frames to R/D optimizing encoding by using, for each frame, the frame quantization parameter determined for the respective frame so as to obtain an encoded version of an associated coding size for the respective frame.
According to another embodiment, a method for detecting a scene transition in a sequence of frames may have the steps of: determining, for each frame of the sequence of frames, a visual activity measure; and detecting the scene transition based on the visual activity measure.
Still another embodiment may have a non-transitory digital storage medium having stored thereon a computer program for performing a method for encoding a video having a sequence of frames using rate control, having the steps of: determining a global quantization parameter for the sequence of frames based on a target bit-rate; performing a coding pass, coding the sequence of frames, using the global quantization parameter by determining a frame quantization parameter per frame of the sequence of frames on the basis of the global quantization parameter, and subjecting the sequence of frames to R/D optimizing encoding by using, for each frame, the frame quantization parameter determined for the respective frame so as to obtain an encoded version of an associated coding size for the respective frame, when the computer program is run by a computer.
Another embodiment may have a non-transitory digital storage medium having stored thereon a computer program for performing a method for detecting a scene transition in a sequence of frames, having the steps of: determining, for each frame of the sequence of frames, a visual activity measure; and detecting the scene transition based on the visual activity measure, when the computer program is run by a computer.
Another embodiment may have a data stream generated by an inventive apparatus as mentioned above.
In accordance with a first aspect of the present invention, the inventors of the present application realized that one problem encountered when trying to use conventional RC systems stems from the fact that they match the fixed-QPbase encodings quite well in terms of bitrate, but are inferior in objective or subjective performance, like PSNR, XPSNR, SSIM or MOS. According to the first aspect of the present application, this difficulty is overcome by using an adaptive quantization parameter (QP). The inventors found that it is advantageous to adapt an ‘average’ quantization parameter for an RC coding, e.g., dependent on a target bit-rate and/or dependent on dimensions, e.g., width and height, of the frames coded with rate control, instead of using a constant/fixed ‘average’ quantization parameter for all RC codings. The ‘average’ quantization parameter, i.e. a global quantization parameter, may represent an average of all frame-wise quantization parameters of a sequence of frames. It is proposed to use an adaptive quantization parameter to respond to different requirements, e.g., in terms of a target bit rate and/or in terms of the dimensions of the frames to be coded, at an RC coding. This is based on the idea that, besides maintaining a very low functional complexity, a slight speedup in the overall runtime of an RC coding can be achieved with the adaptive quantization parameter. Furthermore, an improvement of an objective and/or subjective performance of RC coding can be achieved.
Accordingly, in accordance with a first aspect of the present application, an apparatus for encoding a video having a sequence of frames using rate control (RC) is configured to determine a global quantization parameter for the sequence of frames based on a target bit-rate. The target bit-rate may indicate the target number of bits per second to be consumed by the sequence of frames. Additionally, the apparatus is configured to perform a coding pass, e.g., a first coding pass, e.g., an analysis coding pass, coding the sequence of frames, using the global quantization parameter. The coding pass may be performed by determining a frame quantization parameter per frame of the sequence of frames on the basis of the global quantization parameter. The global quantization parameter, for example, represents an average of all frame quantization parameters of the sequence of frames. Optionally, the global quantization parameter may be determined for two or more sequences of frames. Additionally, the coding pass may be performed by subjecting the sequence of frames to R/D, i.e. rate/distortion, optimizing encoding, e.g. without RC, by using, for each frame, the frame quantization parameter determined for the respective frame so as to obtain an encoded version of an associated coding size for the respective frame. In case the coding pass is performed without rate control, the rate control may be performed in a further coding pass coding the sequence of frames. The further coding pass, e.g. a second coding pass, may follow the coding pass, e.g., a first coding pass.
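One way such a global QP could be derived from the target bitrate and the frame dimensions is via a bits-per-pixel heuristic. The sketch below is a hypothetical illustration of the first aspect, not the claimed derivation; the constants `c` and `qp_ref` are assumptions chosen for the example.

```python
import math

def global_qp(target_bps, width, height, frame_rate, c=0.05, qp_ref=32):
    """Estimate an 'average' (global) QP for a sequence from the target
    bitrate and the frame dimensions. c and qp_ref are illustrative."""
    bpp = target_bps / (width * height * frame_rate)  # bits per pixel
    # In hybrid coders, the rate roughly halves per +6 QP, hence a log2 law.
    qp = qp_ref - 3.0 * math.log2(bpp / c)
    return max(0, min(63, round(qp)))  # clip to the VVC QP range 0..63
```

Doubling the target bitrate at fixed dimensions then lowers the global QP by about 3, consistent with the rate-QP behavior of hybrid coders.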
In accordance with an aspect of the present invention, the inventors of the present application realized that one problem encountered when trying to use conventional RC systems stems from the fact that they are often based on a so-called λ-domain paradigm. According to this aspect of the present application, this difficulty is overcome by estimating parameters for a second coding pass based on parameters of a first coding pass. The inventors found that highly relevant compression statistics can already be obtained in a first coding pass. Therefore, it is proposed to determine, for a given frame of a sequence of frames, a second-pass quantization parameter, e.g., a further quantization parameter or a further-pass quantization parameter, based on one or more quantization parameters collected in a different RC pass, e.g., in a previous coding pass like a first coding pass, for the sequence of frames. For example, it is advantageous to determine, for a given frame of a sequence of frames, a second-pass quantization parameter based on a quantization parameter collected for the same frame in a different RC pass. Optionally, it might be advantageous to additionally base the determination of the second-pass quantization parameter of the given frame on one or more quantization parameters collected during the same pass, i.e. the second pass, for previously coded frames of the sequence of frames. The concept of considering one or more parameters of a previous coding pass for a current coding pass is based on the idea that this achieves an improvement of an RC accuracy and, thereby, stability. Furthermore, this concept exhibits a lower model complexity and/or higher objective or subjective performance.
Accordingly, in accordance with this aspect of the present application, an apparatus, e.g., the apparatus of the first aspect, for encoding a video having a sequence of frames using rate control is configured to perform a coding pass and a further coding pass coding the sequence of frames. The coding pass may be performed by determining a frame quantization parameter per frame of the sequence of frames, and by subjecting the sequence of frames to R/D optimizing encoding by using, for each frame, the frame quantization parameter determined for the respective frame so as to obtain an encoded version of an associated coding size for the respective frame. The further coding pass is performed by determining, e.g., using the herein described fourth aspect for adapting a determination function, for each frame of the sequence of frames, a further frame quantization parameter based on the frame quantization parameter determined for the respective frame in the coding pass, and based on the coding size of the respective frame obtained by the coding pass. Additionally, the further coding pass is performed by subjecting the sequence of frames to a further R/D optimizing encoding by using, for each frame, the further frame quantization parameter determined for the respective frame, thereby obtaining a coded data stream having the video encoded thereinto. The apparatus may comprise any feature and/or functionality, which is described with one or more of the other herein described apparatuses for encoding a video using RC.
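A minimal sketch of how a second-pass frame QP could be derived from the first-pass QP and the observed versus targeted frame sizes is given below. This is an assumed simplification for illustration, relying on the common ~6-QP-per-rate-doubling rule of thumb rather than the specific model of the embodiment.

```python
import math

def second_pass_qp(qp1, bits1, target_bits, qp_min=0, qp_max=63):
    """Derive a second-pass frame QP from the first-pass frame QP (qp1)
    and the first-pass coding size (bits1), given the target size for
    the same frame in the second pass."""
    # If the frame needs to shrink (bits1 > target_bits), raise the QP;
    # rate roughly halves per +6 QP in hybrid coders.
    qp2 = qp1 + 6.0 * math.log2(bits1 / target_bits)
    return max(qp_min, min(qp_max, round(qp2)))
```

For example, halving the bit budget relative to the first-pass size raises the frame QP by about 6, while doubling it lowers the QP by about 6.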
In accordance with a second aspect of the present invention, the inventors of the present application realized that one problem encountered when trying to use RC systems stems from the fact that they might be suitable for typical video material, but might not be suitable for boundary-case video sequences with unusual content statistics. According to the second aspect of the present application, this difficulty is overcome by considering a visual activity measure of a frame at a coding of the frame. The inventors found that the visual activity measure may indicate efficiently and with high accuracy whether the respective frame has unusual content statistics, i.e., whether the frame may be associated with a high level of camera noise or film grain, a strong fine or coarse texture, a highly irregular motion, and/or a chromatic aberration. It is proposed to determine a quantization parameter for the respective frame dependent on the visual activity measure. This is based on the idea that an accuracy of the quantization parameters used for coding the respective frame can be improved and that, thereby, a deviation in the bit consumption after a final coding of the respective frame from a target bit-count allocated by the RC to that particular frame may be reduced. The visual activity measure may improve a prediction of a quantization parameter of a frame, since it corrects deficiencies in the prediction/determination occurring at frames associated with unusual content statistics. Furthermore, an improvement of an objective and/or subjective performance of RC coding can be achieved.
Accordingly, in accordance with a second aspect of the present application, an apparatus for encoding a video having a sequence of frames using rate control is configured to perform a coding pass, e.g., a first coding pass or an analysis coding pass, coding the sequence of frames, by determining a frame quantization parameter per frame of the sequence of frames, and by subjecting the sequence of frames to R/D, i.e. rate/distortion, optimizing encoding by using, for each frame, the frame quantization parameter determined for the respective frame so as to obtain an encoded version of an associated coding size for the respective frame. Additionally, the apparatus is configured to perform a further coding pass by determining, e.g., using the herein described fourth aspect for adapting a determination function, for each frame of the sequence of frames, dependent on a visual activity measure of the respective frame, a further frame quantization parameter. The determination of the further frame quantization parameter is performed based on the frame quantization parameter determined for the respective frame, e.g., in the coding pass, and based on the coding size of the respective frame obtained by the coding pass. Additionally, the apparatus is configured to perform the further coding pass by subjecting the sequence of frames to a further R/D optimizing encoding, e.g., with single loop RC which varies block QP, e.g., a delta QP; or a log rate variation along the sequentially coded frames so as to adapt the target bitrate (BR), by using, for each frame, the further frame quantization parameter determined for the respective frame, thereby obtaining a coded data stream having the video encoded thereinto.
In accordance with a third aspect of the present invention, the inventors of the present application realized that one problem encountered when trying to use RC systems stems from the fact that a typical video sequence contains relatively frequent scene changes or camera switches after which the local characteristics of the sequence usually change. According to the third aspect of the present application, this difficulty is overcome by detecting a scene transition, like a scene change or camera switch, in a sequence of frames. The inventors found that the visual activity measure varies between two consecutive frames at a scene transition. Therefore, a scene transition can be detected efficiently and with great accuracy based on the visual activity measure. A correct detection of local changes of video content characteristics can improve a control of a coding quality and of a bitrate. Furthermore, an improvement of an objective and/or subjective performance of RC coding can be achieved, if the detection of the scene transition is implemented in one of the herein discussed apparatuses for encoding a video using RC.
Accordingly, in accordance with a third aspect of the present application, an apparatus for detecting a scene transition, e.g., a scene change or a camera switch, in a sequence of frames is configured to determine, for each frame of the sequence of frames, a visual activity measure and detect the scene transition based on the visual activity measure.
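The detection principle of the third aspect can be sketched as follows. The sketch below is a hypothetical illustration: it flags a scene transition wherever the per-frame visual activity measure jumps by more than an assumed ratio threshold relative to the preceding frame; the ratio test and the threshold value are assumptions for the example, not the claimed detector.

```python
def detect_scene_transitions(activity, threshold=2.0):
    """Return the indices of frames at which the visual activity measure
    changes by more than `threshold` (as a ratio) versus the previous
    frame, indicating a scene change or camera switch."""
    transitions = []
    for i in range(1, len(activity)):
        prev, cur = activity[i - 1], activity[i]
        # Symmetric ratio: triggers on both activity jumps and drops.
        ratio = max(prev, cur) / max(min(prev, cur), 1e-9)
        if ratio > threshold:
            transitions.append(i)
    return transitions
```

A symmetric ratio is used so that both a sudden increase (e.g., a cut to a noisy scene) and a sudden decrease of activity are detected.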
In accordance with a fourth aspect of the present invention, the inventors of the present application realized that one problem encountered when trying to use RC systems stems from the fact that a typical video sequence contains relatively frequent scene changes or camera switches after which the local characteristics of the sequence usually change. According to the fourth aspect of the present application, this difficulty is overcome by adapting a rate control parameter for each scene individually. The inventors found that statistical models which were used in previous scenes may no longer be valid for a new scene. It is proposed to set the rate control parameter to a predetermined setting for a first frame to be encoded of a scene and adapt the rate control parameter for further frames to be encoded of the same scene, e.g., dependent on a temporal hierarchy level they are associated with. This is based on the idea that it is more efficient to consider and adapt the rate control parameter for each scene of the video individually, instead of using a rate control parameter associated with another scene of the video and gradually adapting same to the local characteristics of the new scene of the video. Furthermore, an improvement of an overall encoding performance and/or an objective performance and/or a subjective performance of the RC coding can be achieved.
Accordingly, in accordance with a fourth aspect of the present application, an apparatus for encoding a video using rate control is configured to detect scene transitions between a number of scenes in a sequence of frames, so that each frame of the sequence of frames is associated with a scene of the number of scenes, e.g., using the scene detection of the third aspect. Additionally, the apparatus is configured to encode the sequence of frames using rate control by adapting, separately for each scene, a rate control parameter for a frame of the respective scene depending on a characteristic of an encoded version of frames which precede the frame in coding order and are associated with the respective scene, and by setting, for each scene, the rate control parameter to a predetermined setting for a firstly encountered frame of the respective scene, e.g., firstly encountered in encoding order. For example, the predetermined setting might be a default setting, or might be an estimated setting estimated based on one or more frames associated with the respective scene such as based on an analysis of the visual activity measure of these frames (frames of the same scene, e.g., following the respective frame in encoding order); the predetermined setting might be determined depending on a frame coding type and/or a temporal hierarchy level, e.g., temporal layer, of the respective frame.
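The per-scene reset of the fourth aspect can be sketched as follows. In this hypothetical illustration, the rate control parameter is a frame QP that is reset to a predetermined default at every detected scene start and then adapted within the scene; the simple per-frame decrement stands in for the actual within-scene model update and is an assumption for the example.

```python
def per_scene_qps(num_frames, scene_starts, default_qp=32, step=-1):
    """Reset the frame QP to a predetermined setting at each scene start
    and adapt it for subsequent frames of the same scene (illustrative)."""
    qps, qp = [], default_qp
    for i in range(num_frames):
        if i == 0 or i in scene_starts:
            qp = default_qp  # predetermined setting at a scene start
        qps.append(qp)
        qp += step  # within-scene adaptation (stand-in for the RC model)
    return qps
```

In a full encoder, `scene_starts` would come from a detector such as the visual-activity-based one of the third aspect, and the within-scene adaptation would depend on the coding sizes of preceding frames of the same scene.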
Embodiments are related to methods which are based on the same considerations as the above-described apparatuses. The methods may, however, be supplemented with all features and functionalities that are also described with regard to the apparatuses.
An embodiment is related to a data stream having a picture or a video encoded thereinto by an apparatus for encoding a video. Another embodiment is related to a data stream having a picture or a video encoded thereinto using a herein described method for encoding the video.
An embodiment is related to a computer program having a program code for performing a herein described method when being executed on a computer.
The drawings are not necessarily to scale, emphasis instead generally being placed upon illustrating the principles of the invention. In the following description, various embodiments of the invention are described with reference to the following drawings, in which:
Equal or equivalent elements or elements with equal or equivalent functionality are denoted in the following description by equal or equivalent reference numerals even if occurring in different figures.
In the following description, a plurality of details is set forth to provide a more thorough explanation of embodiments of the present invention. However, it will be apparent to those skilled in the art that embodiments of the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form rather than in detail in order to avoid obscuring embodiments of the present invention. In addition, features of the different embodiments described hereinafter may be combined with each other, unless specifically noted otherwise.
In the following, various examples are described which may assist in achieving a more effective compression and an improved encoding performance when using RC encoding. The RC encoding may be added to other encoding modes heuristically designed, for instance, or may be provided exclusively.
In order to ease the understanding of the following examples of the present application, the description starts with a presentation of possible encoders and decoders fitting thereto into which the subsequently outlined examples of the present application could be built.
As mentioned, encoder 14 performs the encoding in a block-wise manner or on a block basis. To this end, encoder 14 subdivides picture 10 into blocks, in units of which encoder 14 encodes picture 10 into datastream 12. Generally, the subdivision may end up in blocks 18 of constant size, such as an array of blocks arranged in rows and columns, or in blocks 18 of different block sizes, such as by use of a hierarchical multi-tree subdivisioning starting from the whole picture area of picture 10 or from a pre-partitioning of picture 10 into an array of tree blocks, wherein these examples shall not be treated as excluding other possible ways of subdivisioning picture 10 into blocks 18.
Further, encoder 14 is a predictive encoder configured to predictively encode picture 10 into datastream 12. For a certain block 18 this means that encoder 14 determines a prediction signal for block 18 and encodes the prediction residual, i.e. the prediction error at which the prediction signal deviates from the actual picture content within block 18, into datastream 12.
Encoder 14 may support different prediction modes so as to derive the prediction signal for a certain block 18. The prediction modes comprise intra-prediction modes according to which the inner of block 18 is predicted spatially from neighboring, already encoded samples of picture 10. The encoding of picture 10 into datastream 12 and, accordingly, the corresponding decoding procedure, may be based on a certain coding order 20 defined among blocks 18. For instance, the coding order 20 may traverse blocks 18 in a raster scan order, i.e., row-wise from top to bottom, traversing each row from left to right. In case of hierarchical multi-tree based subdivisioning, raster scan ordering may be applied within each hierarchy level, wherein a depth-first traversal order may be applied, i.e. leaf nodes within a block of a certain hierarchy level may precede blocks of the same hierarchy level having the same parent block according to coding order 20. Depending on the coding order 20, neighboring, already encoded samples of a block 18 may be located usually at one or more sides of block 18. For instance, neighboring, already encoded samples of a block 18 are located to the top of, and to the left of block 18.
Intra-prediction modes may not be the only ones supported by encoder 14. In case of encoder 14 being a video encoder, for instance, encoder 14 may also support inter-prediction modes according to which a block 18 is temporally predicted from a previously encoded picture of video 16. Such an inter-prediction mode may be a motion-compensated prediction mode according to which a motion vector is signaled for such a block 18 indicating a relative spatial offset of the portion from which the prediction signal of block 18 is to be derived as a copy. Additionally or alternatively, other non-intra-prediction modes may be available as well such as inter-view prediction modes in case of encoder 14 being a multi-view encoder, or non-predictive modes according to which the inner of block 18 is coded as is, i.e. without any prediction.
As already mentioned above, encoder 14 operates block-based. For the subsequent description, the block basis of interest is the one subdividing picture 10 into blocks for which the intra-prediction mode is selected out of a set or plurality of intra-prediction modes supported by predictor 44 or encoder 14, respectively, and the selected intra-prediction mode performed individually. Other sorts of blocks into which picture 10 is subdivided may, however, exist as well. For instance, the above-mentioned decision whether picture 10 is inter-coded or intra-coded may be done at a granularity or in units of blocks deviating from blocks 18. For instance, the inter/intra mode decision may be performed at a level of coding blocks into which picture 10 is subdivided, and each coding block is subdivided into prediction blocks. Prediction blocks within coding blocks for which it has been decided that intra-prediction is used are each subjected to an intra-prediction mode decision. To this end, for each of these prediction blocks, it is decided as to which supported intra-prediction mode should be used for the respective prediction block. These prediction blocks will form blocks 18 which are of interest here. Prediction blocks within coding blocks associated with inter-prediction would be treated differently by predictor 44. They would be inter-predicted from reference pictures by determining a motion vector and copying the prediction signal for this block from a location in the reference picture pointed to by the motion vector. Another block subdivisioning pertains to the subdivisioning into transform blocks, in units of which the transformations by transformer 32 and inverse transformer 40 are performed. Transform blocks may, for instance, be the result of further subdivisioning coding blocks. Naturally, the examples set out herein should not be treated as being limiting and other examples exist as well.
For the sake of completeness only, it is noted that the subdivisioning into coding blocks may, for instance, use multi-tree subdivisioning, and prediction blocks and/or transform blocks may be obtained by further subdividing coding blocks using multi-tree subdivisioning, as well.
A decoder 54 or apparatus for block-wise decoding fitting to the encoder 14 of
Again, with respect to
The RC encoding described in the following, see
The present invention proposes four aspects to improve the performance of existing RC approaches like [3]:
Each of these aspects will be described, using figures where appropriate, in a separate subsection hereafter.
Aspect 1: Alternative QP/Lambda-from-Rate Estimation
Conventional RC systems, including two-pass variants, are often based on a so-called λ-domain paradigm [1], [8], [9] (λ: Lagrange parameter), initially devised for frame or block-level RC applications. With two-pass RC methods, however, it is possible to obtain highly relevant compression statistics already in a first analysis coding pass, usually configured for faster runtime than the second final coding pass (see, e.g., [3]), making it possible to reach objectively (in e.g. PSNR) or subjectively (in terms of visual coding quality) better RC results.
The apparatus 100 may be configured to determine a global quantization parameter 112, e.g., QP1, for the sequence of frames 101 to 10n based on a target bit-rate 114, e.g., BRT. For example, the determination may be performed using a quantization parameter determination means 110, i.e., a QP determinator. The target bit rate 114 may be chosen/provided by a user of the apparatus 100. It is only optional that the global quantization parameter 112 is adapted/determined based on input information. Alternatively, it is also possible that the apparatus 100 uses a preset/predetermined global quantization parameter 112 instead of determining same.
According to an embodiment, the apparatus 100 may consider, additionally to the target bit-rate 114, also dimensions, like a width and/or height, of the frames 10 of the sequence of frames 101 to 10n at the determination of the global quantization parameter 112. The dimensions of the frames 10 may be measured in pixels.
According to an embodiment, the apparatus 100 is configured to perform the determining of the frame quantization parameter 122 for each frame 10 of the sequence of frames 101 to 10n on the basis of the global quantization parameter 112 depending on a frame coding type and/or a temporal hierarchy level of the respective frame.
The apparatus 100 is configured to perform a coding pass 120, e.g., a first coding pass or an analysis coding pass, coding the sequence of frames 101 to 10n, using the global quantization parameter 112. The coding pass 120 is performed by determining a frame quantization parameter 122, e.g., QP′1, per frame 10 of the sequence of frames 101 to 10n on the basis of the global quantization parameter 112, e.g., using a frame quantization parameter determination means 121, i.e., a frame QP determinator. Thus, each frame 10 is associated with a frame quantization parameter 122, with which the respective frame 10 can be quantized by the apparatus 100. The global quantization parameter 112 may represent an average of all frame quantization parameters 122 associated with the sequence of frames 101 to 10n. Additionally, the coding pass 120 is performed by subjecting the sequence of frames 101 to 10n to a rate/distortion (R/D) optimizing encoding 124, e.g. without rate control, by using, for each frame 10, the frame quantization parameter 122 determined for the respective frame 10 so as to obtain an encoded version 126 of an associated coding size 127, e.g., B′1, for the respective frame 10. The coding size may indicate a number of bits occupied by the encoded version 126 of the respective frame 10, e.g., after the R/D optimized encoding 124.
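The derivation of a per-frame QP from the global QP, depending on frame coding type and temporal hierarchy level, could look as follows. The offsets below are assumptions for illustration, not the values of the embodiment.

```python
def frame_qp(global_qp, temporal_level, is_intra=False):
    """Derive a frame QP from the global QP: intra frames get a lower QP
    (they serve as references), frames on higher temporal hierarchy
    levels a higher one. Offsets are illustrative assumptions."""
    if is_intra:
        return global_qp - 3   # better quality for reference frames
    return global_qp + temporal_level  # coarser QP deeper in the hierarchy
```

With such symmetric offsets around the global QP, the global QP indeed approximates the average of the frame QPs over a hierarchical group of pictures.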
The apparatus 100 may be configured to perform the optional further coding pass 130 by determining, for each frame 10 of the sequence of frames 101 to 10n, the further frame quantization parameter 132, e.g., QPf, based on the frame quantization parameter 122 determined for the respective frame 10 in the coding pass 120, and based on the coding size 127 of the respective frame 10 obtained by the coding pass 120. The determination of the further frame quantization parameter may be performed using a further frame quantization parameter determination means 131, i.e. a further frame QP determinator.
The optional further coding pass 130 may be performed with rate control, for example, allocating a particular number of bits, e.g., Bf, to each frame 10 and estimating therefrom, for each frame 10, coding parameters, like the further frame quantization parameter 132 and/or a Lagrange parameter, for coding the respective frame 10 in the further coding pass 130. The particular number of bits, e.g., Bf, may represent a target coding size of the respective frame. The target coding size may be associated with the target bit rate 114 or the target coding size may be derived from the target bit rate 114. The apparatus may be configured to estimate in the further coding pass 130 the further frame quantization parameter 132 for the respective frame based on the number of bits allocated to the respective frame and, for example, improve or correct this estimate based on the coding size 127 and the frame quantization parameter 122 obtained in the coding pass 120 for the respective frame. The coding size 127 of the respective frame 10 obtained by the coding pass 120 can provide a good estimate on the number of bits needed to encode the respective frame and therefore improve the allocation of the number of bits to the respective frame 10 rendering the rate control more efficient. A relationship between the coding size 127 obtained for the respective frame in the coding pass 120 and a number of bits allocated to the respective frame in the further coding pass 130 may be considered in the determination of the further frame quantization parameter 132 to improve an accuracy of the further frame quantization parameter 132.
The apparatus 100 may allocate, for each frame of the sequence of frames 101 to 10n, the target coding size to the respective frame 10, for example, on the basis of the target bit rate 114, e.g., the target bit rate 114 is corrected based on the coding size 127 of the respective frame (obtained at the coding pass). The apparatus 100 might be configured to perform the allocating by determining a total target coding size BT for the sequence of frames 101 to 10n based on the target bit rate BRT, e.g., BT = BRT·{number of frames of the sequence of frames}/{frames per second}, by determining a sum B1 over the coding sizes 127 of all frames of the sequence of frames 101 to 10n, e.g., B1 = sumf(B′f), and by determining the target coding size Bf for the respective frame 10 based on the total target coding size BT, the sum B1, and the coding size B′f 127 of the respective frame 10 obtained at the first pass 120. For example, the target coding size Bf is derivable according to Bf = B′f·BT/B1, wherein B′f is the coding size 127 of the respective frame obtained by the coding pass 120, B1 = sumf(B′f), and BT = BRT·{number of frames (e.g., of the sequence of frames or of all frames of the video)}/{frames per second}, with BRT being the target bit rate 114.
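The proportional bit allocation just described can be sketched as follows in Python; all function and variable names are illustrative assumptions, not part of the embodiment:

```python
# Hypothetical sketch of the per-frame target bit allocation B_f = B'_f * B_T / B_1.
# All names are illustrative; they do not appear in the embodiment itself.

def allocate_target_sizes(first_pass_sizes, target_bitrate, fps):
    """Scale each first-pass coding size B'_f so that the targets B_f
    sum to the total budget B_T = BR_T * (number of frames) / fps."""
    n = len(first_pass_sizes)
    total_target = target_bitrate * n / fps      # B_T
    first_pass_total = sum(first_pass_sizes)     # B_1 = sum_f B'_f
    return [b * total_target / first_pass_total  # B_f = B'_f * B_T / B_1
            for b in first_pass_sizes]

# Example: 3 frames at 30 fps and a 1 Mbit/s target give B_T = 100000 bits.
targets = allocate_target_sizes([40000, 20000, 40000], 1_000_000, 30)
```

Frames that consumed more bits in the first pass receive proportionally larger targets, which reflects the proportional scaling in the derivation above.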
According to an embodiment, the target coding size, e.g., Bf, allocated to the respective frame corresponds to a number of bits, e.g., a bit-count, allocated to the respective frame, e.g., as a target bit consumption for an encoded version 136 of the respective frame obtainable in the further coding pass 130.
The apparatus 100 may be configured to perform the optional further coding pass 130 by subjecting the sequence of frames 101 to 10n to a further R/D optimizing encoding 134 by using, for each frame 10, the further frame quantization parameter 132 determined for the respective frame 10, thereby obtaining a coded data stream 12 having the video 16 encoded thereinto.
According to an embodiment, the further R/D optimizing encoding 134 can be performed with single loop RC which varies block QP, e.g., delta QP, so that each frame is encoded according to the target bit rate 114. The respective frame 10 may be divided into blocks 18 and the apparatus 100 may be configured to determine, for each block 18, a respective block quantization parameter based on the further frame quantization parameter 132. The further frame quantization parameter may indicate a set of block quantization parameters, out of which the respective block quantization parameter for the respective block 18 is selected by the apparatus 100. The further frame quantization parameter 132 may represent a frame global quantization parameter. The further frame quantization parameter 132 may represent an average of the set of block quantization parameters. The data stream 12 may comprise the further frame quantization parameter 132 and may indicate, for each block 18, the respective block quantization parameter relative to the further frame quantization parameter 132, e.g., as a delta quantization parameter.
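The relative block QP signaling described above can be illustrated with a minimal sketch; the helper names are assumptions for illustration only:

```python
# Hypothetical sketch of delta-QP signaling: block QPs are written to the
# data stream relative to the frame-global further frame QP.

def to_delta_qps(frame_qp, block_qps):
    """Encoder side: express each block QP as a delta to the frame QP."""
    return [bq - frame_qp for bq in block_qps]

def from_delta_qps(frame_qp, delta_qps):
    """Decoder side: reconstruct the block QPs from the signaled deltas."""
    return [frame_qp + d for d in delta_qps]

deltas = to_delta_qps(32, [30, 32, 35])
restored = from_delta_qps(32, deltas)
```

Signaling only the deltas keeps the per-block side information small when most blocks stay close to the frame-global QP.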
According to an alternative embodiment, the further R/D optimizing encoding 134 can be performed by performing a slow rate variation along the sequentially coded frames so as to adapt the target bitrate, i.e., a target frame coding size. This adaptation might be performed dependent on the obtained coding sizes of the frames already encoded in the further R/D optimizing encoding 134, so that the target bit rate 114 is achieved for the sequence of frames 101 to 10n.
According to an embodiment, the apparatus 100 may determine, for each frame, the respective further frame quantization parameter 132 dependent on a visual activity measure 140 associated with the respective frame 10, as will be described in more detail with regard to
Summary of Inventive Aspect 1
An alternative to the λ-domain and, possibly, other published RC statistical models, exhibiting lower model complexity and/or higher objective or subjective performance, can be described in two paragraphs as follows.
First, contrary to known technology such as, e.g., [3], where coding of a video sequence in the first RC pass of a multi-pass RC scheme is performed with a fixed predetermined global quantization parameter (QP), it is proposed to utilize a video input adaptive first-pass global QP1 112 whose value is determined based on the user specified target bit-rate BRT 114 (e.g. in bits per second) and/or video dimensions, e.g., W·H (product of video width and height). The advantage of this approach, besides maintaining a very low functional complexity, is a slight speedup in the overall runtime of said multi-pass RC scheme, measured as the sum of the runtimes of all RC passes.
Second, having made use of an improved signal adaptive determination of the first-pass QP1 value, i.e., the global quantization parameter 112, it is also possible to improve the estimation of the “slice” QPf, i.e., the further frame quantization parameter 132, and/or λf values, e.g., a further frame Lagrange parameter, used for coding of a specific frame f 10 in the second RC pass, i.e., the further coding pass 130, from an associated bit-count. Generally speaking, the RC method determines and allocates a particular number of bits, or bit-count Bf, to each frame to be compressed in the final RC pass (i.e., the actual coding run 120 in single-pass RC or the second coding run 130 in two-pass RC) and then estimates QPf and λf from Bf such that, ideally, the bit consumption resulting from coding the frame 10 with these parameters matches Bf, i.e., the target number of bits, i.e., the target coding size, for the respective frame. Instead of employing only indirect statistics as in the known technology to improve the accuracy of such “QPf, λf from Bf” estimates, which improve only relatively slowly over time (i.e., as more frames are encoded), it is proposed to make direct use of the 1st-pass frame-wise QP′f 122, λ′f, and/or associated resulting B′f values, i.e., the coding size 127 (which depend solely on QP1 112 and some other constant encoder configuration parameters), to estimate QPf 132 and/or λf. In other words, it is proposed to determine, for a given frame f 10, QPf 132 and/or λf not only from QP, λ results collected while coding previous frames 10 of the video sequence 16 in the same RC pass, e.g., in the further coding pass 130, but also from the first-pass results obtained for the same frame f 10.
The improved input/BRT adaptive derivation of the first-pass “average” QP1, i.e. the global quantization parameter 112, is advantageously realized as follows, i.e. with a determination function:
QP1 = max(QPmin, QPmax − round(√(d·BRT/1000000))),  (1)
where BRT, i.e., the target bit rate 114, is the user specified target number of bits per second (thus converted to Mbit/s in the equation), QPmin=17 or 7 and QPmax=40 are developer specified constants (lower and upper QP limits which may vary with the compression technology, e.g., QPmin may indicate a minimum quantization parameter and QPmax may indicate a maximum quantization parameter), and d=(3840·2160)/(W·H) is the video input dimension dependency of the inventive method. Note that all numerical values above were chosen for VVC [2] and their choice may vary.
In equation (1) above, the term d/1000000 may be regarded as a constant, e.g., z, since the width and height of the frames 10 of the sequence of frames 101 to 10n normally do not change within the sequence. However, the constant may have a different value for a different sequence of frames with frames of another dimension. In other words, the constant z may represent a ratio of a dimension parameter d, which is dependent on the width and the height of the frames of the sequence of frames, divided by 1000000.
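Equation (1) may be sketched as follows, using the VVC-oriented constants stated above (QPmin = 17, QPmax = 40); the function name is an illustrative assumption:

```python
import math

QP_MIN, QP_MAX = 17, 40  # developer specified limits from the text (for VVC)

def first_pass_qp(target_bitrate_bps, width, height):
    """Equation (1): input/bit-rate adaptive first-pass global QP1."""
    d = (3840 * 2160) / (width * height)  # video dimension dependency
    return max(QP_MIN,
               QP_MAX - round(math.sqrt(d * target_bitrate_bps / 1_000_000)))

# 4K at 9 Mbit/s: d = 1, sqrt(9) = 3, so QP1 = 40 - 3 = 37.
qp1_4k = first_pass_qp(9_000_000, 3840, 2160)
# 1080p at 4 Mbit/s: d = 4, sqrt(16) = 4, so QP1 = 40 - 4 = 36.
qp1_hd = first_pass_qp(4_000_000, 1920, 1080)
```

Note how the dimension dependency d makes the same target bitrate map to a lower (finer) QP1 for smaller frames, since fewer samples share the bit budget.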
The alternative “QPf, λf from Bf” estimate is, advantageously, realized, e.g., with a further determination function, using the associated different-pass data as
QPf = Clip3(QPmin, QPmax, round(QP′f − a·log2(Bf/B′f)))  (2)
or
QPf = Clip3(QPmin, QPmax, round(QP′f + a·log2(B′f/Bf))),  (3)
where limits QPmin and QPmax may, but don't need to, be chosen as above, and Clip3( . . . ) enforces the range QPmin≤QPf≤QPmax on the “slice” QP value 132 used for final coding of the given frame f 10. It was found that, for random-access coding, a = 0.1367·QP′f is a good choice, which, however, is coding technology dependent. The further determination function, i.e., equation (2) or (3), depends on a deviation, e.g., a ratio, between the target coding size, e.g., Bf, of the respective frame and the coding size 127, e.g., B′f, of the respective frame obtained by the coding pass 120.
The term Bf/B′f may represent a deviation, e.g., a ratio, between a target coding size Bf of the respective frame 10 and the coding size B′f 127 of the respective frame 10 obtained by the coding pass 120. The further frame quantization parameter 132 for the respective frame 10 may be determined dependent on the respective deviation such that the respective further frame quantization parameter 132 is associated with a coarser quantization than the respective frame quantization parameter 122 in case of the respective coding size 127 being larger than the respective target coding size Bf and the respective further frame quantization parameter 132 is associated with a finer quantization than the respective frame quantization parameter 122 in case of the respective coding size 127 being smaller than the target coding size Bf.
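Equation (2) and the qualitative behavior just described can be sketched as follows (names assumed for illustration; a = 0.1367·QP′f as stated above for random-access coding):

```python
import math

def clip3(lo, hi, x):
    return max(lo, min(hi, x))

def second_pass_qp(qp_first, b_first, b_target, qp_min=17, qp_max=40):
    """Equation (2): estimate QP_f from the first-pass QP'_f, the
    first-pass size B'_f, and the allocated target size B_f."""
    a = 0.1367 * qp_first  # choice stated in the text for random-access coding
    return clip3(qp_min, qp_max,
                 round(qp_first - a * math.log2(b_target / b_first)))

# First pass spent twice the target: quantization becomes coarser (QP rises).
qp_coarser = second_pass_qp(32, b_first=20000, b_target=10000)
# First pass spent half the target: quantization becomes finer (QP drops).
qp_finer = second_pass_qp(32, b_first=10000, b_target=20000)
```

The sign of the log2 term realizes exactly the coarser/finer behavior described in the paragraph above.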
According to an embodiment, the apparatus 100 may be configured to adapt the further determination function, e.g., during the further coding pass 130, according to a previously encoded frame or according to previously encoded frames, e.g., such as via cf adaptation, as will be described in more detail under aspect two in the following. The cf adaptation may be performed depending on a ratio between an even further coding size, e.g., B″f, resulting from the further RD optimizing encoding 134 of the respective frame on the one hand and the target coding size, e.g., Bf, or the coding size 127, e.g., B′f, of the respective frame on the other hand.
According to an embodiment, the apparatus 100 may be configured to determine during the coding pass 120 additionally a frame Lagrange parameter λ′f for each frame of the sequence of frames 101 to 10n on the basis of the global quantization parameter 112 or on the basis of the respective frame quantization parameter 122 determined for the respective frame. The apparatus 100 may be configured to perform the R/D optimizing encoding 124 of the respective frame 10 further using the respective frame Lagrange parameter λ′f determined for the respective frame.
According to an embodiment, the apparatus 100 may be configured to determine during the further coding pass 130 a further frame Lagrange parameter λf for each frame of the sequence of frames 101 to 10n based on the respective further frame quantization parameter 132 determined for the respective frame 10, and to perform the further R/D optimizing encoding 134 of the respective frame 10 further using the respective further frame Lagrange parameter λf determined for the respective frame 10. Optionally, the apparatus 100 may additionally to the respective further frame quantization parameter 132 also consider the respective frame quantization parameter 122 and the respective Lagrange parameter λ′f determined for the respective frame 10 at the determination of the respective further frame Lagrange parameter λf.
The further frame Lagrange parameter λf can be obtained from same-pass QPf 132 and different-pass QP′f 122 and λ′f as follows, e.g., see [12]:
λf = λ′f · 2^((QPf − QP′f)/3).  (4)
Note that the above advantageous embodiment may be combined with prior-art temporal RC model updates or improvements using the coding results of previous same-pass frames, as already indicated earlier [1], [8], [9]. Furthermore, the base-2 logarithm log2 in (2) and/or (3) may easily be replaced by a logarithm with a different base (e.g., 10) as long as the numerical constant in a (here 0.1367) is adapted accordingly (e.g., 0.454 with log10).
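Equation (4) can be sketched directly; the function name is an illustrative assumption:

```python
# Sketch of equation (4): lambda_f = lambda'_f * 2^((QP_f - QP'_f)/3).
def second_pass_lambda(lam_first, qp_second, qp_first):
    return lam_first * 2 ** ((qp_second - qp_first) / 3)

# A QP increase of 3 doubles lambda; a decrease of 3 halves it.
lam_up = second_pass_lambda(50.0, 35, 32)
lam_down = second_pass_lambda(50.0, 29, 32)
```

The factor 2^(ΔQP/3) reflects the usual convention that a QP step of 3 roughly doubles the quantizer step size.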
Aspect 2: Perceptually Motivated Improvement of Alternative Estimation
The “QP, λ from target bit-count” estimator, e.g., the apparatus 100 shown in
The apparatus 100 shown in
According to an embodiment, the apparatus 100 may be configured to determine, for each frame 10 of the sequence of frames 101 to 10n, the respective visual activity measure 140, e.g., using visual activity measure determination means or visual activity measure determinator.
Furthermore, the apparatus 100 is configured to perform the further coding pass 130 by subjecting the sequence of frames 101 to 10n to a further R/D optimizing encoding 134 by using, for each frame 10, the further frame quantization parameter 132 determined for the respective frame 10, thereby obtaining a coded data stream 12 having the video 16 encoded thereinto.
The apparatus 100 shown in
According to an embodiment, the further coding pass 130 is performed with RC. The apparatus 100 may be configured to allocate, for each frame 10 of the sequence of frames 101 to 10n, a target coding size, e.g. a number of bits Bf or a bit count, to the respective frame, and perform, for each frame 10 of the sequence of frames 101 10n, the determining of the further frame quantization parameter 132, e.g., QPf, further based on the target coding size of the respective frame 10. The target coding size, e.g., is associated with a target bit rate 114 for the sequence of frames 101 to 10n. For example, the target coding size may represent a target bit consumption for an encoded version 136 of the respective frame 10 obtainable in the further coding pass 130.
According to an embodiment, the apparatus 100 is configured to correct, for each frame 10 of the sequence of frames 101 to 10n, the respective further frame quantization parameter 132 dependent on the visual activity measure 140 of the respective frame 10. The visual activity measure 140 may indicate how much the further frame quantization parameter 132 has to be corrected. For example, the higher the visual activity measure 140 is, the lower is a correction amount. The visual activity measure 140 may be used to correct an underestimation or an overestimation of a bit consumption of the respective frame. Such a correction may be needed if the apparatus allocates a target coding size to the respective frame which would not be met by the further R/D optimizing encoding of the respective frame. The further frame quantization parameter 132, for example, is modified by a correction amount so as to result in a coarser quantization in case of a further coding size of the respective frame being larger than the target coding size, and in a finer quantization in case of the further coding size of the respective frame being lower than the target coding size. The further coding size of the respective frame 10 may indicate a number of bits occupied by the encoded version 136 of the respective frame 10, e.g., after the further R/D optimized encoding 134. The further frame quantization parameter 132, for example, is modified with the parameter a′ described in equations (6) and (7) in the following. The above described correction may result in corrected versions of the further frame quantization parameter 132. The further frame quantization parameter determination means 131 may be configured to provide the corrected version of the further frame quantization parameter 132 as the further frame quantization parameter 132.
Summary of Inventive Aspect 2
In the following, an extension to the above QP/λ estimator, e.g., the apparatus 100 shown in
A possible corner-case inaccuracy of the “QP, λ from target rate” estimator 100 may manifest itself in two forms:
It was found empirically that the luma-component visual activity measure 140, determined in the low-complexity XPSNR model of the human visual system (details of which are published in [7] and [10]-[12]) and given by
with sf = frame samples, Pf = luma-component input picture of frame f 10, and all other parameters as given in [7], [11], serves as a good indicator of the possibility and particular form of said corner-case inaccuracy. The variable sf may represent a motion picture signal and may be associated with the input Pf. The constant amin is a minimal visual activity measure, W is a width, e.g., in pixels, of the respective frame f, H is a height, e.g., in pixels, of the respective frame f, and [x, y] are horizontal and vertical sample coordinates. Note that the spatial high-pass filtering towards the calculation of the visual activity value VAY 140 is indicated by the hs signal while the temporal high-pass filtering part of the visual activity 140 is indicated by the ht signal.
With VAY 140, it is possible to improve the prediction accuracy of eq. (2) or (3) simply by augmenting the definition of its parameter a to include a dependency on equation (5). In other words, it is proposed to make equations (2) or (3) depend on VAY 140.
Extending equation (2) and/or (3) by a dependency on VAY 140 is, advantageously, done as follows, with constant μVA=mean VAY value:
QPf = Clip3(QPmin, QPmax, round(QP′f − a′·log2(Bf/B′f))), with a′ = (0.1367 ± (VAY,f …))·QP′f  (6)
or
QPf = Clip3(QPmin, QPmax, round(QP′f + a′·log2(B′f/Bf))), with a′ = (0.1367 ± (μVA − VAY,f …))·QP′f  (7)
In other words, parameter a in equation (2) and/or (3) is replaced by a visual activity 140 dependent a′, calculated as the sum of, or difference between, a of (2) or (3), e.g., with value 0.1367·QP′f as above, and some further parameter dependent on VAY,f 140.
Note that a correction factor c can be included as well. This factor is initially equal to 1 but may be adapted (except during a scene/camera switch, where it may be updated to 1, see below), e.g. as follows:
cf+frameDistance(tempLevel) = Clip3(1/4, 4, cf·(Bf/B″f)^b), with b < 1 and, advantageously, b = 1/6, where B″f denotes the bits after the 2nd pass. Adaptation of c is advantageously done separately for each temporal layer 210 tempLevel (one c per tempLevel).
The apparatus 100 may be configured to perform, for each frame 10 of the sequence of frames 101 to 10n, the determining of the further frame quantization parameter 132 according to the further determination function, i.e. equation 6 or 7, of the deviation between the target coding size Bf of the respective frame and the coding size 127, i.e. B′f, and to adapt the further determination function for the respective frame according to a previously encoded frame, e.g., using the correction factor c. The previously encoded frame may correspond to the same temporal hierarchy level as the respective frame, i.e. the frame currently to be encoded.
As can be seen in the equation above, the correction factor c may depend on a deviation between an even further coding size, i.e. B″f, resulting from the further RD optimizing encoding 134 for the previously encoded frame on the one hand and the target coding size Bf or the coding size 127, i.e. B′f, of the previously encoded frame on the other hand. Therefore, the apparatus 100 may be configured to adapt the further determination function for the respective frame dependent on this deviation.
Explanation of cf
Aspect 2 may include the definition of a frame-wise correction factor cf, which is used to adjust the parameter a, or a′, over time based on previously coded frames. For best adjustment performance, these previously coded frames are restricted to belong to the same temporal level and the same scene as the frame currently to be encoded:
cf+frameDistance(tempLevel) = Clip3(1/4, 4, cf·(Bf/B″f)^b), with 0 < b < 1.
In other words, given a frame f associated with tempLevel (i.e., belonging to the temporal hierarchy layer 210 defined by tempLevel), a previously defined correction factor cf associated with f, and data obtained by analysis and/or encoding (here, using cf) of the motion picture associated with f (here, Bf and B″f as defined before), the above equation allows deriving the correction factor to be used during encoding of the next frame associated with tempLevel. To conclude, if we further enforce that correction factor c is reset, e.g., to a value of 1, before encoding the first frame associated with tempLevel at or after a scene change/camera switch, said equation specifies a “same temporal level” and “same scene” constrained correction factor, as desired.
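The constrained update rule above can be sketched as follows; the names and the per-level bookkeeping are illustrative assumptions:

```python
# Hypothetical sketch of the "same temporal level", "same scene" constrained
# correction factor update: c_next = Clip3(1/4, 4, c_f * (B_f / B''_f)^b).

def clip3(lo, hi, x):
    return max(lo, min(hi, x))

def update_correction(c_prev, b_target, b_second_pass, b=1/6):
    """Update c from the target size B_f and the 2nd-pass size B''_f."""
    return clip3(0.25, 4.0, c_prev * (b_target / b_second_pass) ** b)

# One correction factor per temporal level, reset to 1 at a scene change.
correction = {level: 1.0 for level in range(5)}
correction[2] = update_correction(correction[2], b_target=10000,
                                  b_second_pass=20000)  # overshoot -> c < 1
```

The exponent b < 1 damps the update so that a single outlier frame cannot swing c far from its current value, and Clip3 bounds the cumulative drift.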
Aspect 3: Improved Detection of Scene Changes or Camera Switches
A typical video sequence 16 (e.g., a movie) contains relatively frequent scene changes or camera switches, i.e., scene transitions 152, after which the local characteristics of the sequence usually change. Differences in the content throughout the video sequence 16 can affect the encoding performance, especially if RC is being used. Hence, it is crucial for a practical video encoder to correctly detect 150 the local changes of video content characteristics and take the appropriate measures to control the coding quality and bitrate. A robust detector of this kind is outlined hereafter.
According to an embodiment, the apparatus 200 is configured to detect the scene transition 152 at a current frame, e.g., f, based on the visual activity measure 140, e.g., VAY
According to an embodiment, one of the herein described apparatuses 100, see
Summary of Inventive Aspect 3
When an abrupt scene change or camera switch, i.e., a scene transition 152, occurs in a video sequence, the values of the visual activity 140 described earlier for two consecutive frames 105 and 106 (in display order) would typically differ notably. Let us consider an example where an abrupt scene change 152 happens at frame f, e.g., 106. As indicated by ht in equation (5), to calculate the visual activity 140 of frame f, e.g., 106, frame f−1, e.g., 105, from a different scene is used. However, when calculating the visual activity 140 of frame f−1, e.g., 105, frame f−2, e.g., 104, is used, which belongs to the same scene as frame f−1, e.g., 105. For these reasons, it is expected that VAY,f of frame f differs notably from VAY,f−1 of frame f−1.
The ratio rVA may represent a quotient of a division between a dividend derived by the visual activity measure 140, e.g., VAY
It should be noted that, by definition, VAY
In the advantageous realization, a new scene is detected whenever the following relation is evaluated and found to be true, where the evaluation is performed before final encoding of frame f (and writeout to a bitstream):
VAY,f ≥ t·VAY,f−1  (9)
Furthermore, potential detection of false positives may be suppressed by introducing a minimum duration m > 1 (in units of frames) of each scene, i.e., a minimum distance between two detected scene or camera changes, i.e., between two scene transitions 152. This value can be fixed (e.g., m = 8) or it can depend on the sequence's frame rate (e.g., m = round(fps/4), where fps indicates the number of frames per second). More specifically, when a scene is detected at frame f and, subsequently, evaluation of (9) would cause a new scene to be detected at frame f+i, with 0 < i < m, detection of the latter may be suppressed and subsequent scenes may only start at frames f+i with i ≥ m. In other words, a scene is associated with a predetermined duration, and the apparatus 200 may be configured to suppress a detection of a scene transition 152 at a current frame if the current frame occurs during the predetermined duration. The predetermined duration may correspond to a minimum number, e.g., m, of frames.
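The false-positive suppression described above can be sketched as a simple filter over candidate detections (names assumed for illustration):

```python
# Hypothetical sketch: after a scene change is accepted at frame f, any
# candidate within the minimum scene duration m is suppressed.

def filter_scene_changes(candidates, m):
    """candidates: sorted frame indices where relation (9) held."""
    accepted, last = [], None
    for f in candidates:
        if last is None or f - last >= m:
            accepted.append(f)
            last = f
    return accepted

# With m = 8, the candidates at frames 12 and 33 are suppressed.
kept = filter_scene_changes([10, 12, 30, 33, 45], m=8)
```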
Aspect 4: Improved Update of RC Statistical Model after Scene Changes/Camera Switches
In many RC methods, statistical models are used to formulate the relationship between the encoding parameters, e.g., rate control parameters 162, and the actual bitrate [1], [8], [9]. These statistical models are typically updated during the encoding to better adapt to the local characteristics of the encoded sequence 16. However, with the presence of scene changes or camera switches, i.e. scene transitions 152, in the sequence 16, the statistical models which were used in previous scenes 154 may no longer be valid for the new scene 154. Hence, to improve the overall encoding performance, the RC statistical models and their corresponding parameters, e.g., rate control parameters 162, should be updated when the new scene 154 is detected.
The apparatus 100 is configured to detect scene transitions 152, e.g, 1521 to 1523, between a number of scenes 154, e.g., 1541 to 1544, in a sequence 16 of frames, so that each frame 10 of the sequence 16 of frames is associated with a scene 154 of the number of scenes 154.
For this detection the apparatus 100 might comprise the apparatus 200 of
Additionally, the apparatus 100 is configured to encode the sequence 16 of frames using rate control by adapting, separately for each scene 154, a rate control parameter 162, e.g., QPf, a, a′, λ, α or β, for a frame 10 of the respective scene 154 depending on a characteristic of an encoded version 136 of frames which precede the frame in coding order and are associated with the respective scene 154. The characteristic of an encoded version of a frame might represent encoding parameters, like rate control parameters, used for obtaining the encoded version or a visual activity measure determined for the respective frame or output statistics of the encoded version of the frame, like a coding size or its deviation from a target size, e.g., from an allocated number of bits. The adaptation of the rate control parameter may be performed using a rate control parameter adaptation means 160.
Additionally, the apparatus 100 is configured to encode the sequence 16 of frames using rate control by setting, for each scene 154, the rate control parameter 162 to a predetermined setting for a firstly encountered frame of the respective scene 154, e.g., firstly encountered in encoding order. The predetermined setting might be a default setting, or might be an estimated setting estimated based on one or more frames associated with the respective scene such as based on an analysis of the visual activity measure 140 of these frames (frames of the same scene, e.g., following the respective frame in encoding order). The predetermined setting might be determined depending on a frame coding type and/or a temporal hierarchy level 210, e.g., a temporal layer, of the respective frame. The apparatus 100 may be configured to derive the predetermined setting by evaluating one or more frames of the respective scene 154, which one or more frames are already available but not yet encoded, so as to obtain an evaluation result, and by deriving the predetermined setting based on the evaluation result.
According to an embodiment, the apparatus 100 may be configured to perform a coding pass 120 and a further coding pass 130, e.g., as described with regard to
According to an embodiment, the further frame encoding parameter 132 may be determined based on a visual activity measure 140, e.g., as described for the further frame quantization parameter 132 in
Summary of the Inventive Aspect 4
Many RC algorithms use the RC statistical models to select the encoding parameters, such as QP 122, 132 or Lagrange parameter λ, to meet the target rate 114. One of the most commonly used models in RC methods for HEVC and VVC is the R−λ model [5], [8]. In this model, λ is calculated from the target rate 114 (bits per pixel, bpp) using a hyperbolic model:
λ = α·bpp^β,  (10)
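The hyperbolic model (10) and its inverse, used to move between a bit budget and a Lagrange parameter, can be sketched as follows; the α and β values here are illustrative, not prescribed by the text:

```python
# Sketch of the R-lambda model (10): lambda = alpha * bpp^beta.

def lambda_from_bpp(bpp, alpha, beta):
    return alpha * bpp ** beta

def bpp_from_lambda(lam, alpha, beta):
    """Inverse of (10), useful when allocating bits for a desired lambda."""
    return (lam / alpha) ** (1.0 / beta)

lam = lambda_from_bpp(0.05, alpha=3.2, beta=-1.367)  # fewer bpp -> larger lambda
bpp = bpp_from_lambda(lam, alpha=3.2, beta=-1.367)   # round trip recovers bpp
```

Since β is negative in practice, a tighter bit budget (smaller bpp) yields a larger λ, i.e., a stronger emphasis on rate over distortion.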
To update the RC statistical model and the corresponding parameters 162 for the new scene 154, necessary data needs to be gathered by means of analysis or encoding. Once the necessary data is collected, said statistical model can be updated for said newly detected scene 154. Updating the RC statistical model after a detected scene change or camera switch 152 is relatively straightforward for configurations with identical encoding and display order of the frames 10 in a sequence 16. When encoding a sequence 16 in a configuration where the encoding order 300 is different from the display order 310 of frames 10, such as a random-access configuration, some frames 10 may be encoded before the frames which precede them in display order 310. An example of a difference in encoding 300 and display 310 order for one Group of Pictures (GOP) of size 16 is shown in
If a hierarchical coding structure is used, different frames 10 can belong to different temporal levels 210 (or layers) as, for example, shown in
For example, in display order, for the sequence shown in
To decide the frame location (index) at which the RC statistical model parameters have to be updated, frames 10 should be analyzed (evaluated) in display order 310. The first frame of each temporal level occurring after the frame where the scene change or camera switch 152 happens (in display order) is the frame that should use the updated RC statistical model and its corresponding parameters 162. The process of estimating the parameters 162 of the RC statistical model for the new scene 154 depends on the statistical model that is used by the RC. To summarize, if a scene or camera change 152 is detected at frame f, the advantageous embodiment of the improved update of the RC statistical model in the temporal vicinity of said scene or camera change 152 involves two steps:
The following section will describe some method working parallel to the above described apparatuses:
General Remarks
Note that, in all of the abovementioned descriptions and proposals, the terms “frame”, “picture”, “slice”, and “image” may be used interchangeably: a frame usually describes a collection of one or more pictures which, in turn, may also be known as an image, and a slice may cover the entirety or a subset of the same. Note, also, that chroma-component data may be used instead of, or in addition to, luma data, e.g., in (5).
Further Remarks:
Above, different inventive embodiments and aspects have been described in a chapter “Alternative QP/Lambda-from-Rate Estimation”, in a chapter “Perceptually Motivated Improvement of Alternative Estimation”, in a chapter “Improved Detection of Scene Changes or Camera Switches” and in a chapter “Improved update of RC statistical model after scene changes/camera switches”.
Also, further embodiments will be defined by the enclosed claims.
It should be noted that any embodiments as defined by the claims can be supplemented by any of the details (features and functionalities) described in the above mentioned chapters.
Also, the embodiments described in the above mentioned chapters can be used individually, and can also be supplemented by any of the features in another chapter, or by any feature included in the claims.
Also, it should be noted that individual aspects described herein can be used individually or in combination. Thus, details can be added to each of said individual aspects without adding details to another one of said aspects.
It should also be noted that the present disclosure describes, explicitly or implicitly, features usable in video encoder (apparatus for providing an encoded representation of an input video signal). Thus, any of the features described herein can be used in the context of a video encoder.
Moreover, features and functionalities disclosed herein relating to a method can also be used in an apparatus (configured to perform such functionality). Furthermore, any features and functionalities disclosed herein with respect to an apparatus can also be used in a corresponding method. In other words, the methods disclosed herein can be supplemented by any of the features and functionalities described with respect to the apparatuses.
Also, any of the features and functionalities described herein can be implemented in hardware or in software, or using a combination of hardware and software, as will be described in the section “implementation alternatives”.
Implementation Alternatives:
Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus. Some or all of the method steps may be executed by (or using) a hardware apparatus, like for example, a microprocessor, a programmable computer or an electronic circuit. In some embodiments, one or more of the most important method steps may be executed by such an apparatus.
Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a Blu-Ray, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer readable.
Some embodiments according to the invention comprise a data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, such that one of the methods described herein is performed.
Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may for example be stored on a machine readable carrier.
Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier.
In other words, an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
A further embodiment of the inventive methods is, therefore, a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein. The data carrier, the digital storage medium or the recorded medium are typically tangible and/or non-transitory.
A further embodiment of the inventive method is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may for example be configured to be transferred via a data communication connection, for example via the Internet.
A further embodiment comprises a processing means, for example a computer, or a programmable logic device, configured to or adapted to perform one of the methods described herein.
A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.
A further embodiment according to the invention comprises an apparatus or a system configured to transfer (for example, electronically or optically) a computer program for performing one of the methods described herein to a receiver. The receiver may, for example, be a computer, a mobile device, a memory device or the like. The apparatus or system may, for example, comprise a file server for transferring the computer program to the receiver.
In some embodiments, a programmable logic device (for example a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods may be performed by any hardware apparatus.
The apparatus described herein may be implemented using a hardware apparatus, or using a computer, or using a combination of a hardware apparatus and a computer.
The apparatus described herein, or any components of the apparatus described herein, may be implemented at least partially in hardware and/or in software.
The methods described herein may be performed using a hardware apparatus, or using a computer, or using a combination of a hardware apparatus and a computer.
The methods described herein, or any components of the methods described herein, may be performed at least partially by hardware and/or by software.
While this invention has been described in terms of several embodiments, there are alterations, permutations, and equivalents which will be apparent to others skilled in the art and which fall within the scope of this invention. It should also be noted that there are many alternative ways of implementing the methods and compositions of the present invention. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations, and equivalents as fall within the true spirit and scope of the present invention.
Number | Date | Country | Kind |
---|---|---|---|
21168230.7 | Apr 2021 | EP | regional |
This application is a continuation of copending International Application No. PCT/EP2022/059944, filed Apr. 13, 2022, which is incorporated herein by reference in its entirety, and additionally claims priority from European Application No. 21168230.7, filed Apr. 13, 2021, which is also incorporated herein by reference in its entirety.
Number | Date | Country | |
---|---|---|---|
Parent | PCT/EP2022/059944 | Apr 2022 | US |
Child | 18379865 | US |