The present disclosure relates to the encoding and decoding of a picture or a sequence of pictures, also called video.
More specifically, the present disclosure offers a technique for post-processing picture units, such as prediction units or decoded units, at the encoding or at the decoding side, aiming at improving their quality and/or accuracy and improving the coding efficiency.
Such a technique according to the present disclosure could be implemented in a video encoder and/or a video decoder complying with any video coding standard, including for example HEVC, SHVC, HEVC-Rext and other HEVC extensions.
This section is intended to introduce the reader to various aspects of art, which may be related to various aspects of the present disclosure that are described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present disclosure. Accordingly, it should be understood that these statements are to be read in this light, and not as admissions of prior art.
The range of an original video content (i.e. minimum and maximum values of a sample of the original video content) is generally known and/or determined by the encoder.
Some extreme values of the range could be reserved for special use. For instance, the ITU-R Recommendation BT.709 (commonly known by the abbreviation Rec. 709) uses “studio-swing” levels where reference black is defined as 8-bit code 16 and reference white is defined as 8-bit code 235. Codes 0 and 255 are used for synchronization, and are prohibited from video data. Eight-bit codes between 1 and 15 provide “footroom”, and can be used to accommodate transient signal content such as filter undershoots. Eight-bit codes 236 through 254 provide “headroom”, and can be used to accommodate transient signal content such as filter overshoots and specular highlights. Bit-depths deeper than 8 bits are obtained by appending least-significant bits. The 16 to 235 range for R, G, B or luma, and the 16 to 240 range for chroma, which originated with ITU Rec. 601, are known as the “normal range”, as opposed to the 0 to 255 range known as the “full range”.
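As an illustration of how the studio-swing limits scale with bit depth, the following minimal sketch derives the nominal range by shifting the 8-bit codes left, which corresponds to appending least-significant bits. The function name and signature are hypothetical and only serve this example.

```python
def studio_swing_range(bitdepth, chroma=False):
    """Return the nominal (min, max) codes of a limited-range component.

    Assumption for this sketch: deeper bit depths are obtained by appending
    least-significant bits to the 8-bit codes, i.e. a left shift.
    """
    if bitdepth < 8:
        raise ValueError("bitdepth must be at least 8")
    shift = bitdepth - 8
    low = 16 << shift
    high = (240 if chroma else 235) << shift
    return low, high


print(studio_swing_range(8))         # (16, 235)
print(studio_swing_range(10))        # (64, 940)
print(studio_swing_range(10, True))  # (64, 960)
```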
In another use case, the range of the picture sample values of an original video content is known because the content creator intentionally limited the minimum and maximum values for luma and chroma, or because the content creation process is known to limit the component values to a particular range.
In another use case, a pre-processing module can be used to compute the original content histogram and to determine the range limits.
The original range limits are thus known at the encoding side.
However, the video encoding process changes the original range limits.
More specifically, the video encoder compresses the original video content in order to significantly reduce the amount of data in the encoded video stream. However, because the compression is lossy, the reconstructed/decoded picture samples may not be strictly identical to the original ones. Consequently, if the range of an original picture sample was (minorig,maxorig), the range of the reconstructed/decoded picture sample can be (minrec,maxrec), with minrec<minorig and/or maxrec>maxorig.
Most video codecs, such as MPEG-2, AVC, SVC, HEVC, SHVC, etc., perform fixed a priori (or “full range”) min/max clipping tests using minthres=0 and maxthres=(1<<bitdepth)−1, where bitdepth is the number of bits used to represent one picture sample component. For example, if bitdepth=8, then maxthres=255.
When clipping with this method, the original range limit constraints (e.g. those of Rec. 709) may be violated. Consequently, failing to respect the original range limits can degrade the quality/accuracy of the reconstructed/decoded pictures.
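The fixed clipping described above can be sketched as follows. The function name is hypothetical; the thresholds are those given in the text. The example shows that a reconstructed value of 240 passes unchanged even when the original content was limited to 235.

```python
def clip_fixed(sample, bitdepth):
    """Clip a sample to the fixed range [0, (1 << bitdepth) - 1]."""
    min_thres = 0
    max_thres = (1 << bitdepth) - 1
    return max(min_thres, min(sample, max_thres))


print(clip_fixed(300, 8))  # 255
print(clip_fixed(-3, 8))   # 0
print(clip_fixed(240, 8))  # 240, above reference white 235 but not clipped
```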
In addition, since the reconstructed/decoded picture samples may be used as predictors for subsequent picture samples (intra or inter prediction), the inaccuracy of the reconstructed/decoded picture samples may propagate through the pictures, leading to encoding drift artifacts or encoding inefficiency.
It would hence be desirable to provide a technique for encoding and/or decoding a sequence of pictures that would be more efficient.
The present disclosure relates to a method for encoding a sequence of pictures into a video stream, the method comprising:
The present disclosure thus proposes a new technique for encoding efficiently a sequence of at least one picture, by “in-loop” post-processing a picture unit, such as a decoded unit or a prediction unit, in a decoding loop of the encoding method (“in-loop” means that a reconstructed post-processed picture unit may be used as prediction for another picture unit in case of intra prediction or that a reconstructed post-processed picture may be stored in a decoding picture buffer and used as reference picture for inter-prediction).
Such post-processing has one or more parameters of the type “post-processing parameters”, which are determined from the values of a second color component of the picture unit, and is applied to a first color component of the picture unit (which is different from the second color component). The present disclosure thus proposes to use a cross-component post-processing in order to improve the accuracy and/or quality of the picture units.
In particular, when a color component of the picture unit has been post-processed, the post-processed component could be used as a post-processing parameter to post-process other components of the picture unit, or other picture units.
For example, the first and second color components belong to the Y, U, V components, or to the R, G, B components.
According to an embodiment of the disclosure, encoding said at least one parameter defined as a function of the first color component comprises encoding a set of points of said function (also denoted as correspondence function p).
In particular, encoding a set of points of said function comprises:
Advantageously, the function can be approximated with another interpolating function of the encoded set of points, such as a polynomial function. In this way, such correspondence function can be transmitted to a video decoder and used by the video decoder to process the decoded unit at the decoding side in a way similar to the encoding side. Such encoding of the correspondence function aims at reducing the amount of data transmitted to the video decoder.
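One simple way to obtain such a set of points, given here only as a sketch, is to keep a regular subset of the entries of a lookup table representing the correspondence function. The uniform sampling step, the function name and the example table are assumptions made for illustration; the disclosure does not mandate how the points are chosen.

```python
def sample_points(table, step):
    """Keep every `step`-th (x, p(x)) pair of a lookup table, plus the last entry."""
    if step < 1:
        raise ValueError("step must be positive")
    if not table:
        return []
    last = len(table) - 1
    xs = list(range(0, last + 1, step))
    if xs[-1] != last:
        xs.append(last)  # always keep the end point so the range is covered
    return [(x, table[x]) for x in xs]


# Hypothetical correspondence table with one entry per 8-bit value.
table = [min(255, 100 + x // 2) for x in range(256)]
points = sample_points(table, 32)
print(len(table), "entries reduced to", len(points), "points")
```

Only the retained points would then need to be written to the stream, instead of one value per possible sample value.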
According to another embodiment of the disclosure, said at least one post-processing belongs to the group comprising:
According to a particular embodiment, the correspondence function associating such parameter with the first color component is obtained by:
The present disclosure also pertains to an encoding device for encoding a sequence of pictures into a video stream, comprising a communication interface configured to access said sequence of pictures and at least one processor configured to:
Such a device, or encoder, can be especially adapted to implement the encoding method described here above. It could of course comprise the different characteristics pertaining to the encoding method according to an embodiment of the disclosure, which can be combined or taken separately. Thus, the characteristics and advantages of the device are the same as those of the encoding method and are not described in more ample detail.
In addition, the present disclosure relates to a method for decoding a video stream representative of a sequence of pictures, the method comprising:
The present disclosure thus offers a new technique for decoding efficiently a video stream, by post-processing the picture units.
The characteristics and advantages of the decoding method are the same as those of the encoding method and are not described in more ample detail.
In particular, such decoding method proposes to use a cross-component post-processing in order to improve the accuracy and/or quality of the picture units, such as decoded units or prediction units.
According to an embodiment of the disclosure, decoding at least one parameter defined as a function of the first color component comprises decoding a set of points representative of said function (also denoted as correspondence function p).
In particular, decoding at least one parameter defined as a function of the first color component further comprises interpolating values between two points of the set of points, in order to reconstruct the function.
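The reconstruction of the function from decoded points can be sketched with a piecewise linear interpolation. This is a minimal example under stated assumptions: the function name is hypothetical, the points are assumed to be sorted with strictly increasing abscissas spanning the whole range, and integer rounding to the nearest value is one possible convention.

```python
def rebuild_table(points, size):
    """Rebuild a `size`-entry lookup table from (x, value) points.

    Assumes strictly increasing x values, with the first x equal to 0 and
    the last x equal to size - 1.
    """
    if len(points) < 2 or points[0][0] != 0 or points[-1][0] != size - 1:
        raise ValueError("points must span 0..size-1 with at least two entries")
    table = []
    seg = 0
    for x in range(size):
        # Advance to the segment [x0, x1] that contains x.
        while seg < len(points) - 2 and x > points[seg + 1][0]:
            seg += 1
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        span = x1 - x0
        # Integer linear interpolation, rounded to the nearest value.
        table.append(y0 + ((y1 - y0) * (x - x0) + span // 2) // span)
    return table


decoded_points = [(0, 10), (4, 18), (8, 10)]
print(rebuild_table(decoded_points, 9))
# [10, 12, 14, 16, 18, 16, 14, 12, 10]
```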
The present disclosure also pertains to a decoding device for decoding a video stream representative of a sequence of pictures, comprising a communication interface configured to access said at least one video stream and at least one processor configured to:
Once again, such a device, or decoder, can be especially adapted to implement the decoding method described here above. It could of course comprise the different characteristics pertaining to the decoding method according to an embodiment of the disclosure, which can be combined or taken separately. Thus, the characteristics and advantages of the device are the same as those of the decoding method and are not described in more ample detail.
Another aspect of the disclosure pertains to a computer program product downloadable from a communication network and/or recorded on a medium readable by computer and/or executable by a processor comprising software code adapted to perform an encoding method and/or a decoding method, wherein the software code is adapted to perform the steps of at least one of the methods described above.
In addition, the present disclosure concerns a non-transitory computer readable medium comprising a computer program product recorded thereon and capable of being run by a processor, including program code instructions for implementing the steps of at least one of the methods previously described.
Certain aspects commensurate in scope with the disclosed embodiments are set forth below. It should be understood that these aspects are presented merely to provide the reader with a brief summary of certain forms the disclosure might take and that these aspects are not intended to limit the scope of the disclosure. Indeed, the disclosure may encompass a variety of aspects that may not be set forth below.
The disclosure will be better understood and illustrated by means of the following embodiment and execution examples, in no way limitative, with reference to the appended figures on which:
In
It is to be understood that the figures and descriptions of the present disclosure have been simplified to illustrate elements that are relevant for a clear understanding of the present disclosure, while eliminating, for purposes of clarity, many other elements found in typical encoding and/or decoding devices.
5.1 General Principle
A general principle of the disclosure is to apply a post-processing to a prediction unit or a decoded unit, i.e. more generally a picture unit, in order to improve the quality and/or accuracy of the picture unit, at the encoding side and/or at the decoding side.
Such post-processing could be applied to a color component of the picture unit, but takes into account another color component of the picture unit, also called “dual component”.
Such post-processing could be for example:
In the following, the words “reconstructed” and “decoded” can be used interchangeably. Usually, “reconstructed” is used on the encoder side while “decoded” is used on the decoder side.
The main steps of the method for encoding a sequence of pictures into a video stream, and of decoding a video stream, are illustrated respectively in
In the following, the method is disclosed with respect to a decoded unit but may also be applied to a prediction unit. In the latter case, the post-processed prediction unit is used when encoding/decoding a coding unit.
As illustrated in
In step 11, at least one of the coding units is encoded. It should be noted that a prediction unit may be obtained in step 11, and used for the coding of the coding unit.
In order to improve the encoding of the coding unit, the encoder implements at least one decoding loop. Such decoding loop implements a decoding of the coding unit in step 12, to obtain a decoded unit. It should be noted that the prediction unit used in step 11 is used for the decoding of the coding unit.
In step 13, at least a first color component and a second color component of the decoded unit are obtained. Such color components belong, for example, to the RGB components, or to the YUV components. For example, a coding unit is a pixel comprising one or several components. For color video, each pixel usually comprises a luma component Y, and two chroma components U and V.
It will be understood that, although the terms first and second may be used herein to describe various color components, these color components should not be limited by these terms. These terms are only used to distinguish one color component from another. For example, a first color component could be termed “a component” or “a second color component”, and, similarly, a second color component could be termed “another component” or “a first color component” without departing from the teachings of the disclosure.
In step 14, at least one post-processing f is applied to the second color component of the decoded unit, responsive to at least one parameter Pval of said post-processing (denoted as a post-processing parameter) and to said first color component, said at least one parameter being defined as a function of the first color component (denoted as a correspondence function p).
According to a first example, a first correspondence function can associate a first post-processing parameter, like the minimum value of the U component of the decoded unit, with the values of a first color component of the decoded unit, like the Y component of the decoded unit. In other words, the first correspondence function defines the minimum value of the U component of the decoded unit for each value of the Y component of the decoded unit. A second correspondence function can associate a second post-processing parameter, like the maximum value of the U component of the decoded unit, with the values of a first color component of the decoded unit, like the Y component of the decoded unit.
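The two correspondence functions of this first example can be sketched as lookup tables built from collocated samples. This is an illustrative sketch only: the function name is hypothetical, the Y and U samples are assumed to be collocated (for example 4:4:4 content), and entries for Y values that never occur are left as None.

```python
def build_min_max_tables(y_samples, u_samples, bitdepth=8):
    """Return (u_min, u_max) tables indexed by the Y value.

    u_min[y] and u_max[y] hold the smallest and largest U value observed
    among the samples whose collocated Y value equals y, or None when that
    Y value does not occur.
    """
    size = 1 << bitdepth
    u_min = [None] * size
    u_max = [None] * size
    for y, u in zip(y_samples, u_samples):
        if u_min[y] is None or u < u_min[y]:
            u_min[y] = u
        if u_max[y] is None or u > u_max[y]:
            u_max[y] = u
    return u_min, u_max


# Hypothetical collocated samples of a decoded unit.
y_dec = [16, 16, 17, 20]
u_dec = [120, 130, 128, 90]
u_min, u_max = build_min_max_tables(y_dec, u_dec)
print(u_min[16], u_max[16])  # 120 130
```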
Then, at least one post-processing is applied to the second color component of the decoded unit, such post-processing having the post-processing parameter as parameter.
According to the first example, the post-processing f can be a clipping function, in which the post-processing parameter(s) define(s) the minimum and/or maximum values of the second color component of the decoded unit.
According to a second example, the post-processing f can be an offset function, in which the post-processing parameter(s) define(s) an offset to be added to the second color component of the decoded unit.
According to a third example, the post-processing f can be a linear filtering function, in which the post-processing parameter(s) define the coefficients of the filter to be applied on the second component of the picture unit.
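For the linear filtering case, the following sketch shows one possible reading in which the filter taps applied to the second component at a position are selected by the collocated value of the first component. The one-dimensional centered filter, the border clamping, the function names and the tap values are all assumptions made for illustration, not the form specified by the disclosure.

```python
def filter_component(u_rec, y_rec, taps_for_y):
    """Filter u_rec with taps chosen per position from the collocated y_rec value.

    taps_for_y(y) returns the list of filter coefficients to use when the
    collocated first-component value is y. Positions outside the unit are
    clamped to the nearest valid sample.
    """
    n = len(u_rec)
    out = []
    for x in range(n):
        taps = taps_for_y(y_rec[x])
        half = len(taps) // 2
        acc = 0.0
        for i, coeff in enumerate(taps):
            pos = min(max(x + i - half, 0), n - 1)
            acc += coeff * u_rec[pos]
        out.append(acc)
    return out


def example_taps(y):
    # Hypothetical rule: smooth dark areas, leave brighter areas untouched.
    return [0.25, 0.5, 0.25] if y < 64 else [0.0, 1.0, 0.0]


print(filter_component([100, 104, 100, 120], [10, 10, 200, 200], example_taps))
# [101.0, 102.0, 100.0, 120.0]
```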
Such post-processing parameter Pval is encoded, e.g. in the form of the correspondence function p, and stored and/or transmitted to a decoder in step 15.
In step 21, at least one coding unit of said sequence, encoded in the video stream, is decoded, to obtain a decoded unit. A prediction unit may be obtained in step 21, and used for the decoding of the coding unit.
In step 22, at least a first color component and a second color component of the decoded unit are obtained. As mentioned above, such color components belong for example to the RGB components, or to the YUV components.
In step 23, at least one parameter Pval of a post-processing of the second component (denoted as a post-processing parameter) is decoded, said at least one parameter being defined as a function of the first color component (denoted as a correspondence function). Such parameter Pval can be encoded and transmitted to a decoder by the encoder in the form of the correspondence function which associates the at least one parameter with the values of the first color component of the decoded unit. Note that the step 23 may be placed between steps 21 and 22, or before step 21.
In step 24, at least one post-processing f is applied to the second color component of the decoded unit, such post-processing having said post-processing parameter as parameter.
The proposed solution thus makes it possible to improve the quality and/or accuracy of the decoded units, at the encoder and/or decoder sides.
According to a specific embodiment, the present disclosure proposes to send clipping values for one component (e.g. Y) as a function of another component (e.g. U). For instance, min and max clipping values for the luma component Y are encoded, and min and max values for each chroma component are encoded as functions of the luma component. In that way, the clipping values are specialized for each value of the other component and the clipping correction is more precise.
5.2 Disclosure of a Specific Embodiment
In this section, the operation of the encoder and the decoder is described more particularly for a post-processing of the clipping type. The disclosure is of course not limited to this particular type of post-processing, and other types of post-processing may be used, such as linear filtering or the addition of an offset.
Let's consider for example the encoder illustrated in
The encoder can implement the classical transform step 31, quantization step 32, and high-level syntax and entropy coding step 33.
In order to improve the encoding of the coding units, the encoder can also implement at least one decoding loop. To this end, the encoder can implement the classical inverse quantization step 34, inverse transform step 35, and intra prediction 36 and/or inter prediction 37.
Once a coding unit is reconstructed/decoded, the color components of the picture unit are obtained, and at least one correspondence function associating post-processing parameter value(s) with the values of a color component of the picture unit is defined.
For example, as illustrated in
According to a first example, illustrated in
For example, a first correspondence function associating a minimum value of the Urec component with each value of the Yrec component, and a second correspondence function associating a maximum value of the Urec component with each value of the Yrec component are defined by the tables below:
According to a second example, illustrated in
In other words, for a given component (ex: Y, U or V), in a given reconstructed/decoded frame, slice, GOP or group of macroblocks, . . . , the clipping parameters can be defined as a function of another (dual) component value.
Such clipping parameters can be used by a post-processing of the clipping type, as illustrated in
For example, such post-processing can be done before and/or after the in-loop filters 38 (clipping 381), and/or after the intra prediction 36 (clipping 361), and/or after the inter motion compensation prediction 37 (clipping 371).
If we consider the first example, the clipping 361 following the intra prediction 36, for instance, aims at applying a post-processing f1 to the U component of the prediction unit, denoted as Upred, depending on the post-processing parameter Pval corresponding to the minimum value used by the clipping, said post-processing parameter Pval depending on the value of the Yrec component. The U component of the prediction unit after post-processing is denoted Upost: Upost=f1(Upred,Pval).
Such post-processing f1 could be a clipping function such as:
The proposed solution thus ensures that the values of the post-processed unit have the same range as the picture unit (or at least a range closer to that of the picture unit), and possibly as the coding unit.
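A clipping driven by correspondence tables can be sketched as follows. The function and table names are hypothetical, the table contents are invented for the example, and the expression max(lo, min(u, hi)) is the conventional clipping form assumed here rather than a formula quoted from the disclosure.

```python
def clip_cross_component(u_pred, y_rec, u_min_table, u_max_table):
    """Clip each U sample to the bounds selected by its collocated Y value."""
    out = []
    for u, y in zip(u_pred, y_rec):
        lo = u_min_table[y]
        hi = u_max_table[y]
        out.append(max(lo, min(u, hi)))
    return out


# Hypothetical tables: default bounds 16..240, tighter bounds when Y == 50.
u_min_table = [16] * 256
u_max_table = [240] * 256
u_min_table[50] = 100
u_max_table[50] = 140

print(clip_cross_component([90, 120, 150, 10], [50, 50, 50, 60],
                           u_min_table, u_max_table))
# [100, 120, 140, 16]
```

Compared with a fixed 0 to 255 clipping, the bounds here change with the value of the other component, so a sample such as 150 is corrected even though it lies well inside the fixed range.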
The same processing could be applied to other components of a picture unit. In particular, when a component has been post-processed, the post-processed component could be used to process other components. For example, the first correspondence function (or table) could be updated with the value of the U component after post-processing (Upost), and then used for the post-processing of the U component of another picture unit, or for the post-processing of another component.
In other words, if another correspondence function associating a post-processing parameter of the type minimum value of the Urec component with each value of the Vrec component is defined, the Vrec component could be post-processed only after the Urec component has been post-processed. The ordering of the several post-processing stages can be defined in advance or it can be signaled in the bitstream.
The post-processing parameters in the form of correspondence functions and/or the post-processing functions, can be encoded and sent to the decoder, to improve the decoded pictures in a way similar to the encoding side. It is thus proposed according to at least one embodiment to send/encode/decode the clipping parameters of one color component, as a function of another (dual) color component.
In order to reduce the amount of information transmitted from the encoder to the decoder, the correspondence functions can be approximated.
If we consider the second correspondence function p2 for instance, such function could be approximated using a piecewise linear function, as illustrated in
The encoding of the post-processing parameters in the form of correspondence functions and/or the post-processing functions, can be implemented by the entropy coding step 33.
Let's now consider for example the decoder illustrated in
At the decoding side, the video stream representative of a sequence of pictures is decoded.
Such decoder can implement the classical high-level syntax and entropy decoding step 41, inverse quantization step 42, and inverse transformation step 43.
The post-processing parameters in the form of correspondence functions and/or the post-processing functions, can also be decoded, in the entropy decoding step 41.
Once a coding unit is decoded, the color components of the picture unit are obtained, and at least one post-processing parameter is obtained. For example, the first correspondence table is decoded. The decoder thus knows, for each value of the Y component of the picture unit, the minimum value that the U component of the picture unit should take.
Such post-processing parameters can be used by a post-processing of the clipping type, as illustrated in
For example, such post-processing can be done before and/or after the in-loop filters 44 (clipping 441), and/or after the intra prediction 45 (clipping 451), and/or after the inter motion compensation prediction 46 (clipping 461).
If we consider, once again, the clipping 451 following the intra prediction 45, such clipping 451 aims at applying the post-processing f1 to the U component of the prediction unit outputted by the intra prediction 45, denoted as Upred, depending on the post-processing parameter Pval corresponding to the minimum value of the Urec component.
Such clipping function f1 could be expressed as:
When the correspondence functions have been approximated at the encoder side, the decoder side can decode them by first decoding a set of points (like the eleven points of
In the embodiment described above, we considered that the post-processing is a clipping function, and the post-processing parameters are clipping parameters.
However, the invention is not limited to this specific embodiment.
According to another embodiment, the post-processing is an offset function, and the post-processing parameters define offsets (Pval) to be added to the second color component of the picture unit. In other words, in case offsets are transmitted/encoded/decoded, it is proposed to categorize the values of one color component (e.g. the Y component) with values of another component (e.g. the U or V component). For example, for values of the Yrec component between 0 and 18, the offset to be added to the values of the Urec component is 0; for values of the Yrec component between 19 and 32, the offset is 5; for values of the Yrec component between 33 and 64, the offset is 3; etc. Such offsets define a correspondence function which can be approximated by a piecewise linear function.
The post-processing in this case could be expressed as:
Upost=f(Urec,Pval)=Urec+Pval, where Pval depends on Yrec.
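The offset example above can be sketched with the three Yrec categories given in the text. The names are hypothetical, and the offset used for Yrec values above 64 is an assumption, since the text leaves those categories unspecified.

```python
import bisect

# Upper bounds of the Yrec categories given in the text and their offsets.
CATEGORY_UPPER_BOUNDS = [18, 32, 64]
CATEGORY_OFFSETS = [0, 5, 3]
DEFAULT_OFFSET = 0  # assumption: the text does not specify offsets above 64


def offset_for(y_rec):
    """Return Pval, the offset associated with a Yrec value."""
    idx = bisect.bisect_left(CATEGORY_UPPER_BOUNDS, y_rec)
    if idx < len(CATEGORY_OFFSETS):
        return CATEGORY_OFFSETS[idx]
    return DEFAULT_OFFSET


def apply_offset(u_rec, y_rec):
    """Compute Upost = Urec + Pval for each pair of collocated samples."""
    return [u + offset_for(y) for u, y in zip(u_rec, y_rec)]


print(apply_offset([100, 100, 100, 100], [10, 25, 40, 200]))
# [100, 105, 103, 100]
```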
According to another embodiment, the post-processing is a linear filtering function, and the post-processing parameters define the coefficients of the filter to be applied on the second component of the picture unit. For example, Pvali=pi(Yrec) defines the value of the coefficient i of the linear filter of size N, to be applied to the component Urec. If the component Urec is located at the position x in the picture unit, then the post-processing could be expressed as:
It should also be noted that the correspondence function(s) associating at least one post-processing parameter with the values of the second color component of the picture unit could be determined from the original sequence of pictures/coding units rather than from the picture unit.
In this case, a color histogram can be obtained on the encoding side from the analysis of the original sequence of pictures, and the values of the different color components can be obtained from this histogram, in order to determine at least one correspondence function p. If some color components of the picture unit (for example Urec) are not identical to the color components of the coding unit (for example U), then p(U) will be different from p(Urec). In this case, the correspondence function might be slightly adjusted by the encoder, such that p′(Urec)=p(U), and it is the adjusted correspondence function that should be encoded and stored/transmitted to the decoder.
According to another embodiment, for each component, an index indicating the dual component used for the correspondence function (for example clipping range function or categorization) is encoded directly or differentially.
According to another embodiment, the post-processing parameters may be used as post-processing on the reconstructed pictures only, by the decoder. In this case, the correspondence functions can be encoded in a SEI message, SPS, PPS, or slice header for example.
According to another embodiment, the post-processing parameters may be used in all or only part of post-processing operations in the encoder and/or decoder: motion compensation (prediction), intra prediction, in-loop post-filter processing, etc.
In the previous embodiments, the post-processing method (e.g. a clipping method) uses one component (e.g. Y) to predict the bounds (min and/or max clipping values) of another component (e.g. U or V). As an example, the min m (respectively max M) clipping value of U (or V) may be defined as a function of the collocated value Y: m=f(Y), M=f(Y).
In the case where the function f is determined using original YUV samples (for example as a pre-processing step before encoding a frame) and the function f(Y) is used at the decoder side, there may be a drift since, on the decoder side, only Yrec, i.e. the reconstructed Y, is available. The reconstruction error on Y is going to introduce some error on the bounds of U and V.
In order to overcome this problem, the function f( ) may be determined using the original samples while taking into account a reconstruction error E on Y(E=Yrec−Y):
On
One advantage of this variant is that the clipping on U and V components may be done at any stage in the decoder (for example in the RDO), which usually provides better results.
In a specific embodiment, f( ) and E are encoded in the bitstream, f( ) being determined using the original signal at the encoder.
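One conservative way of accounting for the reconstruction error, given purely as an assumption-laden sketch, is to widen the bounds so that they remain valid for any Yrec within a maximum error of the original Y. Here e_max is taken to be a bound on |Yrec−Y|, and the widening rule (minimum and maximum over the neighbourhood) is a choice made for this example; the function name is hypothetical.

```python
def widen_bounds(min_table, max_table, e_max):
    """Widen per-Y bounds so they hold for any Yrec with |Yrec - Y| <= e_max.

    For each index y, the lower bound becomes the smallest lower bound found
    in [y - e_max, y + e_max] and the upper bound the largest upper bound
    found in that window (the window is truncated at the table borders).
    """
    size = len(min_table)
    wide_min, wide_max = [], []
    for y in range(size):
        lo = max(0, y - e_max)
        hi = min(size - 1, y + e_max)
        wide_min.append(min(min_table[lo:hi + 1]))
        wide_max.append(max(max_table[lo:hi + 1]))
    return wide_min, wide_max


# Hypothetical bounds determined from original samples, indexed by Y.
m = [100, 110, 120, 130]
M = [140, 150, 160, 170]
print(widen_bounds(m, M, 1))
# ([100, 100, 110, 120], [150, 160, 170, 170])
```

With such widened tables, looking up the bounds with Yrec instead of Y cannot produce a range that excludes the value allowed for the original Y, at the cost of a slightly looser clipping.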
The same principle applies in the case where the bounds on Y are defined as a function of U or V.
In a variant, the function f( ) may be determined using Yrec instead of the original samples Y. In this case, the clipping will be done as a post-process, i.e. after the reconstruction of the whole luma frame (but still in the encoding process of the frame so that the clipped frame may be used during prediction of other frames). Indeed, the function f can only be determined after the encoding of the whole frame.
In the case where Y is coded before the U and V components, the clipping on U or V may be applied in the coding loop of U or V.
In a specific embodiment, f( ) is encoded in the bitstream, f( ) being determined using the original signal at the encoder.
While not explicitly described, the present embodiments and variants may be employed in any combination or sub-combination.
5.3 Devices
Such an encoding device comprises at least:
Such a decoding device comprises at least:
Such an encoding device and/or decoding device could each be implemented as a purely software realization, a purely hardware realization (for example in the form of a dedicated component, such as an ASIC, FPGA, VLSI, . . . ), several electronic components integrated into a device, or a mix of hardware elements and software elements.
The flowchart and/or block diagrams in the Figures illustrate the configuration, operation and functionality of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s).
For example, the one or more processors 92 may be configured to execute the various software programs and/or sets of instructions of the software components to perform the respective functions of: obtaining at least a first color component and a second color component of a picture unit, applying at least one post-processing to the second color component of the picture unit responsive to at least one parameter of said post-processing and to said first color component, and encoding said at least one parameter, in accordance with embodiments of the invention.
The one or more processors 102 may be configured to execute the various software programs and/or sets of instructions of the software components to perform the respective functions of: obtaining at least a first color component and a second color component of a picture unit, decoding at least one parameter of a post-processing of the second component, and applying said at least one post-processing to the second color component of the picture unit responsive to said at least one decoded parameter and to said first color component, in accordance with embodiments of the invention.
It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, or blocks may be executed in an alternative order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of the blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
As will be appreciated by one skilled in the art, aspects of the present principles can be embodied as a system, method, computer program or computer readable medium. Accordingly, aspects of the present principles can take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, and so forth), or an embodiment combining software and hardware aspects that can all generally be referred to herein as a “circuit,” “module”, or “system.” Furthermore, aspects of the present principles can take the form of a computer readable storage medium. Any combination of one or more computer readable storage medium(s) may be utilized.
A computer readable storage medium can take the form of a computer readable program product embodied in one or more computer readable medium(s) and having computer readable program code embodied thereon that is executable by a computer. A computer readable storage medium as used herein is considered a non-transitory storage medium given the inherent capability to store the information therein as well as the inherent capability to provide retrieval of the information therefrom. A computer readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. It is to be appreciated that the following, while providing more specific examples of computer readable storage mediums to which the present principles can be applied, is merely an illustrative and not exhaustive listing as is readily appreciated by one of ordinary skill in the art: a portable computer disc, a hard disc, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Number | Date | Country | Kind |
---|---|---|---|
15306369.8 | Sep 2015 | EP | regional |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2016/070569 | 9/1/2016 | WO | 00 |