This application claims priority benefit under 35 U.S.C. § 119(d) from European Patent Application No. EP 20 306 569.3, filed Dec. 15, 2020, the disclosure of which is incorporated by reference herein in its entirety.
The present disclosure relates to the field of image processing, in particular image encoding for video stream compression.
Film grain is common in older motion pictures dating from the era when movies were shot with a film camera. The grain, which resulted from chemical effects in the analog film used in the film camera, is no longer present in videos captured with a digital camera. Nevertheless, it is common for film makers to add computer-generated grain to material captured with a digital camera in order to reproduce the aesthetic of old movies through the presence of film grain. In view of its random nature, grain present in an image can be modeled as additive noise, and processed as such.
Video data is often source encoded so as to decrease the amount of resources necessary for its transmission and/or storage in memory. Various video coding or compression standards developed over recent years, such as H.264/AVC, H.265/HEVC or MPEG-2, may be used for that purpose.
Even though grain may have been added to a video content in post-production, such film grain is considered part of the video data to be encoded or compressed, like any other component of the video data. However, due to its random nature, grain is difficult to compress efficiently.
With known video coding or compression schemes, preserving the grain requires a very high bitrate. Conversely, when using a reasonable bitrate, for example in a broadcast use case, grain cannot be preserved properly: it is either washed out, or partly removed, generating undesirable visual artifacts and temporal instability.
Therefore, it is desirable to improve the efficiency of video encoding/compression of video data that includes grain, by preserving the grain information during the encoding/compression while maintaining the performance of the encoder (encoding gain).
There is therefore a need for providing an improved video processing scheme and video encoder and/or decoder implementing the same that address at least some of the above-described drawbacks and shortcomings of the conventional technology in the art.
It is an object of the present subject disclosure to provide an improved video processing scheme and apparatus implementing the same.
Another object of the present subject disclosure is to provide an improved video encoding or compression and/or video decoding or decompression scheme and apparatuses implementing the same.
Another object of the present subject disclosure is to provide an improved video encoding and/or decoding scheme and apparatuses implementing the same for alleviating the above-described drawbacks and shortcomings of conventional video encoding/decoding schemes, in particular with respect to video encoding/decoding schemes of an input video stream to be encoded that contains film grain and/or random noise.
To achieve these objects and other advantages and in accordance with the purpose of the present subject disclosure, as embodied and broadly described herein, in one aspect of the present subject disclosure, a method of processing an image, a digital video frame, or more generally digital video data, is proposed. The proposed method comprises: generating a decoded image by first encoding the image, wherein the first encoding of the image comprises decoding encoded data generated based on the image; determining estimates of parameters of a parametric model of noise contained in the image based on the decoded image; and including the estimates of parameters of the parametric model of noise in an encoded stream generated by second encoding the image.
The proposed scheme advantageously avoids the use of a denoised image, so that it alleviates the drawbacks of using denoising for encoding an image. Instead, noise model parameters are computed using the input image and a decoded image obtained through the decoding of an encoded version of the input image. As image coding algorithms that use predictive coding typically use decoded image data determined by decoding an encoded version of an input image, the proposed process advantageously leverages processing operations which are part of most encoding schemes. Instead of being used only for purposes of encoding an input image, the decoded image determined as part of encoding the input image may according to the present subject disclosure also be used for purposes of determining estimates of parameters of a noise model for the input image.
Avoiding the use of a denoising scheme significantly decreases the computation complexity of the processing of the image, which may particularly be advantageous in the context of video encoding, where it is desirable to decrease the encoding latency and/or complexity.
In one or more embodiments, the including the estimates of parameters of the parametric model of noise in the encoded stream may comprise: inserting the estimates of parameters of the parametric model of noise in the encoded stream. Alternatively, the including the estimates of parameters of the parametric model of noise in the encoded stream may comprise: updating default noise parameters comprised in the encoded stream.
In one or more embodiments, the including the estimates of parameters of the parametric model of noise in the encoded stream may be performed as part of the second encoding of the image.
In one or more embodiments, the second encoding may be performed according to encoding parameters that are different from the encoding parameters according to which the first encoding is performed.
In one or more embodiments, the first encoding and the second encoding may be performed as part of different encoding processing instances.
In one or more embodiments, the first encoding and the second encoding may be performed as part of a same encoding processing instance. In such embodiments, the encoding of the image may be performed by an encoder configured to output the encoded stream further to the encoding of the image and to output the decoded image further to the encoding of the image.
In one or more embodiments, the parametric model may be configured to model grain contained in the image.
In one or more embodiments, the second encoding may be performed according to an AV1 or AVC/H.264 encoding.
In another aspect of the present subject disclosure, an apparatus is proposed, which comprises a processor, and a memory operatively coupled to the processor, wherein the apparatus is configured to perform a method as proposed in the present subject disclosure.
In one or more embodiments, the proposed apparatus may further be configured to perform the first encoding and the second encoding as part of a same encoding processing instance for encoding the image, and the apparatus may further be configured to output the encoded stream further to the encoding of the image and to output the decoded image further to the encoding of the image. In one or more embodiments, the apparatus may further be configured to perform the first encoding and the second encoding as part of a same encoding processing instance for encoding the image, and the apparatus may further be configured to output the encoded stream as part of the encoding processing instance and to output the decoded image as part of the encoding processing instance.
In one or more embodiments, the proposed apparatus may further comprise an encoder engine and a noise parameters computation engine, the encoder engine may be configured to perform the first encoding and the second encoding as part of a same encoding processing instance for encoding the image, the encoder engine may further be configured to output the encoded stream as part of the encoding processing instance and to output the decoded image to the noise parameters computation engine as part of the encoding processing instance, and the noise parameters computation engine may be configured to determine the estimates of parameters of the parametric model of noise contained in the image based on the decoded image.
In one or more embodiments, the proposed apparatus may further comprise a noise parameters insertion and/or update engine operatively coupled with the encoder engine and the noise parameters computation engine, and configured to include in the encoded stream received from the encoder engine the estimates of parameters of the parametric model of noise contained in the image received from the noise parameters computation engine.
In other embodiments of the proposed apparatus, the noise parameters computation engine may be configured to output the estimates of parameters of the parametric model of noise contained in the image to the encoder engine, and the encoder engine may be configured to include in the encoded stream the estimates of parameters of the parametric model of noise contained in the image received from the noise parameters computation engine.
In yet another aspect of the present subject disclosure, a video encoder is proposed, which is configured to encode video content comprising a plurality of images, and comprises an apparatus as proposed in the present subject disclosure configured to perform a method as proposed in the present subject disclosure.
In yet another aspect of the present subject disclosure, a non-transitory computer-readable medium encoded with executable instructions which, when executed, cause an apparatus comprising a processor operatively coupled with a memory, to perform a method as proposed in the present subject disclosure, is proposed.
For example, in embodiments, the present subject disclosure provides a non-transitory computer-readable medium encoded with executable instructions which, when executed, cause an apparatus comprising a processor operatively coupled with a memory, to process an image, a digital video frame, or more generally digital video data, by performing the generating, by the processor, of a decoded image by first encoding the image, wherein the first encoding of the image comprises decoding encoded data generated based on the image, the determining, by the processor, of estimates of parameters of a parametric model of noise contained in the image based on the decoded image, and the including, by the processor, of the estimates of parameters of the parametric model of noise in an encoded stream generated by second encoding the image.
In yet another aspect of the present subject disclosure, a computer program product comprising computer program code tangibly embodied in a computer readable medium, said computer program code comprising instructions to, when provided to a computer system and executed, cause said computer to perform a method as proposed in the present subject disclosure, is proposed.
In another aspect of the present subject disclosure, a data set representing, for example through compression or encoding, a computer program as proposed herein, is proposed.
It should be appreciated that the present invention can be implemented and utilized in numerous ways, including without limitation as a process, an apparatus, a system, a device, and as a method for applications now known and later developed. These and other unique features of the system disclosed herein will become more readily apparent from the following description and the accompanying drawings.
The present subject disclosure will be better understood and its numerous objects and advantages will become more apparent to those skilled in the art by reference to the following drawings, in conjunction with the accompanying specification.
For simplicity and clarity of illustration, the drawing figures illustrate the general manner of construction, and descriptions and details of well-known features and techniques may be omitted to avoid unnecessarily obscuring the discussion of the described embodiments of the invention. Additionally, elements in the drawing figures are not necessarily drawn to scale. For example, the dimensions of some of the elements in the figures may be exaggerated relative to other elements to help improve understanding of embodiments of the present invention. Certain figures may be shown in an idealized fashion in order to aid understanding, such as when structures are shown having straight lines, sharp angles, and/or parallel planes or the like that under real-world conditions would likely be significantly less symmetric and orderly. The same reference numerals in different figures denote the same elements, while similar reference numerals may, but do not necessarily, denote similar elements.
In addition, it should be apparent that the teaching herein can be embodied in a wide variety of forms and that any specific structure and/or function disclosed herein is merely representative. In particular, one skilled in the art will appreciate that an aspect disclosed herein can be implemented independently of any other aspects and that several aspects can be combined in various ways.
The present disclosure is described below with reference to functions, units, engines, block diagrams and flowchart illustrations of the methods, systems, and computer programs according to one or more exemplary embodiments. Each described function, unit, engine, block of the block diagrams and flowchart illustrations can be implemented in hardware, software, firmware, middleware, microcode, or any suitable combination thereof. If implemented in software, the functions, units, engines, blocks of the block diagrams and/or flowchart illustrations can be implemented by computer program instructions or software code, which may be stored or transmitted over a computer-readable medium, or loaded onto a general purpose computer, special purpose computer or other programmable data processing apparatus to produce a machine, such that the computer program instructions or software code which execute on the computer or other programmable data processing apparatus, create the means for implementing the functions described herein.
Embodiments of computer-readable media include, but are not limited to, both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. As used herein, a “computer storage media” may be any physical media that can be accessed by a computer or a processor. In addition, the terms “memory” and “computer storage media” include any type of data storage device, such as, without limitation, a hard drive, a flash drive or other flash memory devices (e.g. memory keys, memory sticks, key drive), CD-ROMs or other optical data storage devices, DVDs, magnetic disk data storage devices or other magnetic data storage devices, data memory components, RAM, ROM and EEPROM memories, memory cards (smart cards), solid state drive (SSD) memories, and any other form of medium able to be used to transport or store or memorize data or data structures able to be read by a computer processor, or a combination thereof. Furthermore, various forms of computer-readable media may transmit or carry instructions to a computer, such as a router, a gateway, a server, or any data transmission equipment, whether this involves wired transmission (via coaxial cable, optical fibre, telephone wires, DSL cable or Ethernet cable), wireless transmission (via infrared, radio, cellular, microwaves) or virtualized transmission equipment (virtual router, virtual gateway, virtual tunnel end, virtual firewall). According to the embodiments, the instructions may comprise code in any computer programming language or computer program element, such as, without limitation, the languages of assembler, C, C++, Visual Basic, HyperText Markup Language (HTML), Extensible Markup Language (XML), HyperText Transfer Protocol (HTTP), Hypertext Preprocessor (PHP), SQL, MySQL, Java, JavaScript, JavaScript Object Notation (JSON), Python, and bash scripting.
Unless specifically stated otherwise, it will be appreciated that throughout the following description discussions utilizing terms such as processing, computing, calculating, determining, or the like, refer to the action or processes of a computer or computing system, or similar electronic computing device, that manipulate or transform data represented as physical, such as electronic, quantities within the registers or memories of the computing system into other data similarly represented as physical quantities within the memories, registers or other such information storage, transmission or display devices of the computing system.
The terms “comprise,” “include,” “have,” and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Additionally, the word “exemplary” is used herein to mean “serving as an example, instance, or illustration”. Any embodiment or design described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other embodiments or designs. The terms “in particular”, “for example”, “example”, “typically” are used in the present description to denote examples or illustrations of non-limiting embodiments that do not necessarily correspond to preferred or advantageous embodiments with respect to other possible aspects or embodiments.
The terms “operationally coupled”, “coupled”, “mounted”, “connected” and their various variants and forms used in the present description refer to couplings, connections and mountings that may be direct or indirect, and comprise in particular connections between electronic equipment or between portions of such equipment that allow operations and modes of operation as described in the present description. In addition, the terms “connected” and “coupled” are not limited to physical or mechanical connections or couplings. For example, an operational coupling may include one or more wired connection(s) and/or one or more wireless connection(s) between two or more items of equipment that allow simplex and/or duplex communication links between the equipment or portions of the equipment. According to another example, an operational coupling or a connection may include a wired-link and/or wireless coupling for allowing data communications between a server of the proposed system and another item of equipment of the system.
The methods proposed in the present subject disclosure may be implemented by any video encoder or video codec configured for encoding and/or decoding images (or frames) of input video data containing grain, film grain and/or noise, such as, for example a video encoder and/or decoder compliant with any of the H.261, MPEG-1 Part 2, H.262, MPEG-2 Part 2, Alliance for Open Media (AOM) AV1, H.264/AVC, H.265/HEVC, MPEG-4 Part 2, SHVC (Scalable HEVC), H.266/VVC, and MPEG-5 EVC specifications or standards, whether in their existing versions and/or their evolutions, as the case may be adapted for implementing one or more embodiments of the proposed methods.
In the following, embodiments of the proposed methods, apparatuses and computer programs are described for the exemplary processing of grain in an image. However, it will be appreciated by those having ordinary skill in the relevant art that other types of noise in images or videos, such as, for example, sensor noise, may be processed in place of or in addition to the grain noise which is given by way of example only according to embodiments of the present subject disclosure.
In some conventional encoding/decoding schemes, such as the AV1 video codec specified by the Alliance for Open Media (AOM), the grain information of an input image (for example of an input video) is not directly encoded together with the other data in the input image, but is instead processed using an analysis/synthesis scheme.
Using an analysis/synthesis method allows compressing grain through a parametric model. Once the analysis of grain information in an input image has provided estimates of the grain parameters, the grain can be removed from the input image prior to encoding the image using the chosen encoding scheme (e.g. AV1). The process of removing the grain is sometimes referred to as “denoising”, and the image, video, or content from which the grain has been removed is referred to as “denoised”. The efficiency of the encoding scheme can be preserved by only encoding the denoised input image or video, while the encoded stream resulting from the encoding comprises the compressed or encoded image or video together with the corresponding grain parameters.
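By way of illustration, the following Python sketch (not part of any codec specification) shows the overall shape of such a denoising-based analysis; a simple box filter stands in for a real denoiser, and the noise standard deviation stands in for a full grain parameter fit, both being assumptions of this sketch.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def conventional_grain_analysis(image: np.ndarray):
    """Sketch of conventional denoising-based grain analysis."""
    denoised = uniform_filter(image.astype(np.float64), size=5)  # stand-in denoiser
    noise = image.astype(np.float64) - denoised  # estimated grain layer
    sigma = float(noise.std())                   # crude grain-strength estimate
    # Only the denoised image is encoded; the grain parameters travel
    # alongside the encoded data in the output stream.
    return denoised, {"sigma": sigma}
```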
In a conventional encoder 10 implementing such an analysis/synthesis scheme, the input image is first denoised, and a grain parameters computation engine 12 estimates the grain parameters based on the input image and the denoised image. An encoder engine 13 then encodes the denoised image, and the encoder 10 outputs an encoded stream comprising the encoded (denoised) image data together with the corresponding grain parameters. At the decoder side, a corresponding decoder receives the encoded stream and uses the received grain parameters to re-synthesize the grain.
That is, at the decoder side, the denoised image is decoded, and a synthetic noise pattern (e.g. a synthetic grain pattern) is generated based on the grain parameters estimated at the encoder side and combined with (for example added to) the decoded image. Therefore, contrary to the principles of image encoding/decoding, which are based on the fidelity of the decoded image to the input image to be encoded, the resulting noisy or grainy image or video is different from the source, while still being visually similar.
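A minimal sketch of this decoder-side restitution, assuming a plain Gaussian noise field in place of the AR-model synthesis described further below (function and parameter names are of our choosing):

```python
import numpy as np

def add_synthetic_grain(decoded: np.ndarray, sigma: float, seed: int) -> np.ndarray:
    # Generate a synthetic noise field from the transmitted parameters
    # and combine it (here, by addition) with the decoded image.
    rng = np.random.default_rng(seed)
    grain = rng.normal(0.0, sigma, size=decoded.shape)
    return np.clip(decoded.astype(np.float64) + grain, 0, 255).astype(np.uint8)
```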
Some standard video codec specifications, such as, for example, the AV1 codec specification, define a syntax for grain parameters and specify a grain synthesis scheme as normative/mandatory tools. As another example, the H.264/AVC standard defines an optional syntax for sending grain parameters along with encoded video data. In AV1, an auto-regressive model is used, while in AVC/H.264, two different models are proposed, including an auto-regressive model.
Even though the following focuses on a non-limiting example based on the AV1 specification, a person of ordinary skill in the art would understand that the proposed processes, apparatuses and computer programs of the present subject disclosure may be implemented based on any video or image processing or coding standard or specification which addresses noise and/or grain processing in an image, in particular through a noise analysis/synthesis model, such as, for example, an auto-regressive (AR) model, and that such proposed processes, apparatuses and computer programs of the present subject disclosure are not limited to the use of any specific video or image processing or coding standard/specification, and in particular to AV1, which is provided as an example only. For example, the proposed processes, apparatuses and computer programs of the present subject disclosure may also be implemented using the AVC/H.264 auto-regressive model.
Further, even though the following focuses on a non-limiting example based on the AV1 specification which uses an AR parametric model, a person of ordinary skill in the art would understand that the proposed processes, apparatuses and computer programs of the present subject disclosure may be implemented based on any analysis/synthesis model suitable for modelling noise (e.g. grain), and that such proposed processes, apparatuses and computer programs of the present subject disclosure are not limited to the use of any specific noise model, and in particular to an AR parametric model for noise, which is provided as an example only.
In the present subject disclosure, reference is made to the AV1 specification which is available at the URL https://aomediacodec.github.io/av1-spec/av1-spec.pdf, and incorporated by reference in its entirety in the present subject disclosure.
As part of the analysis/synthesis scheme used therein for grain processing, the AV1 video codec specification considers an auto-regressive (AR) process for modeling the film grain pattern of an input image, according to which each noise pixel is modelled by a random variable based on the noise pixels in a causal neighborhood. The grain model specified for AV1 assumes that each input video frame can be modelled as a combination (e.g. an addition) of a signal without noise and a noise frame (also referred to as a noise image) corresponding to a zero-average noise that follows an AR process. Each pixel of the noise frame can be modelled by a random variable which depends on the random variables respectively associated with previously generated neighboring pixels of the noise frame.
For example, for an AR model with a lag of 2, i.e. a causal neighborhood extending over the two previous rows and two previous columns, each noise pixel may be expressed as:

X_{i,j} = ϵ_{i,j} + φ_1 X_{i,j−1} + φ_2 X_{i,j−2} + φ_3 X_{i−1,j+2} + φ_4 X_{i−1,j+1} + φ_5 X_{i−1,j} + φ_6 X_{i−1,j−1} + φ_7 X_{i−1,j−2} + φ_8 X_{i−2,j+2} + φ_9 X_{i−2,j+1} + φ_{10} X_{i−2,j} + φ_{11} X_{i−2,j−1} + φ_{12} X_{i−2,j−2},
wherein φ_1, …, φ_P are the linear combination parameters of the auto-regressive parametric noise model, and ϵ_{i,j} is a random number drawn from an independent and identically distributed Gaussian distribution of standard deviation σ. Another parameter of the AR parametric noise model that may be used is one or more seeds of the pseudo-random number generator used for generating the random number ϵ_{i,j}. In some embodiments, the seed may be chosen at the encoder side and transmitted, along with the auto-regressive parametric noise model linear combination parameters φ_1, …, φ_P and the standard deviation σ, to be used at the decoder side with a pseudo-random number generator, for example as specified by the AV1 codec specification.
In one or more embodiments, the parameters of the AR parametric noise model may therefore include the number P of linear combination coefficients, the linear combination coefficients (φ_1, φ_2, …, φ_P), a standard deviation σ (or variance) of the Gaussian noise to be used for drawing the random number ϵ_{i,j}, and a seed value.
Based on these parameters, a current pixel value X_{i,j} may be generated by adding a random value of a Gaussian noise of standard deviation σ (generated based on the seed value) to a linear combination of the P pixel values previously generated for the P pixels of the causal neighborhood, weighted by the linear combination coefficients (φ_1, φ_2, …, φ_P).
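The following Python sketch illustrates this generation step for the lag-2 neighborhood of the equation above. It is a simplified illustration only: the AV1 specification defines fixed-point arithmetic, template borders, and chroma handling that are not reproduced here, and all function and parameter names are of our choosing.

```python
import numpy as np

# Causal lag-2 neighborhood offsets, listed in the order of the
# coefficients phi_1 .. phi_12 of the equation above.
NEIGHBORHOOD = [(0, -1), (0, -2),
                (-1, 2), (-1, 1), (-1, 0), (-1, -1), (-1, -2),
                (-2, 2), (-2, 1), (-2, 0), (-2, -1), (-2, -2)]

def synthesize_ar_noise(phi, sigma, seed, height=64, width=64):
    """Generate a zero-mean AR noise template in raster-scan order."""
    assert len(phi) == len(NEIGHBORHOOD)
    rng = np.random.default_rng(seed)  # seeded PRNG, as transmitted to the decoder
    pad = 2  # border so that the causal neighborhood is always defined
    x = np.zeros((height + pad, width + 2 * pad))
    for i in range(pad, height + pad):
        for j in range(pad, width + pad):
            eps = rng.normal(0.0, sigma)  # Gaussian innovation epsilon_{i,j}
            x[i, j] = eps + sum(p * x[i + di, j + dj]
                                for p, (di, dj) in zip(phi, NEIGHBORHOOD))
    return x[pad:, pad:width + pad]
```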
The grain synthesis algorithm specified for AV1 has been defined with computing efficiency in mind. The computing cost of synthesizing a full-size grain image can indeed become high when processing an ultra HD image. As such, the AV1 specification for grain synthesis at the decoder does not require generating grain pixels for the full pixel size of the image (or video frame) under consideration. Instead, a noise template (which may also be referred to as a noise “pattern”) of a predetermined size smaller than that of the image, e.g. 64×64 pixels, may be generated and used as a template for generating noise patches of a smaller size, such as 32×32 pixels. A plurality of noise patches may be chosen at random coordinates in the 64×64 pixel noise template, and copied onto the image, which will have been previously divided into blocks of the same size as the noise patches (e.g. blocks of 32×32 pixels). FIG. 3 illustrates the selection of a patch (e.g. of size 32×32 pixels) at random coordinates in a grain template (e.g. of size 64×64 pixels).
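A sketch of this template-to-image copying step, with hypothetical function and parameter names (the actual AV1 synthesis additionally supports overlap blending between patches and chroma-specific handling, which are omitted here):

```python
import numpy as np

def apply_grain_template(image, template, patch=32, seed=1234):
    """Tile the image with patch-sized blocks, each copied from random
    coordinates inside the noise template, then added to the image."""
    rng = np.random.default_rng(seed)
    out = image.astype(np.float64)
    th, tw = template.shape
    for y in range(0, image.shape[0], patch):
        for x in range(0, image.shape[1], patch):
            ty = int(rng.integers(0, th - patch + 1))  # random patch origin
            tx = int(rng.integers(0, tw - patch + 1))
            h = min(patch, image.shape[0] - y)  # handle partial border blocks
            w = min(patch, image.shape[1] - x)
            out[y:y + h, x:x + w] += template[ty:ty + h, tx:tx + w]
    return np.clip(out, 0, 255).astype(np.uint8)
```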
Therefore, advantageously, the noise of an image to be processed may be analyzed based on a noise template of a smaller size than that of the image, in order to lower the processing time and/or complexity of this aspect of the processing. For example, the noise of an image may be analyzed based on an analysis scheme applied to one or more 64×64 pixel noise templates determined based on the image.
For grain processing at the encoder (analysis) or at the decoder (synthesis), a processing sequence may define the order in which the grain pixels of the image are analyzed/synthesized one after another. Because the pixels of the image may be analyzed at the encoder according to a predetermined sequence to determine the grain model parameters, the grain pixels may be synthesized at the decoder according to the same processing sequence. For example, a raster scan sequence may scan the pixels of the image starting from the pixel located at the upper left corner of the image (represented by a pixel matrix), and progress to the pixel adjacent to the previously scanned pixel, located to its right. At the end of a line of pixels, the sequence proceeds to the next line, scanning the pixels from left to right. An image divided into blocks may be scanned in the same manner, with blocks being processed according to the processing sequence instead of pixels.
As discussed above, a noise template can therefore be progressively generated according to a processing sequence of the grain analysis/synthesis, such as a raster scan sequence.
Further, the AV1 specification provides that the grain parameters can be adjusted as a function of luminance, to better model the behavior of actual film grain. For instance, film grain tends to be less prominent at high luminance than at medium luminance. AV1 allows specifying several luminance intervals and sending a scaling factor per luminance interval, from which a piecewise linear scaling function can be obtained. Although not specifically mentioned in the remainder of this document, the proposed methods, apparatuses and computer programs are compatible with this feature.
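For illustration, such a piecewise linear scaling function may be evaluated as in the following sketch; the names are of our choosing, and the AV1 specification defines its own fixed-point representation of the scaling points:

```python
import numpy as np

def apply_luminance_scaling(decoded_luma, grain, points):
    """Scale the grain per pixel according to a piecewise linear function
    defined by (luminance, scaling_factor) pairs sorted by luminance."""
    xs = np.array([p[0] for p in points], dtype=np.float64)
    ys = np.array([p[1] for p in points], dtype=np.float64)
    scale = np.interp(decoded_luma.astype(np.float64), xs, ys)
    return decoded_luma.astype(np.float64) + scale * grain
```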
Although the above-described analysis/synthesis scheme provides a good solution to the grain compression problem, the synthesis algorithm defined in the AV1 specification has some drawbacks due to the use of a noise template based on which a grain image is generated by pixel-copying of noise patches randomly selected in the noise template.
Various methods have been developed for estimating the grain parameters when using an AR model for the grain, that is, for estimating the AR model parameters (including the AR model linear combination parameters (φ_1, …, φ_P) and the AR model variance or standard deviation parameter (σ)). For example, the Yule-Walker method may be used, as it is well suited to the estimation of AR model parameters, its complexity is reasonable, and it usually provides satisfactory results.
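As an illustration of the Yule-Walker approach, the following sketch fits a one-dimensional AR(p) model to the rows of a noise image. The one-dimensional simplification and all names are assumptions of this sketch; the two-dimensional causal-neighborhood fit used for film grain solves normal equations of the same structure, with autocovariances computed over the 2-D neighborhood.

```python
import numpy as np

def yule_walker_1d(noise_rows, order=2):
    # Concatenate zero-mean rows into one series (row boundaries are
    # ignored for simplicity in this sketch).
    x = np.concatenate([np.asarray(row, dtype=np.float64) - np.mean(row)
                        for row in noise_rows])
    n = len(x)
    # Biased autocovariance estimates r_0 .. r_order.
    r = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(order + 1)])
    # Solve the Toeplitz normal equations R * phi = r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    phi = np.linalg.solve(R, r[1 : order + 1])
    sigma2 = r[0] - np.dot(phi, r[1 : order + 1])  # innovation variance
    return phi, float(np.sqrt(max(sigma2, 0.0)))
```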
Conventional grain analysis methods have in common the use of denoising or, as the case may be, of a “de-graining” operation, to generate a denoised image based on which the grain parameters are estimated.
This functional architecture requires implementing a preprocessing engine for denoising, which has a cost in terms of computing resources.
In addition, the efficiency of a conventional grain analysis method that uses denoising typically relies heavily on the efficiency of the denoising operation. As a consequence, if the denoising of the original image is not efficient enough, the remaining noise in the denoised image impairs the accuracy of the grain parameters estimation. As a result, the grain strength may be underestimated, which may lead to unsatisfactory grain restitution for the end-user. Further, some denoisers may generate undesirable artifacts in the original image signal, such as softness. Moreover, highly accurate denoisers are very complex and involve high CPU usage.
The present subject disclosure provides image processing methods that address these drawbacks of conventional grain analysis methods, and proposes a new paradigm for grain parameters estimation which offers specific advantages as described below.
In one or more embodiments, the proposed encoder 10a may comprise a first encoding and decoding engine 14a, a grain parameters computation engine 12a, and a main encoder engine 13a. The first encoding and decoding engine 14a may be configured to generate a decoded image by encoding the input image and decoding the encoded data so generated, and to provide the decoded image to the grain parameters computation engine 12a.
In some embodiments, the first encoding and decoding engine 14a may perform operations that are similar to the operations performed by the encoder engine 13a with respect to the encoding to generate encoded data followed by the decoding of the encoded data.
In some embodiments, the computation complexity of operations performed by the first encoding and decoding engine 14a and/or the processing latency induced by the first encoding and decoding engine 14a may advantageously be reduced by using at the first encoding and decoding engine 14a an encoding algorithm which is faster and/or of lower quality than the encoding algorithm used at the encoder engine 13a of the encoder 10a.
The grain parameters computation engine 12a may be configured to receive the input image data and the decoded image data generated by the first encoding and decoding engine 14a, and to output grain parameters data determined based thereon.
In embodiments in which the noise model used for modelling the grain is an AR parametric model, the grain parameters data may include data corresponding to linear combination parameters (φ1, φ2, . . . , φp), noise variance (or, depending on the embodiment, standard deviation) parameter (σ), and seed parameter of the AR model.
As discussed above, the grain parameters computation engine 12a may perform operations that are similar to those described in relation to the conventional grain parameters computation engine 12, with the decoded image used in place of a denoised image.
In one or more embodiments, the first encoding and decoding engine 14a may be configured to operate on the input image as a denoiser according to a noise model, and as such output data that represents an estimate of the noise according to the model. For example, in some embodiments, the first encoding and decoding unit 14a may be configured as a video encoder which operates as a motion compensated temporal filter (MCTF). MCTF is a technology which is widely accepted as an efficient denoising scheme. In some embodiments, the use of MCTF may advantageously be combined with scaling of the transformed coefficients (obtained in the encoding loop by applying a transform on pixel residuals) in order to provide a finer control of the denoising of the input image.
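As a rough illustration of coefficient scaling, the following sketch attenuates the high-frequency transform coefficients of a pixel block. This is an assumption-laden stand-in: a real encoder scales residual coefficients inside its coding loop rather than whole decoded blocks, and the attenuation mask shown here is arbitrary.

```python
import numpy as np
from scipy.fft import dctn, idctn

def scale_transform_coefficients(block: np.ndarray, strength: float = 0.5) -> np.ndarray:
    """Attenuate high-frequency DCT coefficients of a block, a crude
    illustration of how coefficient scaling can temper the noise level."""
    coeffs = dctn(block.astype(np.float64), norm="ortho")
    h, w = coeffs.shape
    # Simple frequency-dependent attenuation mask (illustrative only).
    fy, fx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    mask = 1.0 / (1.0 + strength * (fy + fx) / (h + w))
    return idctn(coeffs * mask, norm="ortho")
```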
As will be appreciated by the skilled person, the use of a first encoding and decoding engine 14a that is different from the main encoder engine 13a advantageously provides some flexibility to choose the encoding scheme used at the first encoding and decoding unit 14a, which may be different from the encoding scheme used at the main encoder engine 13a. This flexibility may advantageously be leveraged by choosing an encoding algorithm used at the first encoding and decoding unit 14a that is better suited for providing the desired performance for the grain parameters estimation, given other constraints (e.g. latency, computation complexity, etc.) imposed on the encoder side. For example, as discussed above, this flexibility may be leveraged by choosing an encoding algorithm used at the first encoding and decoding unit 14a that is faster, of lower quality but also lower complexity, and/or better suited in terms of denoising performance than the encoding scheme used at the main encoder engine 13a.
In one or more embodiments, the grain parameters computation unit 12a may be configured to obtain grain parameters based on the input image data and the decoded image data, and to provide such grain parameters to the encoder 13a.
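In other words, the decoded image plays the role of the denoised image, so the noise layer may be estimated as the difference between the input image and the decoded image, and the model fitted on that difference. A sketch, reusing the hypothetical yule_walker_1d helper from the sketch above:

```python
import numpy as np

def grain_parameters_from_decoded(original, decoded, order=2):
    # The decoded image stands in for a denoised reference, so no
    # dedicated denoiser is needed: noise ~ original - decoded.
    noise = original.astype(np.float64) - decoded.astype(np.float64)
    phi, sigma = yule_walker_1d(list(noise), order=order)
    return {"phi": phi, "sigma": sigma}
```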
As discussed above in connection with the encoded stream generated by the encoder engine 13 and output from the conventional encoder 10, the encoded stream output from the encoder 10a may comprise the encoded image data together with the corresponding grain parameters data.
Depending on the embodiment, various methods for obtaining the grain parameters may be implemented at the grain parameters computation unit 12a, which may or may not involve the computation of such grain parameters or the generation of corresponding grain templates, thereby avoiding the drawbacks associated therewith. For example, in one or more embodiments, the grain parameters computation unit 12a may be configured to operate in the same manner as the conventional grain parameters computation unit 12 described above, with the decoded image provided as input in place of a denoised image.
Therefore, the grain parameters computation unit of the present subject disclosure (such as the grain parameters computation engine 12a) may advantageously determine the grain parameters based on the input image and the decoded image, without requiring that a denoised version of the input image be generated by a dedicated denoiser.
In one or more embodiments, the proposed encoder 10b may comprise an encoder engine 13b, a grain parameters computation engine 12b, and a grain parameters insertion and/or update engine 15b. In contrast with the encoder 10a, the encoder 10b may not comprise a first encoding and decoding engine that is separate from the main encoder engine: the decoded image used for estimating the grain parameters is generated by the encoder engine 13b itself, as part of the encoding of the input image. In contrast with the conventional encoder engine 13, the encoder engine 13b may be configured to output, in addition to the encoded stream, the decoded image generated as part of the encoding of the input image.
As discussed above, the encoding of the input image may comprise decoding encoded data generated based on the input image. In one or more embodiments, the encoding of the input image performed by the encoder engine 13b may be performed according to a conventional encoding scheme which generates a decoded image based on encoded data, for example according to a standard video codec specification, such as the H.261, MPEG-1 Part 2, H.262, MPEG-2 Part 2, AV1, H.264/AVC, H.265/HEVC, MPEG-4 Part 2, SHVC (Scalable HEVC), H.266/VVC, and MPEG-5 EVC standards. Such a conventional encoding scheme may typically involve dividing the input image into a set of blocks and encoding the blocks according to a block encoding sequence. The encoding of a block may comprise generating encoded data based on the block, for example according to a predictive coding algorithm, with the encoded data being generated based on previously encoded then decoded blocks or images. That is, in some embodiments, a decoded image resulting from the decoding of an encoded version of the input image may typically be generated as part of an encoding scheme using predictive coding implemented by the encoder engine 13b for the encoding of the input image. In the encoder 10b, this decoded image may be provided to the grain parameters computation engine 12b.
The use of the main encoder engine 13b advantageously reduces the complexity of the encoder 10b, as compared to the encoder 10a, at the cost of the loss in flexibility for selecting an encoding scheme for purposes of generating a decoded version of the input image that may be different from the encoding scheme used for producing the encoded stream, as discussed above.
The grain parameters computation engine 12b may be configured to receive the input image data and the decoded image data generated by the encoder engine 13b, and to output grain parameters data determined based thereon.
In embodiments in which the noise model used for modelling the grain is an AR parametric model, the grain parameters data may include data corresponding to linear combination parameters (φ1, φ2, . . . , φp), noise variance (or, depending on the embodiment, standard deviation) parameter (σ), and seed parameter of the AR model.
In one or more embodiments, the grain parameters computation engine 12b may perform operations that are similar to those described above in relation to the grain parameters computation engine 12.
In one or more embodiments, the grain parameters computation unit 12b may be configured to obtain grain parameters based on the input image data and the decoded image data, and to provide such grain parameters to a grain parameters insertion or update engine 15b.
The grain parameters insertion and/or update engine 15b may be configured to receive grain parameters from the grain parameters computation engine 12b, and to receive an encoded stream from the encoder engine 13b. In embodiments where the received encoded stream does not include grain parameters data, the grain parameters insertion and/or update engine 15b may further be configured to insert grain parameters data in the received encoded stream, based on the grain parameters data received from the grain parameters computation engine 12b. In embodiments where the received encoded stream includes default grain parameters data, the grain parameters insertion and/or update engine 15b may further be configured to update the default grain parameters included in the encoded stream, based on the grain parameters data received from the grain parameters computation engine 12b. Therefore, depending on whether the encoded stream includes grain parameters data or not, the grain parameters insertion and/or update engine 15b may be configured to insert and/or replace grain parameters data in the encoded stream, based on the grain parameters data received from the grain parameters computation engine 12b. As a result, in one or more embodiments, the grain parameters insertion and/or update engine 15b may be configured to generate an encoded stream output from the encoder 10b that includes grain parameters data, as discussed above in connection with the encoded stream output from the conventional encoder 10.
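The insertion-or-update logic itself can be summarized by the following sketch. It is purely illustrative: the EncodedStream container and its fields are hypothetical, and in a real bitstream the grain parameters live in codec-specific syntax elements (e.g. the AV1 film_grain_params) rather than in a Python object.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class EncodedStream:
    payload: bytes                       # encoded image data
    grain_params: Optional[dict] = None  # None if the encoder wrote no grain syntax

def include_grain_params(stream: EncodedStream, estimates: dict) -> EncodedStream:
    if stream.grain_params is None:
        stream.grain_params = dict(estimates)   # insertion case
    else:
        stream.grain_params.update(estimates)   # update of default parameters
    return stream
```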
In one or more embodiments, the proposed encoder 10c may comprise an encoder engine 13c and a grain parameters computation engine 12c. In contrast with the encoder 10b, the encoder 10c may not comprise a grain parameters insertion and/or update engine: the grain parameters computed by the grain parameters computation engine 12c are provided to the encoder engine 13c, which includes them in the encoded stream as part of the encoding of the input image. In contrast with the conventional encoder engine 13, the encoder engine 13c may be configured to output, in addition to the encoded stream, the decoded image generated as part of the encoding of the input image.
As discussed above, the encoding of the input image may comprise decoding encoded data generated based on the input image. In one or more embodiments, the encoding of the input image performed by the encoder engine 13c may be performed according to a conventional encoding scheme which generates a decoded image based on encoded data, for example according to a standard video codec specification, such as the H.261, MPEG-1 Part 2, H.262, MPEG-2 Part 2, AV1, H.264/AVC, H.265/HEVC, MPEG-4 Part 2, SHVC (Scalable HEVC), H.266/VVC, and MPEG-5 EVC standards. Such a conventional encoding scheme may typically involve dividing the input image into a set of blocks and encoding the blocks according to a block encoding sequence. The encoding of a block may comprise generating encoded data based on the block, for example according to a predictive coding algorithm, with the encoded data being generated based on previously encoded then decoded blocks or images. That is, in some embodiments, a decoded image resulting from the decoding of an encoded version of the input image may typically be generated as part of an encoding scheme using predictive coding implemented by the encoder engine 13c for the encoding of the input image. In the encoder 10c, this decoded image may be provided to the grain parameters computation engine 12c.
The use of the main encoder engine 13c advantageously reduces the complexity of the encoder 10c, as compared to the encoder 10a, at the cost of the loss in flexibility for selecting an encoding scheme for purposes of generating a decoded version of the input image that may be different from the encoding scheme used for producing the encoded stream, as discussed above. The use of the main encoder engine 13c also advantageously reduces the complexity of the encoder 10c, as compared to the encoder 10b, in view of the absence of a grain parameters insertion and/or update unit at the encoder 10c, and the fact that no default grain parameters may be needed at the encoder 10c.
The grain parameters computation engine 12c may be configured to receive the input image data and the decoded image data generated by the encoder engine 13c, and to output grain parameters data determined based thereon.
In embodiments in which the noise model used for modelling the grain is an AR parametric model, the grain parameters data may include data corresponding to linear combination parameters (φ1, φ2, . . . , φp), noise variance (or, depending on the embodiment, standard deviation) parameter (σ), and seed parameter of the AR model.
In one or more embodiments, the grain parameters computation engine 12c may perform operations that are similar to those described above in relation to the grain parameters computation engine 12.
In one or more embodiments, and in contrast with the encoder 10b, the grain parameters computation unit 12c may be configured to provide the grain parameters directly to the encoder engine 13c, and the encoder engine 13c may be configured to include the grain parameters in the encoded stream generated as part of the encoding of the input image.
Therefore, in one or more embodiments (such as in the encoders 10b and 10c described above), a same encoder engine may be used both to generate the decoded image based on which the noise parameters are estimated and to generate the encoded stream.
The proposed schemes, for example according to the embodiments described above, may implement an image processing method such as the exemplary method described in the following.
An image which is to be processed (e.g. encoded) for noise analysis (e.g. grain analysis), for example an image of a video sequence or, more generally, of a media stream (an image of a video sequence may sometimes be referred to as a “video frame”), is considered as input of the proposed process, and may be referred to interchangeably in the following as the “original” image or the “input” image.
The proposed method comprises generating 101 a decoded image by first encoding the input image, wherein the first encoding comprises decoding encoded data generated based on the input image.
In one or more embodiments, the decoded image may be used for determining 102 estimates of parameters of a parametric model of noise contained in the input image. Using a decoded image obtained during encoding of the input image, instead of a denoised image, for determining estimates of noise model parameters advantageously avoids the drawbacks associated with the denoising processing of the input image, including the ones discussed above.
The determined estimates of the parameters of the noise model may then be included 103 in an encoded stream generated by second encoding the image.
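Putting the three steps together, the following sketch outlines the proposed processing; the blur standing in for the first encode/decode round trip and the placeholder payload for the second encoding are assumptions of this sketch, not part of the proposed method.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def first_encode_then_decode(image: np.ndarray) -> np.ndarray:
    # Stand-in for step 101: a lossy encode/decode round trip tends to
    # attenuate grain, mimicked here by a small box filter (assumption).
    return uniform_filter(image.astype(np.float64), size=3)

def process_image(image: np.ndarray):
    decoded = first_encode_then_decode(image)      # step 101
    noise = image.astype(np.float64) - decoded     # noise ~ input - decoded
    params = {"sigma": float(noise.std())}         # step 102: model estimate
    # Step 103: a real implementation would run the second encoding and
    # include `params` in its output stream; a tuple stands in here.
    return (b"encoded-payload-placeholder", params)
```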
In some embodiments, the parametric model may be chosen to model grain contained in the input image, in which case the proposed scheme may advantageously be used to process grain. For example, as discussed above, an AR parametric model may be chosen for its suitability to model grain contained in an image.
In some embodiments, the second encoding may comply with any image and/or video coding standard. For example, the second encoding may be performed according to an AV1 or AVC/H.264 encoding. In such embodiments, the proposed scheme may advantageously allow using a standard video codec as specified, and therefore be backward compatible with existing video coders. The second encoding may be performed by an encoder which is distinct from the encoder performing the first encoding, in which case the proposed scheme may only impact the first encoding, while the second encoding is chosen based on encoding constraints, for example as a standard video codec.
In some embodiments, the including the estimates of parameters of the parametric model of noise in the encoded stream may be performed as part of the second encoding of the image. In particular, an encoder engine configured for performing the second encoding of the image may be configured for generating, based on input image data and noise parameters data, an encoded stream as part of the encoding of the input image.
In some embodiments, the second encoding may be performed according to encoding parameters that are different from the encoding parameters according to which the first encoding is performed. For example, as in the encoder 10a described above, the first encoding may use a faster and/or lower-quality encoding algorithm than the second encoding.
In addition, in some embodiments, the first encoder and second encoder engines may be separate, in that the first encoding and the second encoding may be performed as part of different encoding processing instances. Depending on the embodiment, the first encoding processing instance and the second encoding processing instance may be run, in full or in part, in parallel.
In other embodiments, the second encoding may be performed according to encoding parameters that are the same as the encoding parameters according to which the first encoding is performed. For example, as in the encoders 10b and 10c described above, the first encoding and the second encoding may be performed by a same encoder engine, as part of a same encoding processing instance.
In one or more embodiments, the estimates of parameters of the parametric model of noise may be inserted in the encoded stream. This is advantageous in the cases where a single encoder is used to generate the decoded image and to generate the encoded stream, and the encoded stream generated by the encoder does not include noise parameters data.
In other embodiments, the estimates of parameters of the parametric model of noise may be used to update default noise parameters comprised in the encoded stream. This is advantageous in the cases where a single encoder is used to generate the decoded image and to generate the encoded stream, and the encoded stream generated by the encoder includes noise parameters data which however needs updating in order to take into account the determined estimates of parameters of the parametric model of noise.
An exemplary architecture of an apparatus, such as a processing node or a video encoder, according to the present subject disclosure is described in the following.
The apparatus 1, which may comprise one or more computers, includes a control engine 2, an image processing engine 3, a data interface engine 4, and a memory 5. In this architecture, the image processing engine 3, the data interface engine 4, and the memory 5 are operatively coupled with one another through the control engine 2.
In some embodiments, the image processing engine 3 is configured to perform various aspects of embodiments of one or more of the proposed methods for image processing as described herein, such as determining noise parameters related to a received input image, and generating an encoded stream based on the input image. Depending on the embodiment, the image processing engine 3 may be configured to include an encoder such as the encoder 10a, the encoder 10b, or the encoder 10c described above.
In one or more embodiments, the image processing engine 3 may comprise an encoder engine and a noise parameters computation engine, with the encoder engine configured to perform the first encoding and the second encoding described above as part of a same encoding processing instance for encoding an input image, and the encoder engine being further configured to output the encoded stream as part of the encoding processing instance, and to output the decoded image to the noise parameters computation engine as part of the encoding processing instance. The noise parameters computation engine may be configured to determine the estimates of parameters of the parametric model of noise contained in the image based on the decoded image as described above.
For example, in some embodiments, the image processing engine 3 may comprise a first encoding and decoding engine, a noise parameters computation engine, and an encoder engine configured as described above in connection with the encoder 10a.
In other embodiments, the image processing engine 3 may comprise an encoder engine, a noise parameters computation engine, and a noise parameters insertion and/or update engine configured as described above in connection with the encoder 10b.
In yet other embodiments, the image processing engine 3 may comprise an encoder engine and a noise parameters computation engine configured as described above in connection with the encoder 10c.
In particular, in some embodiments, the image processing engine 3 may comprise an encoder engine configured to perform a first encoding and a second encoding according to embodiments of the proposed method as part of a same encoding processing instance for encoding the image. In some embodiments, the encoder engine may further be configured to generate and output an encoded stream further to the encoding of an input image and to generate and output a decoded image further to the encoding of the input image.
In some embodiments, the data interface engine 4 is configured to receive an input image, possibly as part of input video data, and to output an encoded stream, under the control of the image processing engine 3.
The control engine 2 includes a processor, which may be any suitable microprocessor, microcontroller, Field Programmable Gate Array (FPGA), Application Specific Integrated Circuit (ASIC), Digital Signal Processing chip, and/or state machine, or a combination thereof. According to various embodiments, one or more of the computers can be configured as a multi-processor computer having multiple processors for providing parallel computing. The control engine 2 may also comprise, or may be in communication with, computer storage media, such as, without limitation, the memory 5, capable of storing computer program instructions or software code that, when executed by the processor, causes the processor to perform the elements described herein. In addition, the memory 5 may be any type of data storage or computer storage medium, coupled to the control engine 2 and operable with the data interface engine 4 and the image processing engine 3 to facilitate management of data stored in association therewith, such as, for example, a cache memory, a data farm, a data warehouse, a data mart, a datacenter, a data cloud, or a combination thereof.
In embodiments of the present subject disclosure, the apparatus 1 is configured for performing one or more of the image processing methods described herein. The apparatus 1 may in some embodiments be included in an image encoder or, depending on the embodiments, in a video encoder or a video codec.
It will be appreciated that the apparatus 1 shown and described above is given by way of example only. Depending on the embodiment, the apparatus may include additional components, or the engines described above may be combined or distributed differently, while still implementing the methods proposed in the present subject disclosure.
The proposed method may be used for processing, for purposes of encoding or compression, input data which may correspond, depending on the embodiment, to an image, a picture, a video frame, or video data.
While the invention has been described with respect to preferred embodiments, those skilled in the art will readily appreciate that various changes and/or modifications can be made to the invention without departing from the spirit or scope of the invention as defined by the appended claims.
Although this invention has been disclosed in the context of certain preferred embodiments, it should be understood that certain advantages, features and aspects of the systems, devices, and methods may be realized in a variety of other embodiments. Additionally, it is contemplated that various aspects and features described herein can be practiced separately, combined together, or substituted for one another, and that a variety of combination and sub-combinations of the features and aspects can be made and still fall within the scope of the invention. Furthermore, the systems and devices described above need not include all of the modules and functions described in the preferred embodiments.
Information and signals described herein can be represented using any of a variety of different technologies and techniques. For example, data, instructions, commands, information, signals, bits, symbols, and chips can be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.
Depending on the embodiment, certain acts, events, or functions of any of the methods described herein can be performed in a different sequence, may be added, merged, or left out all together (e.g., not all described acts or events are necessary for the practice of the method). Moreover, in certain embodiments, acts or events may be performed concurrently rather than sequentially.