The present invention relates generally to images. More particularly, an embodiment of the present invention relates to generating a family of reshaping functions for HDR imaging which satisfy both continuity and reversibility constraints.
As used herein, the term ‘dynamic range’ (DR) may relate to a capability of the human visual system (HVS) to perceive a range of intensity (e.g., luminance, luma) in an image, e.g., from darkest grays (blacks) to brightest whites (highlights). In this sense, DR relates to a ‘scene-referred’ intensity. DR may also relate to the ability of a display device to adequately or approximately render an intensity range of a particular breadth. In this sense, DR relates to a ‘display-referred’ intensity. Unless a particular sense is explicitly specified to have particular significance at any point in the description herein, it should be inferred that the term may be used in either sense, e.g. interchangeably.
As used herein, the term high dynamic range (HDR) relates to a DR breadth that spans the 14-15 orders of magnitude of the human visual system (HVS). In practice, the DR over which a human may simultaneously perceive an extensive breadth in intensity range may be somewhat truncated, in relation to HDR.
In practice, images comprise one or more color components (e.g., luma Y and chroma Cb and Cr) wherein each color component is represented by a precision of n-bits per pixel (e.g., n=8). Using linear or gamma luminance coding, images where n ≤ 8 (e.g., color 24-bit JPEG images) are considered images of standard dynamic range, while images where n > 8 may be considered images of enhanced or high dynamic range. HDR images may also be stored and distributed using high-precision (e.g., 16-bit) floating-point formats, such as the OpenEXR file format developed by Industrial Light and Magic.
Most consumer desktop displays currently support luminance of 200 to 300 cd/m² (nits). Most consumer HDTVs range from 300 to 500 nits, with new models reaching 1,000 nits (cd/m²). Such conventional displays thus typify a lower dynamic range (LDR), also referred to as a standard dynamic range (SDR), in relation to HDR. As the availability of HDR content grows due to advances in both capture equipment (e.g., cameras) and HDR displays (e.g., the PRM-4200 professional reference monitor from Dolby Laboratories), HDR content may be color graded and displayed on HDR displays that support higher dynamic ranges (e.g., from 1,000 nits to 5,000 nits or more).
In a traditional image pipeline, captured images are quantized using a non-linear opto-electronic transfer function (OETF), which converts linear scene light into a non-linear video signal (e.g., gamma-coded RGB or YCbCr). Then, on the receiver, before being displayed on the display, the signal is processed by an electro-optical transfer function (EOTF) which translates video signal values to output screen color values. Such non-linear functions include the traditional “gamma” curve, documented in ITU-R Rec. BT.709 and BT.2020, the “PQ” (perceptual quantization) curve described in SMPTE ST 2084, and the “Hybrid Log-Gamma” or “HLG” curve described in Rec. ITU-R BT.2100.
As used herein, the term “reshaping” or “remapping” denotes a process of sample-to-sample or codeword-to-codeword mapping of a digital image from its original bit depth and original codeword distribution or representation (e.g., gamma, PQ, or HLG, and the like) to an image of the same or different bit depth and a different codeword distribution or representation. Reshaping allows for improved compressibility or improved image quality at a fixed bit rate. For example, without limitation, forward reshaping may be applied to 10-bit or 12-bit PQ-coded HDR video to improve coding efficiency in a 10-bit video coding architecture. In a receiver, after decompressing the received signal (which may or may not be reshaped), the receiver may apply an inverse (or backward) reshaping function to restore the signal to its original codeword distribution and/or to achieve a higher dynamic range.
Reshaping can be static or dynamic. In static reshaping, a single reshaping function is generated and is used for a single stream or across multiple streams. In dynamic reshaping, the reshaping function may be customized based on the input video stream characteristics, which can change at the stream level, the scene level, or even at the frame level. Dynamic reshaping is preferable; however, certain devices may not have enough computational power to support it. As appreciated by the inventor here, improved techniques for efficient image reshaping when displaying video content, especially HDR content, are desired.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, issues identified with respect to one or more approaches should not be assumed to have been recognized in any prior art on the basis of this section, unless otherwise indicated.
An embodiment of the present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:
Image reshaping techniques for the efficient coding of images are described herein. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are not described in exhaustive detail, in order to avoid unnecessarily occluding, obscuring, or obfuscating the present invention.
Example embodiments described herein relate to image reshaping. In an embodiment, in an apparatus comprising one or more processors, a processor receives input pairs of reference images in HDR and SDR. Given an initial set of forward reshaping functions generated using the reference HDR and SDR images, sets of output forward and backward reshaping functions are constructed by a) using the initial set of forward reshaping functions to generate a first set of corresponding backward reshaping functions; b) generating a second set of backward reshaping functions, wherein each backward reshaping function is represented using a multi-segment polynomial representation with a common set of pivot points; c) generating the output set of backward reshaping functions by optimizing the polynomial representation of the second set of backward reshaping functions to minimize the gap in output values between consecutive segments; and d) using the output set of backward reshaping functions to generate the output set of forward reshaping functions by minimizing the distance between the reference HDR values and reconstructed HDR values.
In an embodiment to generate an output backward reshaping function wherein gap values between segments are minimized, a processor receives a first set of input images in a first dynamic range (e.g., SDR) and a second set of input images in a second dynamic range (e.g., HDR), wherein corresponding pairs between the first set and the second set of input images represent an identical scene. The processor:
According to some embodiments, there is provided a method for generating a reshaping function with a processor, the method comprising:
The first set of backward reshaping functions may be generated from an initial set of the forward reshaping functions generated using the first set of input images and the second set of input images.
Each backward reshaping function (e.g., of the first set of backward reshaping functions) may be characterized by, in the sense of being defined by, a reshaping-index parameter and segment parameters.
Each backward reshaping function may be characterized by a different reshaping-index parameter.
Each backward reshaping function and/or forward reshaping function may be computed for characteristics being different among the input images of the second set of input images. The characteristics may comprise a measure or metric of an average luminance of an input image in the second dynamic range. For example, each backward reshaping function (and forward reshaping function) may be computed for a different average luminance of an input image in the second dynamic range.
According to some embodiments, the method may further comprise generating a set of output forward reshaping functions based on the set of output backward reshaping functions, wherein a forward reshaping function maps pixel codewords from the second codeword representation in the second dynamic range to the first codeword representation in the first dynamic range. Each output forward reshaping function of the set of output forward reshaping functions may be characterized by a reshaping-index parameter (e.g. a different reshaping-index parameter) and segment parameters for a segment-based representation of the output forward reshaping functions with a set of common pivots. Generating an output forward reshaping function corresponding to an output backward reshaping function may comprise: for each input codeword in the second dynamic range, identifying a codeword index for the output backward reshaping function which minimizes a difference between codewords generated by the backward reshaping function and the input codeword.
According to some embodiments, the method may further comprise generating a new forward reshaping function by interpolating between two forward reshaping functions of the set of output forward reshaping functions. Thereby, a new forward reshaping function may be generated based on two forward reshaping functions characterized by different reshaping-index parameters.
According to some embodiments, the method may further (or alternatively) comprise generating a new backward reshaping function by interpolating between two backward reshaping functions of the output set of backward reshaping functions. Thereby, a new backward reshaping function may be generated based on two backward reshaping functions characterized by different reshaping-index parameters.
Interpolating between the two backward reshaping functions or the two forward reshaping functions may comprise generating an interpolated set of polynomial coefficients for each segment of the new backward or forward reshaping function, based on a respective set of polynomial coefficients of the two backward or forward reshaping functions. The interpolated set of polynomial coefficients may be generated using linear interpolation, e.g. between corresponding polynomial coefficients of the set of polynomial coefficients of the two backward or forward reshaping functions.
The method may further comprise selecting the two backward reshaping functions or forward reshaping functions based on a measure of an average luminance of an input HDR image (e.g. in the second dynamic range) to be encoded. Each output backward reshaping function and/or each output forward reshaping function may be computed for a different average luminance of an input image in the second dynamic range. The two backward reshaping functions or forward reshaping functions may be selected such that the average luminance of the input HDR image to be encoded lies between the respective measures of average luminance for which the selected two backward or forward reshaping functions are computed. The method may further comprise encoding the input HDR image using the new forward reshaping function.
As described in Ref. [1] and Ref. [2],
Under this framework, given reference HDR content (120), corresponding SDR content (134) (also to be referred to as reshaped content) is encoded and transmitted in a single layer of a coded video signal (144) by an upstream encoding device that implements the encoder-side codec architecture. The SDR content is received and decoded, in the single layer of the video signal, by a downstream decoding device that implements the decoder-side codec architecture. Backward reshaping metadata (152) is also encoded and transmitted in the video signal with the SDR content so that HDR display devices can reconstruct HDR content based on the SDR content and the backward reshaping metadata.
As illustrated in
The forward reshaping function (132) is generated using a forward reshaping function generator (130) based on the reference HDR images (120). Given the forward reshaping function, forward reshaping mapping (132) is applied to the HDR images (120) to generate a reshaped SDR base layer (134). In addition, a backward reshaping function generator (150) may generate a backward reshaping function, which may be transmitted to a decoder as metadata (152).
Examples of backward reshaping metadata representing/specifying the optimal backward reshaping functions may include, but are not necessarily limited to only, any of: inverse tone mapping functions, inverse luma mapping functions, inverse chroma mapping functions, lookup tables (LUTs), polynomials, inverse display management coefficients or parameters, etc. In various embodiments, luma backward reshaping functions and chroma backward reshaping functions may be derived/optimized jointly or separately, and may be derived using a variety of techniques as described in Ref. [2].
The backward reshaping metadata (152), as generated by the backward reshaping function generator (150) based on the SDR images (134) and the target HDR images (120), may be multiplexed as part of the video signal 144, for example, as supplemental enhancement information (SEI) messaging.
In some embodiments, backward reshaping metadata (152) is carried in the video signal as a part of overall image metadata, which is separately carried in the video signal from the single layer in which the SDR images are encoded in the video signal. For example, the backward reshaping metadata (152) may be encoded in a component stream in the coded bitstream, which component stream may or may not be separate from the single layer (of the coded bitstream) in which the SDR images (134) are encoded.
Thus, the backward reshaping metadata (152) can be generated or pregenerated on the encoder side to take advantage of powerful computing resources and offline encoding flows (including but not limited to content adaptive multiple passes, look ahead operations, inverse luma mapping, inverse chroma mapping, CDF-based histogram approximation and/or transfer, etc.) available on the encoder side.
The encoder-side architecture of
In some embodiments, as illustrated in
In addition, a backward reshaping block 158 extracts the backward reshaping metadata (152) from the input video signal, constructs the optimal backward reshaping functions based on the backward reshaping metadata (152), and performs backward reshaping operations on the decoded SDR images (156) based on the optimal backward reshaping functions to generate the backward reshaped images (160) (or reconstructed HDR images). In some embodiments, the backward reshaped images represent production-quality or near-production-quality HDR images that are identical to, or closely/optimally approximate, the reference HDR images (120). The backward reshaped images (160) may be outputted in an output HDR video signal (e.g., over an HDMI interface, over a video link, etc.) to be rendered on an HDR display device.
In some embodiments, display management operations specific to the HDR display device may be performed on the backward reshaped images (160) as a part of HDR image rendering operations that render the backward reshaped images (160) on the HDR display device.
Reshaping can be static or dynamic. In static reshaping, a single reshaping function is generated and is used for a single stream or across multiple streams. In dynamic reshaping, the reshaping function may be customized based on the input video stream characteristics, which can change at the stream level, the scene level, or even at the frame level. For example, in an embodiment, without limitation, one may generate reshaping functions according to a metric of the average luminance value in a frame or a scene, to be referred to as L1-mid. For example, without limitation, in an embodiment with PQ-coded RGB data, L1-mid may represent the average of max(R,G,B) values among all RGB pixels in a region of interest in a frame. In another embodiment, for YCbCr or ICtCp coded data, L1-mid may represent the average of all Y or I values in a region of interest in a frame (e.g., computing the average may exclude letterbox or sidebar areas in a frame).
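As an illustration only, a minimal sketch of how such an L1-mid measure might be computed is shown below; the function names, the array layout, and the region-of-interest handling are assumptions made for the example and are not part of any reference implementation.

```python
import numpy as np

def l1_mid_rgb(frame_rgb, roi=None):
    """Average of per-pixel max(R,G,B) over a region of interest of a PQ-coded RGB frame.

    frame_rgb : ndarray of shape (H, W, 3) holding PQ-coded values.
    roi       : optional (top, bottom, left, right) bounds used to exclude
                letterbox or sidebar areas; defaults to the full frame.
    """
    if roi is not None:
        top, bottom, left, right = roi
        frame_rgb = frame_rgb[top:bottom, left:right, :]
    # Per-pixel maximum over the three color channels, then the frame average.
    return float(np.mean(np.max(frame_rgb, axis=-1)))

def l1_mid_luma(frame_y, roi=None):
    """Average of Y (or I) values for YCbCr- or ICtCp-coded data."""
    if roi is not None:
        top, bottom, left, right = roi
        frame_y = frame_y[top:bottom, left:right]
    return float(np.mean(frame_y))
```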
Thus, for a 12-bit system, one may pre-construct all 4,096 possible functions; however, such an approach is time consuming and also rather impractical due to the huge memory requirements in a real system. In an embodiment, one may select to build a smaller set of L (L < 2^bitdepth) curves as basis reshaping functions, store them in memory, and then generate additional functions for missing mid-luminance values by interpolating between the available L functions during run-time. This may be referred to as “Scalable-Static mode,” since it combines a small set of statically-generated reshaping curves to generate the full set. For example, in an embodiment, for 10-bit signals, one may generate 13 basis forward functions for average luminance values in the set comprising {768, 1024, 1280, 1536, 1792, 2048, 2304, 2560, 2816, 3072, 3328, 3584, 3840}.
In an embodiment, to save bandwidth, a backward reshaping function may be approximated using piece-wise linear or non-linear polynomials, where, in such a representation, polynomial segments are separated by pivot points. To facilitate the interpolation of polynomial coefficients (and avoid extra computation), in an embodiment, the pivot points in all precomputed reshaping functions should be aligned (e.g., common). This allows for much simpler interpolation among polynomials without concern about their pivot points.
Consider a database containing a reference (or “master”) HDR set and multiple SDR sets of images or video clips generated for different L1-mid values; that is, for each HDR image there is a set of corresponding SDR images using Yc0c1 color data (e.g., YCbCr data, with y = Y, c0 = Cb, and c1 = Cr). The SDR images can be generated from the HDR images either manually (with the help of a color grader), automatically (using automatic color mapping algorithms), or by using a combination of computer tools and human interaction.
Let
denote the data values of the i-th pixel at the j-th frame or picture in the SDR database set which is generated from the l-th L1-mid mapping. Denote the number of pixels in each frame as P. Let the bit depth be denoted as HDR_bitdepth for HDR images and as SDR_bitdepth for SDR images; then the number of possible codeword values in the HDR and SDR signals is given by NV = 2^HDR_bitdepth and NS = 2^SDR_bitdepth, respectively.
As used herein, the term “unnormalized pixel value” denotes a value in [0, 2^B − 1], where B denotes the bit depth of the pixel values (e.g., B = 8, 10, or 12 bits). As used herein, the term “normalized pixel value” denotes a pixel value in [0, 1).
In some embodiments, instead of operating at the pixel level, one may operate with average pixel values. For example, one may divide the input signal codewords into M non-overlapping bins (e.g., M = 16, 32, or 64) with equal interval wb (e.g., for 16-bit input data, wb = 65,536/M) to cover the whole normalized dynamic range (e.g., (0, 1]). Then, instead of operating with pixel values, one may operate with the average pixel values within each such bin. Denote the number of bins in the SDR and the HDR signal as MS and MV, respectively, and denote their corresponding intervals as wbS and wbV. When operating at the pixel level, MS = NS and/or MV = NV.
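As a small illustration of this binning, the sketch below computes the average pixel value within each of M equal-width bins, assuming 16-bit unnormalized input codewords; the names are illustrative.

```python
import numpy as np

def bin_averages(codewords, bit_depth=16, num_bins=64):
    """Average pixel value within each of `num_bins` equal-width codeword bins.

    codewords : 1-D array of unnormalized pixel values in [0, 2**bit_depth - 1].
    Returns an array of length `num_bins`; empty bins are left at zero.
    """
    interval = (2 ** bit_depth) / num_bins                    # bin width w_b
    idx = np.minimum((codewords / interval).astype(np.int64), num_bins - 1)
    sums = np.bincount(idx, weights=codewords.astype(np.float64), minlength=num_bins)
    counts = np.bincount(idx, minlength=num_bins)
    averages = np.zeros(num_bins)
    nonempty = counts > 0
    averages[nonempty] = sums[nonempty] / counts[nonempty]
    return averages
```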
Denote the minimal and maximal luma values within the j-th frame or picture in the HDR database as
Denote the minimal and maximal luma values within the j-th frame in the SDR database as
Before building the family of basis reshaping functions, say, with common pivot points, one needs to build individual reshaping functions. In an embodiment, and without limitation, such functions are built using the histogram- or cumulative density function (CDF)-matching approach described in Refs. [1-3]. For completeness, the algorithm is also described herein. The key steps include: a) collecting statistics (histograms) for the images in the database, b) generating a cumulative density function (CDF) for each set, c) applying CDF matching (Ref. [3]) to generate a reshaping function, and d) clipping and smoothing the reshaping function. These steps are depicted in pseudocode in Tables 1 and 2.
In Table 2, the function y = clip3(x, Min, Max) is defined as: y = Min if x < Min, y = Max if x > Max, and y = x otherwise.
In Table 2, the CDF-matching step (STEP 4) can be explained simply as follows. If an SDR codeword xs corresponds to a specific CDF value c in the SDR CDF cs,(l), and an HDR codeword xv corresponds to the same CDF value c in the HDR CDF cv,(l), then it is determined that the SDR value s = xs should be mapped to the HDR value xv. Alternatively, in STEP 4, for each SDR value (xs), one computes the corresponding SDR CDF value (say, c), and then tries to identify, via simple linear interpolation from the existing HDR CDF values, the HDR value (xv) for which cv,(l) = c.
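Since Tables 1 and 2 are not reproduced here, the sketch below illustrates only the CDF-matching idea of STEP 4 (histograms, cumulative sums, and linear interpolation); the clipping and smoothing steps are omitted and all names are illustrative.

```python
import numpy as np

def cdf_match(src_pixels, dst_pixels, src_bits, dst_bits):
    """Map each source codeword to the destination codeword with the same CDF value.

    For each source codeword, its CDF value c is computed, and the destination
    codeword whose CDF equals c is found by simple linear interpolation,
    mirroring the STEP-4 description above.
    """
    n_src, n_dst = 2 ** src_bits, 2 ** dst_bits
    cdf_src = np.cumsum(np.bincount(src_pixels, minlength=n_src)) / src_pixels.size
    cdf_dst = np.cumsum(np.bincount(dst_pixels, minlength=n_dst)) / dst_pixels.size
    lut = np.interp(cdf_src, cdf_dst, np.arange(n_dst, dtype=np.float64))
    return np.clip(np.round(lut), 0, n_dst - 1).astype(np.int64)

# SDR-to-HDR matching, as in the text; a forward (HDR-to-SDR) curve is obtained
# analogously by swapping the roles of the two sets of pixels, e.g.:
# sdr_to_hdr = cdf_match(sdr_luma, hdr_luma, src_bits=10, dst_bits=16)
```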
By repeating the steps in Tables 1 and 2 for each L1-mid value of interest, one can generate the first set of forward reshaping functions,
mapping an HDR input image (or frame) to a corresponding SDR output according to its L1-mid value or any other characteristic that was used to generate the l-th SDR database.
Given the original set of individual forward reshaping functions,
one may now generate corresponding individual backward reshaping functions
where SDR codewords are mapped to reconstructed HDR codewords. An example process is depicted in Table 3.
At this stage, there exists a set of backward reshaping functions
and a set of forward reshaping functions
In an embodiment, the luma backward reshaping function may be represented/approximated by a multi-piece polynomial approximation (e.g., using eight second-order polynomials), where polynomial segments are separated by pivot points. To enable the interpolation among the existing L1-mid functions, it is desirable to have common pivot points for all
functions. In an embodiment, the set of common pivots is generated based on the techniques described in Ref. [6], which is summarized below.
For each SDR codeword b, compute its normalized value
Denote the pivot points as {λm}, m = 0, 1, ..., K, where K denotes the total number of segments. For example, [λm, λm+1) denotes the m-th polynomial segment. The m-th polynomial is selected when the input value, b, is between λm and λm+1. In an embodiment, both b and the λm values are integers in [0, 2^B), where B denotes the SDR bit depth. The m-th second-order polynomial of the l-th L1-mid reshaping function is used to approximate the input backward reshaping function, generated earlier, as:
To reduce the gap between two nearby polynomials, an overlapped constraint may be applied to smooth the transition around the pivot points. The overlapped window size is denoted as Wm. To achieve a minimal gap, the design optimization goal can be formulated as
for
In equation (3), the pivot points may be bounded by specific communication constraints (e.g., a valid range of SMPTE 274M codeword values), which can be expressed as a lower bound,
and an upper bound,
where values below
and above
will be clipped. For example, for 8-bit signals, the broadcast-safe area is [16, 235] instead of [0, 255].
Using an overlapped window, denote the extended points as
then, in a matrix representation,
and equation (3) can be expressed as
Under a least square optimization, a solution is given by
Then, the overall problem to solve is given by:
In Step 2, in regard to checking for implementation constraints, some embodiments may restrict the accuracy of the polynomial coefficients (e.g., to signed 7-bit or 8-bit integer values, and the like).
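The following is a hedged sketch of the per-segment least-squares fit suggested by equations (3)-(6). It assumes the backward reshaping function is available as a look-up table `blut` indexed by unnormalized SDR codewords, that the polynomials take normalized codewords as input, and an illustrative overlap window; the coefficient-precision constraints mentioned in Step 2 are not shown.

```python
import numpy as np

def fit_segment_polynomials(blut, pivots, overlap=4, sdr_bits=10):
    """Fit one second-order polynomial per segment [pivots[m], pivots[m+1])
    of a backward reshaping LUT, using an overlapped least-squares window.

    blut   : array of length 2**sdr_bits with backward-reshaped (normalized) HDR values.
    pivots : common pivot points (integers), length K+1 for K segments.
    Returns a list of (c0, c1, c2) coefficient triples, one per segment.
    """
    n_s = 2 ** sdr_bits
    coeffs = []
    for m in range(len(pivots) - 1):
        lo = max(pivots[m] - overlap, 0)                 # extended (overlapped) range
        hi = min(pivots[m + 1] + overlap, n_s)
        b = np.arange(lo, hi)
        x = b / (n_s - 1.0)                              # normalized SDR codeword
        basis = np.stack([np.ones_like(x), x, x * x], axis=1)
        # Least-squares solution of the overlapped design matrix, cf. equation (6).
        a, *_ = np.linalg.lstsq(basis, blut[lo:hi], rcond=None)
        coeffs.append(tuple(a))
    return coeffs
```

Fitting over the overlapped window is what smooths the transition around each pivot; because the pivot set is common, the same `pivots` array can be reused for every L1-mid function.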
As discussed, the above joint optimization might not output a smooth backward reshaping function; jumps may remain near the pivot points. In an embodiment, another round of optimization to minimize the gaps may need to be performed, as follows.
Consider the pivot point λm+1, where the ending points of two polynomials (the m-th and (m+1)-th polynomials) are supposed to connect together. When predicting HDR values using the received SDR codewords, at location λm, using a second-order polynomial, the equation of applying the m-th polynomial is:
For the rest of the codewords in the m-th segment, predicted HDR values are computed as:
To reduce the gap between pivots, in an embodiment, the goal is to adjust the polynomial coefficients of the m-th segment so that the revised m-th polynomial satisfies output values at both
for the original polynomials, that is, equation (7) and
A second-order polynomial is fully defined by three coefficients; thus, given only the two equations (7) and (9), solving for the three unknowns is an under-determined problem. Assuming one of the revised polynomial coefficients
is known
the remaining ones
can be obtained from equations (7) and (9) as a closed-form solution of a system with two equations and two unknowns. For example, with
presumed known, the solution for the other two is given by
Since
is actually unknown, one can formulate this problem as an optimization problem to find the optimal coefficients
such that the sum of differences between the original HDR value and the new HDR value using the new polynomial coefficients is minimized, that is:
There is no need to include point λm in the optimization since the equations already pass through this point. In an embodiment, without limitation, the optimal solution can be searched among a range of values for
For the last segment, the (m+1)-th segment would lie outside the last pivot point. In an embodiment, the curve after the last pivot point may be a constant, extending from the last output value of the previous polynomial. Therefore, for m = K, one can set
For the first segment (e.g., m = 0), if the video signal range is in full range (that is, in [0, 1]), the pivot point λ0 will be 0. In this case,
must be equal to the original coefficient. That is, for
Assuming
is known,
can be obtained from equation (9) as
An example implementation of the gap-reducing algorithm is depicted in Table 5 (for m = 0) and Table 6 (for m > 0). When m = K, the constraints of equation (12) also apply.
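Since Tables 5 and 6 are not reproduced here, the following sketch illustrates the gap-reduction search for one interior segment (m > 0); the endpoint targets `y_lo` and `y_hi`, the search range, and the sampling are stand-ins for the quantities defined in equations (7)-(11). For m = 0 the text instead fixes the constant coefficient and searches the first-order one, and for the last segment the constraints of equation (12) apply.

```python
import numpy as np

def adjust_segment(orig_coeffs, x_lo, x_hi, y_lo, y_hi,
                   search_range=(-2.0, 2.0), step=0.001):
    """Search for revised coefficients (b0, b1, b2) of one second-order segment.

    The revised polynomial is constrained to pass through (x_lo, y_lo) and
    (x_hi, y_hi), the desired values at the two pivot points; among all such
    polynomials, the one whose predictions over the segment stay closest to the
    original polynomial (given by orig_coeffs) is kept.
    """
    b0, b1, b2 = orig_coeffs
    xs = np.linspace(x_lo, x_hi, 64)                     # sample points in the segment
    orig = b0 + b1 * xs + b2 * xs * xs                   # original predictions, cf. eq. (8)
    best, best_d = None, np.inf
    for b2_new in np.arange(search_range[0], search_range[1] + step, step):
        # With the second-order coefficient fixed, the two endpoint constraints
        # form a 2x2 linear system in (b0_new, b1_new).
        b1_new = ((y_hi - b2_new * x_hi ** 2) - (y_lo - b2_new * x_lo ** 2)) / (x_hi - x_lo)
        b0_new = y_lo - b1_new * x_lo - b2_new * x_lo ** 2
        pred = b0_new + b1_new * xs + b2_new * xs * xs
        d = np.sum(np.abs(pred - orig))                  # distortion vs. the original curve
        if d < best_d:
            best, best_d = (b0_new, b1_new, b2_new), d
    return best
```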
Adjusting the forward reshaping functions for reversibility
After reducing the gap between all consecutive segments, the set of final backward reshaping functions is determined. Since the gap-reduction algorithm changes the backward reshaping functions, one needs to re-build the forward reshaping functions, guaranteeing proper reversibility. This can be accomplished by reverse tracing of the backward reshaping functions.
For the l-th polynomial, one can reconstruct the output backward reshaping function
using the computed polynomial coefficients
for all m:
The corresponding updated forward reshaping function,
can be built by searching the codeword index which minimizes the difference between codewords generated by the backward reshaping function and the original HDR codeword. For each input HDR codeword b, to get optimal reconstruction, the ideal backward reshaping output should be as close to b as possible. Given a b value, one assumes its mapped value from forward reshaping will be k, where k can be within the entire valid SDR codeword range. By taking each valid SDR codeword through backward reshaping
one can find the corresponding backward-mapped HDR value. Among all those backward-mapped HDR values,
one can find the one,
having minimal difference from the original input HDR codeword b. In other words, for an input b, the forward reshaping mapping should map b to
An example process is depicted in Table 7.
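As an alternative view of the Table 7 process (which is not reproduced here), the sketch below performs the reverse tracing over look-up tables; it assumes the output backward reshaping function has already been evaluated into a monotonically non-decreasing LUT over all valid SDR codewords, and the names are illustrative.

```python
import numpy as np

def rebuild_forward_lut(backward_lut, hdr_bits=16):
    """Rebuild a forward reshaping LUT from an output backward reshaping LUT.

    backward_lut : backward-mapped (reconstructed) HDR codeword for every valid
                   SDR codeword k; assumed monotonically non-decreasing.
    For each input HDR codeword b, the SDR codeword k whose backward-mapped
    value is closest to b is returned.
    """
    b = np.arange(2 ** hdr_bits)
    # Index of the first backward-mapped value >= b (monotone search).
    hi = np.clip(np.searchsorted(backward_lut, b), 1, len(backward_lut) - 1)
    lo = hi - 1
    # Pick whichever of the two neighboring SDR codewords maps closer to b.
    pick_hi = np.abs(backward_lut[hi] - b) < np.abs(backward_lut[lo] - b)
    return np.where(pick_hi, hi, lo)
```

If the backward LUT is not monotone, an explicit arg-min over all SDR codewords per HDR codeword gives the same result at a higher cost.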
In step 210, using the first set of forward reshaping functions, one constructs a first set of backward reshaping functions, e.g., using the process depicted in Table 3. Next, in step 215, one builds a second set of backward reshaping functions by applying a piece-wise approximation to the first set of backward reshaping functions under the constraint that all functions in this set share a common set of pivots. An example process is provided in Table 4.
To improve image quality and reduce visual artifacts, in step 220, the polynomial representation of the second set of backward reshaping functions is further optimized to reduce the gap in output values between consecutive segments at the pivot points, thus generating an output set of backward reshaping functions. An example process is provided in Tables 5 and 6.
Given this set of output backward reshaping functions (with optimized gaps), step 225 generates an output set of forward reshaping functions under the constraint that the distance between codewords of the reference HDR input and a reconstructed HDR input (using an output backward reshaping function) is minimized. An example process is described in Table 7.
Returning back to processing block 220,
Step 305: Initialization. This step initializes the distortion (D) to a large number and, given the original polynomial coefficients for the m-th and (m+1)-th segments in the l-th reshaping function, it sets up how the HDR prediction values are to be computed for codewords within the m-th segment and at the λm+1 pivot point, e.g.:
Step 310 starts an iterative process, which for a range of values for
in [A, B], with an increment C (e.g., without limitation,
and C = 0.001.
The process in
For the first segment (e.g., m = 0), one may still apply the same iterative process, but with small variations. After initialization (see equation (15), but with m = 0), because
must be equal to the original coefficient, the iteration process in step 310 now varies
in a range [A0, B0] with step C0, where, in an embodiment (see Table 5),
and
Given
and
one may apply equations (11) and (16) to compute
and distortion D′ in steps 315 and 320.
For the last segment
values for j=0, 1, and 2 (see equation (15)), are computed using equation (12).
Given a set of L pre-computed forward reshaping functions, each one computed for a value in a set of different adaptive control signals {r(0), r(1), ⋯, r(L-1)} (for example, r may denote the average luminance value), the corresponding forward reshaping functions for the three color planes (e.g., YCbCr) may be expressed as
Given a new control signal value (r) between two pre-computed values, that is, r(l) ≤ r < r(l+1), one would like to derive a new forward reshaping function from the existing pre-computed functions. Assuming the new reconstructed HDR samples can be interpolated from two near samples, let
then, using linear interpolation, the HDR interpolated values can be expressed as
The interpolation of luma and chroma reshaping functions is examined next.
From equations (15) and (16)
Alternatively, the function form can be expressed as
Given that all pivot points are aligned, in an embodiment, one may generate the interpolated function by directly generating an interpolated set of polynomial coefficients for each segment. For example, and without limitation, consider a multi-segment polynomial format with second-order polynomials. Assume the considered HDR range is within the m-th piece/segment; then the corresponding SDR reshaped value using the l-th and (l+1)-th forward reshaping functions can be expressed as:
After the polynomial interpolation, denote the new polynomial coefficients as
then, the interpolated reshaped SDR value can be expressed as
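A small sketch of this coefficient interpolation is given below, assuming the common pivot points discussed above, second-order segments, and the natural linear weights implied by r(l) ≤ r < r(l+1); the exact weight definition of equations (15)-(16) is not reproduced here, and the names are illustrative.

```python
import numpy as np

def interpolate_segment_coeffs(coeffs_l, coeffs_l1, r, r_l, r_l1):
    """Linearly interpolate per-segment polynomial coefficients between the
    reshaping functions pre-computed for control values r_l and r_l1.

    coeffs_l, coeffs_l1 : arrays of shape (K, 3) with (c0, c1, c2) per segment
    for the l-th and (l+1)-th pre-computed functions (common pivots assumed).
    """
    alpha = (r_l1 - r) / (r_l1 - r_l)        # weight of the l-th function, r_l <= r < r_l1
    return alpha * np.asarray(coeffs_l) + (1.0 - alpha) * np.asarray(coeffs_l1)

def apply_segmented_poly(x_norm, pivots_norm, coeffs):
    """Evaluate the multi-segment second-order polynomial at one normalized input."""
    m = int(np.clip(np.searchsorted(pivots_norm, x_norm, side="right") - 1,
                    0, len(coeffs) - 1))
    c0, c1, c2 = coeffs[m]
    return c0 + c1 * x_norm + c2 * x_norm * x_norm
```

The same per-segment interpolation applies to the backward reshaping function, since its segments share the same pivot set.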
The backward reshaping function,
can be built in the same way.
In an embodiment, and without loss of generality, instead of expressing reshaping functions as a multi-segment polynomial (e.g., as discussed before for the luma component), one may express reshaping using alternative schemes, such as the multiple color channel, multiple regression (MMR) predictor discussed in Ref. [4] and Ref. [6], where a chroma value is predicted based on a combination of both luma and chroma values.
In Ref. [6], it was shown that given r(l) ≤ r < r(l+1), MMR coefficients can be a linear combination of two nearby MMR coefficients as well, or:
where
denotes the set of MMR coefficients for the l-th reshaping function. Alternatively,
where C can be color component c0 or c1 (e.g., Cb or Cr).
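For completeness, a sketch of how interpolated MMR coefficients might be formed and applied is shown below; the particular second-order basis is only one common choice and may differ from the exact MMR formulation of Ref. [4] and Ref. [6], and all names are illustrative.

```python
import numpy as np

def mmr_basis(y, c0, c1):
    """One possible second-order MMR basis vector built from normalized luma (y)
    and chroma (c0, c1) samples; the exact basis of Ref. [4] may differ."""
    first = [1.0, y, c0, c1, y * c0, y * c1, c0 * c1, y * c0 * c1]
    second = [t * t for t in first[1:]]
    return np.array(first + second)

def predict_chroma(y, c0, c1, mmr_coeffs):
    """Predict a reconstructed chroma sample as the dot product of the MMR basis
    with a (possibly interpolated) MMR coefficient vector."""
    return float(mmr_basis(y, c0, c1) @ np.asarray(mmr_coeffs, dtype=np.float64))

# For r_l <= r < r_l1, the coefficient vector may be linearly interpolated from
# the two nearby pre-computed vectors, mirroring the luma case:
#   alpha = (r_l1 - r) / (r_l1 - r_l)
#   mmr_r = alpha * mmr_coeffs_l + (1.0 - alpha) * mmr_coeffs_l1
```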
Given the forward reshaping function (412), an encoder could generate the parameters of the reverse or backward reshaping function (e.g., 150) (e.g., see Ref. [5]), which could be transmitted to the decoder as was shown in
For decoding, a decoder may employ the system functionality of
Each of these references is incorporated by reference in its entirety.
1. G-M. Su et al., “Encoding and decoding reversible, production-quality single-layer video signals,” PCT Application, Ser. No. PCT/US2017/023543, filed on Mar. 22, 2017, WIPO Publication WO 2017/165494.
2. Q. Song et al., “High-fidelity full-reference and high-efficiency reduced reference encoding in end-to-end single-layer backward compatible encoding pipeline,” PCT Application, Ser. No. PCT/US2019/031620, filed on May 9, 2019, WIPO Publication WO 2019/217751.
3. B. Wen et al., “Inverse luma/chroma mappings with histogram transfer and approximation,” U.S. Pat. 10,264,287, issued on Apr. 16, 2019.
4. G-M. Su et al., “Multiple color channel multiple regression predictor,” U.S. Pat. 8,811,490.
5. A. Kheradmand et al., “Block-based content-adaptive reshaping for high-dynamic range,” U.S. Pat. 10,032,262.
6. H. Kadu et al., “Interpolation of reshaping functions,” PCT Application, Ser. No. PCT/US2019/063796, filed on Nov. 27, 2019.
Embodiments of the present invention may be implemented with a computer system, systems configured in electronic circuitry and components, an integrated circuit (IC) device such as a microcontroller, a field programmable gate array (FPGA), or another configurable or programmable logic device (PLD), a discrete time or digital signal processor (DSP), an application specific IC (ASIC), and/or apparatus that includes one or more of such systems, devices or components. The computer and/or IC may perform, control or execute instructions relating to the generation of reshaping functions, such as those described herein. The computer and/or IC may compute any of a variety of parameters or values that relate to the generation of reshaping functions as described herein. The image and video dynamic range extension embodiments may be implemented in hardware, software, firmware and various combinations thereof.
Certain implementations of the invention comprise computer processors which execute software instructions which cause the processors to perform a method of the invention. For example, one or more processors in a display, an encoder, a set top box, a transcoder or the like may implement methods for the generation of reshaping functions as described above by executing software instructions in a program memory accessible to the processors. The invention may also be provided in the form of a program product. The program product may comprise any non-transitory and tangible medium which carries a set of computer-readable signals comprising instructions which, when executed by a data processor, cause the data processor to execute a method of the invention. Program products according to the invention may be in any of a wide variety of non-transitory and tangible forms. The program product may comprise, for example, physical media such as magnetic data storage media including floppy diskettes, hard disk drives, optical data storage media including CD ROMs, DVDs, electronic data storage media including ROMs, flash RAM, or the like. The computer-readable signals on the program product may optionally be compressed or encrypted.
Where a component (e.g. a software module, processor, assembly, device, circuit, etc.) is referred to above, unless otherwise indicated, reference to that component (including a reference to a “means”) should be interpreted as including as equivalents of that component any component which performs the function of the described component (e.g., that is functionally equivalent), including components which are not structurally equivalent to the disclosed structure which performs the function in the illustrated example embodiments of the invention.
Example embodiments that relate to the generation of reshaping functions for HDR images are thus described. In the foregoing specification, embodiments of the present invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what is the invention and what is intended by the applicants to be the invention, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage or attribute that is not expressly recited in a claim should limit the scope of such claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
Number | Date | Country | Kind |
---|---|---|---|
20170567.0 | Apr 2020 | EP | regional |
This application claims priority to U.S. Provisional Application No. 63/013,063 and European Patent Application No. 20170567.0, both filed on Apr. 21, 2020, each of which is incorporated by reference in its entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2021/028237 | 4/20/2021 | WO |
Number | Date | Country | |
---|---|---|---|
63013063 | Apr 2020 | US |