Neutral color preservation for single-layer backward compatible codec

Information

  • Patent Grant
  • 12316864
  • Patent Number
    12,316,864
  • Date Filed
    Wednesday, May 17, 2023
  • Date Issued
    Tuesday, May 27, 2025
Abstract
Novel methods and systems for processing a single-layer backward compatible codec with multiple-channel multiple regression coefficients either provided in or pointed to in metadata such that the coefficients have been biased to prevent a shift in neutral colors. Pseudo neutral color patches are used along with a saturation weighting factor to bias the coefficients.
Description
TECHNICAL FIELD

The present disclosure relates generally to images. More particularly, an embodiment of the present invention relates to preserving the neutral color in single-layer backward compatible codecs.


BACKGROUND

As used herein, the term ‘dynamic range’ (DR) may relate to a capability of the human visual system (HVS) to perceive a range of intensity (e.g., luminance, luma) in an image, e.g., from darkest grays (blacks) to brightest whites (highlights). In this sense, DR relates to a ‘scene-referred’ intensity. DR may also relate to the ability of a display device to adequately or approximately render an intensity range of a particular breadth. In this sense, DR relates to a ‘display-referred’ intensity. Unless a particular sense is explicitly specified to have particular significance at any point in the description herein, it should be inferred that the term may be used in either sense, e.g., interchangeably.




As used herein, the term high dynamic range (HDR) relates to a DR breadth that spans the 14-15 orders of magnitude of the human visual system (HVS). In practice, the DR over which a human may simultaneously perceive an extensive breadth in intensity range may be somewhat truncated, in relation to HDR. As used herein, the terms visual dynamic range (VDR) or enhanced dynamic range (EDR) may individually or interchangeably relate to the DR that is perceivable within a scene or image by a human visual system (HVS) that includes eye movements, allowing for some light adaptation changes across the scene or image. As used herein, VDR may relate to a DR that spans 5 to 6 orders of magnitude. Thus, while perhaps somewhat narrower in relation to true scene referred HDR, VDR or EDR nonetheless represents a wide DR breadth and may also be referred to as HDR.


In practice, images comprise one or more color components (e.g., luma Y and chroma Cb and Cr) wherein each color component is represented by a precision of n-bits per pixel (e.g., n=8). For example, using gamma luminance coding, images where n≤8 (e.g., color 24-bit JPEG images) are considered images of standard dynamic range, while images where n≥10 may be considered images of enhanced dynamic range. HDR images may also be stored and distributed using high-precision (e.g., 16-bit) floating-point formats, such as the OpenEXR™ file format developed by Industrial Light and Magic™.


Most consumer desktop displays currently support luminance of 200 to 300 cd/m2 or nits. Most consumer HDTVs range from 300 to 500 nits with new models reaching 1,000 nits (cd/m2). Such conventional displays thus typify a lower dynamic range (LDR), also referred to as a standard dynamic range (SDR), in relation to HDR. As the availability of HDR content grows due to advances in both capture equipment (e.g., cameras) and HDR displays (e.g., the PRM-4200™ professional reference monitor from Dolby Laboratories™), HDR content may be color graded and displayed on HDR displays that support higher dynamic ranges (e.g., from 1,000 nits to 5,000 nits or more).


The term “PQ” as used herein refers to perceptual luminance amplitude quantization. The human visual system responds to increasing light levels in a very nonlinear way. A human's ability to see a stimulus is affected by the luminance of that stimulus, the size of the stimulus, the spatial frequencies making up the stimulus, and the luminance level that the eyes have adapted to at the particular moment one is viewing the stimulus. In some embodiments, a perceptual quantizer function maps linear input gray levels to output gray levels that better match the contrast sensitivity thresholds in the human visual system. An example PQ mapping function is described in SMPTE ST 2084:2014 “High Dynamic Range EOTF of Mastering Reference Displays” (hereinafter “SMPTE”), which is incorporated herein by reference in its entirety, where given a fixed stimulus size, for every luminance level (e.g., the stimulus level, etc.), a minimum visible contrast step at that luminance level is selected according to the most sensitive adaptation level and the most sensitive spatial frequency (according to HVS models).


As used herein, multiple-channel multiple regression (MMR) refers to methods that allow an encoder to approximate/predict a higher dynamic range image (e.g., HDR) in terms of a given lower dynamic range image (e.g., SDR) and an MMR model. Examples are provided in U.S. Pat. No. 8,811,490 “Multiple Color Channel Multiple Regression Predictor” by Guan-Ming Su et al., incorporated by reference in its entirety herein.


As used herein, single-layer backward compatible (SLBC) refers to a single-layer encoding system that supports both a higher and lower dynamic range display. This can be accomplished by the use of MMR coefficients provided in metadata, either directly (e.g., coefficients as data) or indirectly (e.g., a pointer to one of several pre-generated MMR models). An example is provided in U.S. Pat. No. 11,277,627 “High-Fidelity Full Reference And High-Efficiency Reduced Reference Encoding In End-To-End Single-Layer Backward Compatible Encoding Pipeline” by Qing Song et al., incorporated by reference in its entirety herein.


As used herein, “d3DMT” refers to a dynamic 3D mapping table. The d3DMT is built from the HDR image and the SDR image for forward reshaping chroma codewords in the HDR image to reshaped chroma codewords in the reshaped SDR image to achieve a relatively high (e.g., the highest, etc.) fidelity of perceived color. An example is shown in U.S. Pat. No. 11,277,627 as referenced above.


WO 2021/076822 A1 discloses a method for encoding forward reshaped image data in a video signal. A backward reshaping mapping table is initially generated as an inverse of a forward reshaping mapping table. The backward reshaping mapping table is updated by replacing the content-mapped luminance codewords with forward reshaped luminance codewords generated by applying a luminance forward mapping to the sampled luminance codewords. The luminance forward mapping is constructed from the forward reshaping mapping table. The backward reshaping mapping table and the luminance forward mapping are used to generate backward reshaping mappings for creating a reconstructed image from a forward reshaped image. The forward reshaped image is encoded, in a video signal, along with image metadata specifying the backward reshaping mappings. A recipient device of the video signal applies the backward reshaping mappings to the forward reshaped image to create the reconstructed image of the second dynamic range.


The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, issues identified with respect to one or more approaches should not be assumed to have been recognized in any prior art on the basis of this section, unless otherwise indicated.


SUMMARY

The invention is defined by the independent claims. The dependent claims concern optional features of some embodiments. An embodiment for achieving neutral color preservation comprises a method of building a dynamic 3D mapping table (d3DMT) from a first reference image and a second reference image, the first reference image having a higher dynamic range than the second reference image; computing content color matrices based on the d3DMT; computing a content saturation from the d3DMT; computing at least one weighting factor from the content saturation; building a neutral color set; computing neutral color matrices based on the neutral color set; combining the content color matrices, the at least one weighting factor, and the neutral color matrices to solve multiple-channel multiple regression (MMR) coefficients; and providing metadata containing data related to the MMR coefficients to a decoder to allow backward compatibility for a single-layer bitstream.


The method can be programmed/built into an encoder as a codec.


The systems and methods are not limited to the above embodiments and further details and embodiments are provided in the description and drawings provided herein.





BRIEF DESCRIPTION OF DRAWINGS


FIG. 1 illustrates an example of the prior art SLBC codec without neutral color preservation.



FIG. 2 illustrates an example of an SLBC codec with neutral color preservation.



FIG. 3 illustrates an example chart of color distortion (average) vs. chroma saturation (d3DMT) for Cb and Cr channels.



FIG. 4 illustrates an example saturation vs. minimum weighting factor for Cb and Cr channels.



FIG. 5 illustrates an example of a clipping function derived from FIG. 4.



FIG. 6 illustrates example graphs of neutral color area MAD with optimal weighting factor in Cb and Cr.



FIG. 7 illustrates example graphs of whole image area MAD with optimal weighting factor in Cb and Cr.





DETAILED DESCRIPTION

As used herein, “neutral color” refers to the center (0.5 on a normalized range from 0 to 1) of all color axes (e.g., Cb=0.5 and Cr=0.5 in the normalized YCbCr domain).


An SLBC algorithm consists of two paths: a forward path, which maps input higher dynamic range images (e.g., herein HDR) to lower dynamic range images (e.g., herein SDR), and a backward path, which maps SDR back to HDR. FIG. 1 shows an example SLBC based on using a d3DMT to compute MMR coefficients to be used in image prediction for the SLBC codec. A reference HDR image (110) and a corresponding reference SDR image (120) are used to build a dynamic 3D mapping table (130). From this, a matrix (described herein as A) and a vector (described herein as b) are computed (140), which solve the MMR coefficients (described herein as m) (150).


Building the d3DMT


Denote the ith pixel of the reference HDR image frame as (v_i^y, v_i^{cb}, v_i^{cr}) for the 3 color channels (y, cb, cr), and the corresponding reference SDR pixel as (s_i^y, s_i^{cb}, s_i^{cr}) (the frame index t is dropped for now, but will be added back in solving the MMR coefficients). The pixel values are normalized to [0, 1] and there are P pixels in one image. The forward path (HDR to SDR) is used as an example here, but the backward path (SDR to HDR) can be derived in the same process.


Denote the number of bins for each component as Q_y, Q_{cb}, Q_{cr}. Compute the (Q_y × Q_{cb} × Q_{cr}) 3D histogram. Let the 3D histogram in HDR be denoted by Ω^{Q,v}, where Q = [Q_y, Q_{Cb}, Q_{Cr}]. Thus, Ω^{Q,v} contains a total of (Q = Q_y·Q_{Cb}·Q_{Cr}) bins such that each 3D bin, specified by bin index q = (q_y, q_{Cb}, q_{Cr}), represents the number of pixels having those 3-channel quantized values.


For each channel ch, compute the minimum (L_{v,ch}) and maximum (H_{v,ch}) of the HDR image signal as:

L_{v,ch} = min_i (v_i^{ch})  equation (1)

H_{v,ch} = max_i (v_i^{ch})  equation (2)








Each channel is uniformly quantized into Q_{ch} bins based on the min and max. The range of bin j is:

[L_{v,ch} + j·g_{ch}, L_{v,ch} + (j+1)·g_{ch}), j ∈ {0, 1, …, Q_{ch}−1}.  equation (3)

where

g_{ch} = (H_{v,ch} − L_{v,ch}) / Q_{ch}.  equation (4)








For a given input HDR value, the bin index can be determined by:

q_{ch} = ⌊(v_i^{ch} − L_{v,ch}) / g_{ch}⌋  equation (5)








Compute the HDR sums in each HDR 3D bin as Ψ_Y^{Q,v}, Ψ_{Cb}^{Q,v}, and Ψ_{Cr}^{Q,v}. Also compute the SDR sums in each HDR 3D bin: let Ψ_Y^{Q,s}, Ψ_{Cb}^{Q,s}, and Ψ_{Cr}^{Q,s} be the mapped SDR luma and chroma sums such that each bin of these contains the sum of all SDR color channel pixel values whose corresponding HDR pixel value lies in that bin.


Here is an example of this operation:


STEP 1: 3D HDR histogram and 3D-mapped HDR/SDR sum initialization, where q=(q_y, q_{Cb}, q_{Cr}) and q=0, …, Q−1:

    • Ω_q^{Q,v}=0
    • Ψ_{Y,q}^{Q,v}=0
    • Ψ_{Cb,q}^{Q,v}=0
    • Ψ_{Cr,q}^{Q,v}=0
    • Ψ_{Y,q}^{Q,s}=0
    • Ψ_{Cb,q}^{Q,s}=0
    • Ψ_{Cr,q}^{Q,s}=0


STEP 2: scan each pixel (i) of the P pixels in the HDR and SDR images:

q_y = ⌊(v_i^y − L_{v,y}) / g_y⌋;  // HDR downsampled-luma quantized value

q_{Cb} = ⌊(v_i^{cb} − L_{v,cb}) / g_{cb}⌋;  // HDR chroma 0 quantized value

q_{Cr} = ⌊(v_i^{cr} − L_{v,cr}) / g_{cr}⌋;  // HDR chroma 1 quantized value

    • q=(q_y, q_{Cb}, q_{Cr});
    • Ω_q^{Q,v}++;  // 3D HDR histogram
    • Ψ_{Y,q}^{Q,v} = Ψ_{Y,q}^{Q,v} + v_i^y;  // HDR Y values
    • Ψ_{Cb,q}^{Q,v} = Ψ_{Cb,q}^{Q,v} + v_i^{cb};  // HDR Cb values
    • Ψ_{Cr,q}^{Q,v} = Ψ_{Cr,q}^{Q,v} + v_i^{cr};  // HDR Cr values
    • Ψ_{Y,q}^{Q,s} = Ψ_{Y,q}^{Q,s} + s_i^y;  // SDR Y values
    • Ψ_{Cb,q}^{Q,s} = Ψ_{Cb,q}^{Q,s} + s_i^{cb};  // SDR Cb values
    • Ψ_{Cr,q}^{Q,s} = Ψ_{Cr,q}^{Q,s} + s_i^{cr};  // SDR Cr values


Now, find the 3D HDR histogram bins that have a non-zero number of pixels; in other words, discard all bins that do not contain any pixels. Let q_0, q_1, …, q_{K−1} be the K bins for which Ω_q^{Q,v} ≠ 0. Compute the averages for HDR (Ψ̄_Y^{Q,v}, Ψ̄_{Cb}^{Q,v}, Ψ̄_{Cr}^{Q,v}) and SDR (Ψ̄_Y^{Q,s}, Ψ̄_{Cb}^{Q,s}, Ψ̄_{Cr}^{Q,s}) as shown below:


For each (i) of the K non-zero bins (bin index q_i = (q_y, q_{Cb}, q_{Cr})), set the mapping values:

Ψ̄_{Y,q_i}^{Q,v} = Ψ_{Y,q_i}^{Q,v} / Ω_{q_i}^{Q,v};  // Average HDR Y values

Ψ̄_{Cb,q_i}^{Q,v} = Ψ_{Cb,q_i}^{Q,v} / Ω_{q_i}^{Q,v};  // Average HDR Cb values

Ψ̄_{Cr,q_i}^{Q,v} = Ψ_{Cr,q_i}^{Q,v} / Ω_{q_i}^{Q,v};  // Average HDR Cr values

Ψ̄_{Y,q_i}^{Q,s} = Ψ_{Y,q_i}^{Q,s} / Ω_{q_i}^{Q,v};  // Average SDR Y values

Ψ̄_{Cb,q_i}^{Q,s} = Ψ_{Cb,q_i}^{Q,s} / Ω_{q_i}^{Q,v};  // Average SDR Cb values

Ψ̄_{Cr,q_i}^{Q,s} = Ψ_{Cr,q_i}^{Q,s} / Ω_{q_i}^{Q,v};  // Average SDR Cr values


All mapping pairs (Ψ̄_{Y,q_i}^{Q,v}, Ψ̄_{Cb,q_i}^{Q,v}, Ψ̄_{Cr,q_i}^{Q,v}) → (Ψ̄_{Y,q_i}^{Q,s}, Ψ̄_{Cb,q_i}^{Q,s}, Ψ̄_{Cr,q_i}^{Q,s}) can be collected to construct the d3DMT.
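As an illustration, the binning and averaging steps above can be sketched in a few lines of NumPy. This is a hedged sketch, not code from the patent: the function name, array shapes, and default bin counts are illustrative assumptions.

```python
import numpy as np

def build_d3dmt(v, s, Qy=16, Qcb=8, Qcr=8):
    """Build d3DMT mapping pairs from HDR pixels v and SDR pixels s,
    both (P, 3) arrays of (y, cb, cr) values normalized to [0, 1]."""
    Q = np.array([Qy, Qcb, Qcr])
    lo, hi = v.min(axis=0), v.max(axis=0)              # L_{v,ch}, H_{v,ch} (equations 1-2)
    g = np.where(hi > lo, (hi - lo) / Q, 1.0)          # bin widths g_ch (equation 4)
    q = np.minimum(((v - lo) / g).astype(int), Q - 1)  # per-channel bin indices (equation 5)
    flat = np.ravel_multi_index(q.T, Q)                # 3D bin index -> flat index
    counts = np.bincount(flat, minlength=Q.prod())     # Omega: 3D HDR histogram
    sum_v = np.stack([np.bincount(flat, weights=v[:, c], minlength=Q.prod()) for c in range(3)])
    sum_s = np.stack([np.bincount(flat, weights=s[:, c], minlength=Q.prod()) for c in range(3)])
    nz = counts > 0                                    # keep only the K non-empty bins
    # per-bin HDR and SDR averages (the Psi-bar mapping pairs), shape (K, 3) each
    return (sum_v[:, nz] / counts[nz]).T, (sum_s[:, nz] / counts[nz]).T
```

The two returned arrays are the HDR and SDR sides of the mapping pairs collected above, one row per non-empty 3D bin.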


Construction of MMR Coefficients


The forward path is for HDR to SDR conversion (higher dynamic range to lower), with the backward path being from lower to higher dynamic range. Different MMR coefficients can be used for the forward and backward paths, or both paths can be given the same MMR coefficients based on the more important path (e.g., only the forward path is computed, then used for both paths, if HDR to SDR conversion is prioritized).


Forward Path


Denote the MMR expanded form for the ith HDR entry (Ψ̄_{t,Y,q_i}^{Q,v}, Ψ̄_{t,Cb,q_i}^{Q,v}, Ψ̄_{t,Cr,q_i}^{Q,v}) → (Ψ̄_{t,Y,q_i}^{Q,s}, Ψ̄_{t,Cb,q_i}^{Q,s}, Ψ̄_{t,Cr,q_i}^{Q,s}) in the d3DMT at frame t as:











v̄_{t,i} = [1, y, cb, cr, y·cb, y·cr, cb·cr, y·cb·cr, y², cb², cr², (y·cb)², (y·cr)², (cb·cr)², (y·cb·cr)²]  equation (6)

where y = Ψ̄_{t,Y,q_i}^{Q,v}, cb = Ψ̄_{t,Cb,q_i}^{Q,v}, and cr = Ψ̄_{t,Cr,q_i}^{Q,v}.








All non-zero K_t^F entries can be collected together as:

V_t^{(F)} = [v̄_{t,0}; v̄_{t,1}; …; v̄_{t,K_t^F−1}]  equation (7)








The observation chroma signal (e.g., ch can be the Cb channel or Cr channel) can be expressed as:

s_t^{ch,(F)} = [Ψ̄_{t,ch,q_0}^{Q,s}; Ψ̄_{t,ch,q_1}^{Q,s}; …; Ψ̄_{t,ch,q_{K_t^F−1}}^{Q,s}]  equation (8)








Denote the chth channel of MMR coefficients as m_t^{ch,(F)}. The predicted SDR signal is

ŝ_t^{ch,(F)} = V_t^{(F)} m_t^{ch,(F)}  equation (9)


The optimal solution can be obtained via the least squares solution:

m_t^{ch,(F),opt} = arg min ‖ŝ_t^{ch,(F)} − s_t^{ch,(F)}‖ = arg min ‖V_t^{(F)} m_t^{ch,(F)} − s_t^{ch,(F)}‖  equation (10)









which can be solved as:

m_t^{ch,(F),opt} = ((V_t^{(F)})^T (V_t^{(F)}))^{−1} ((V_t^{(F)})^T s_t^{ch,(F)})  equation (11)


To simplify the discussion, further denote the following notations:

A_t^{(F),CC} = (V_t^{(F)})^T (V_t^{(F)}) and b_t^{ch,(F),CC} = (V_t^{(F)})^T s_t^{ch,(F)}  equation (12)

giving the forward coefficients as:

m_t^{ch,(F),opt} = (A_t^{(F),CC})^{−1} (b_t^{ch,(F),CC}).  equation (13)
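As a sketch, the forward-path solve (equations 6 through 13) can be written with NumPy. The 15-term expansion and the function names are illustrative assumptions, not code from the patent:

```python
import numpy as np

def mmr_expand(y, cb, cr):
    """Second-order MMR expansion with cross terms (the 15-term form of equation 6)."""
    t = [1.0, y, cb, cr, y * cb, y * cr, cb * cr, y * cb * cr]
    return np.array(t + [x * x for x in t[1:]])  # append squares of the 7 non-constant terms

def solve_forward_mmr(hdr_avgs, sdr_chroma):
    """hdr_avgs: (K, 3) HDR d3DMT averages; sdr_chroma: (K,) SDR Cb or Cr averages.
    Returns the 15 forward MMR coefficients."""
    V = np.stack([mmr_expand(*row) for row in hdr_avgs])  # V_t^{(F)}, equation (7)
    A = V.T @ V                                           # A_t^{(F),CC}, equation (12)
    b = V.T @ sdr_chroma                                  # b_t^{ch,(F),CC}, equation (12)
    return np.linalg.solve(A, b)                          # equation (13)
```

In practice an encoder would guard against an ill-conditioned A (e.g., with a pseudo-inverse); the plain solve keeps the sketch close to equation (13).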

Backward Path


The MMR coefficients for the backward path can be computed via essentially the same process as the forward path, with the predicted SDR image, having constructed the d3DMT from the predicted SDR entries (Ψ̃_{t,Y,q_i}^{Q,s}, Ψ̃_{t,Cb,q_i}^{Q,s}, Ψ̃_{t,Cr,q_i}^{Q,s}) to the original HDR entries (Ψ̃_{t,Y,q_i}^{Q,v}, Ψ̃_{t,Cb,q_i}^{Q,v}, Ψ̃_{t,Cr,q_i}^{Q,v}) at frame t. The MMR expanded form for the ith entry can be expressed as:











s̄_{t,i} = [1, y, cb, cr, y·cb, y·cr, cb·cr, y·cb·cr, y², cb², cr², (y·cb)², (y·cr)², (cb·cr)², (y·cb·cr)²]  equation (14)

where y = Ψ̃_{t,Y,q_i}^{Q,s}, cb = Ψ̃_{t,Cb,q_i}^{Q,s}, and cr = Ψ̃_{t,Cr,q_i}^{Q,s}.








Collecting all non-zero K_t^B entries together as:

Ŝ_t^{(B)} = [s̄_{t,0}; s̄_{t,1}; …; s̄_{t,K_t^B−1}]  equation (15)








The observation chroma signal (e.g., ch can be the Cb channel or Cr channel) can be expressed as:

v_t^{ch,(B)} = [Ψ̄_{t,ch,q_0}^{Q,v}; Ψ̄_{t,ch,q_1}^{Q,v}; …; Ψ̄_{t,ch,q_{K_t^B−1}}^{Q,v}]  equation (16)








Denote the chth channel of MMR coefficients as m_t^{ch,(B)}. The predicted HDR signal is

v̂_t^{ch,(B)} = Ŝ_t^{(B)} m_t^{ch,(B)}  equation (17)


The optimal solution can be obtained via the least squares solution:

m_t^{ch,(B),opt} = arg min ‖v̂_t^{ch,(B)} − v_t^{ch,(B)}‖ = arg min ‖Ŝ_t^{(B)} m_t^{ch,(B)} − v_t^{ch,(B)}‖  equation (18)








This can be solved using a least squares solution algorithm:

m_t^{ch,(B),opt} = ((Ŝ_t^{(B)})^T (Ŝ_t^{(B)}))^{−1} ((Ŝ_t^{(B)})^T v_t^{ch,(B)})  equation (19)


To simplify the discussion, further denote the following notations:

A_t^{(B),CC} = (Ŝ_t^{(B)})^T (Ŝ_t^{(B)}) and b_t^{ch,(B),CC} = (Ŝ_t^{(B)})^T v_t^{ch,(B)}  equation (20)

m_t^{ch,(B),opt} = (A_t^{(B),CC})^{−1} (b_t^{ch,(B),CC})  equation (21)


The processes above produce MMR coefficients that can be biased toward whatever the majority color is in the reference image, depending on the color distribution. This can cause a tinting of the neutral colors, meaning that the neutral colors are not accurately preserved. Generally, the more saturated an image is, the greater the chance of a neutral color shift. This can be corrected by introducing a set of hypothetical neutral color patches into the MMR optimization procedure, providing an efficient solution that alleviates the neutral color shift issue.


Neutral Color Preservation



FIG. 2 shows an embodiment of an improved SLBC algorithm that addresses the neutral color shift issue. As with the previous process, an HDR image (210) and an SDR image (220) are used to create a d3DMT (230). However, two extra elements are added: 1) computing a saturation of the d3DMT (233) and computing a weighting factor w (235) from the saturation; and 2) building a neutral color set from neutral color patches added to the image (243) and computing additional A and b matrices from the patches (245). The additional quantities (w, A^{NC}, b^{NC}) are then combined with the usual content color matrices (A^{CC} and b^{CC}) (250) to solve for MMR coefficients (260) that, as a result, have been biased to preserve neutral colors.


In other words, the left side of the flowchart (210, 220, 230, 233, 235, 240) shows the “nature image” path (for content color, or “CC”). For a given HDR-SDR nature image pair (210, 220), first build the d3DMT (230). Based on the forward d3DMT, compute A_t^{(F),CC} and b_t^{ch,(F),CC} (240). Also re-use the d3DMT (230) to compute the content saturation β_t^{(F)} (233). The required weighting factor can be found via the function ƒ(β_t^{(F)}) (235). Based on the backward d3DMT, compute A_t^{(B),CC} and b_t^{ch,(B),CC} (240), and compute the content saturation β_t^{(B)} (233).


The right side of the flowchart (243, 245) shows the (added) neutral color (“NC”) path. Build A_t^{(F),NC} and b_t^{ch,(F),NC} (245) as described herein. The backward path quantities A_t^{(B),NC} and b_t^{ch,(B),NC} can be computed in the same process.


In some embodiments, the system or method includes the neutral color path without weighting factors. In some embodiments, as shown in FIG. 2, the system or method includes both the nature image path and the neutral color path with weighting factors.


Then, combine (250) the nature image and neutral color patch matrices using the following equations for the forward path and backward path individually:

A_t^{(F)} = A_t^{(F),CC} + w^{(F)}·A_t^{(F),NC}  equation (22)

b_t^{ch,(F)} = b_t^{ch,(F),CC} + w^{(F)}·b_t^{ch,(F),NC}  equation (23)

A_t^{(B)} = A_t^{(B),CC} + w^{(B)}·A_t^{(B),NC}  equation (24)

b_t^{ch,(B)} = b_t^{ch,(B),CC} + w^{(B)}·b_t^{ch,(B),NC}  equation (25)


The forward and backward MMR coefficients can be solved:

m_t^{ch,(F),opt} = (A_t^{(F)})^{−1} (b_t^{ch,(F)})  equation (26)

m_t^{ch,(B),opt} = (A_t^{(B)})^{−1} (b_t^{ch,(B)})  equation (27)


To bias the MMR coefficient optimization more toward neutral color preservation, modify the matrices A_t^{(F)}, b_t^{ch,(F)}, A_t^{(B)}, and b_t^{ch,(B)}. The modification (bias) starts from building N pseudo neutral color patches which do not occur in the content (images).


Forward Path Bias


For the forward path, assign the ith color patch, i = 0, …, N−1, as:

v_{t,i}^{y,NC} = i/N  equation (28)

v_{t,i}^{cb,NC} = 0.5  equation (29)

v_{t,i}^{cr,NC} = 0.5  equation (30)

and

s_{t,i}^{cb,NC} = 0.5  equation (31)

s_{t,i}^{cr,NC} = 0.5  equation (32)


The MMR expanded form for the ith HDR neutral color patch is:

v̄_{t,i}^{NC} = [1, y, cb, cr, y·cb, y·cr, cb·cr, y·cb·cr, y², cb², cr², (y·cb)², (y·cr)², (cb·cr)², (y·cb·cr)²]  equation (33)

where y = v_{t,i}^{y,NC}, cb = v_{t,i}^{cb,NC}, and cr = v_{t,i}^{cr,NC}.








All N color patches can be collected together as:

V_t^{(F),NC} = [v̄_{t,0}^{NC}; v̄_{t,1}^{NC}; …; v̄_{t,N−1}^{NC}]  equation (34)








The observation chroma signal (e.g., ch can be the Cb channel or Cr channel) can be expressed as:

s_t^{ch,(F),NC} = [0.5; 0.5; …; 0.5]  equation (35)








Given the following matrices:

A_t^{(F),NC} = (V_t^{(F),NC})^T (V_t^{(F),NC}) and b_t^{ch,(F),NC} = (V_t^{(F),NC})^T s_t^{ch,(F),NC}  equation (36)

the final A matrix and b vector are linearly combined via the weighting factor w^{(F)}:

A_t^{(F)} = A_t^{(F),CC} + w^{(F)}·A_t^{(F),NC}  equation (37)

b_t^{ch,(F)} = b_t^{ch,(F),CC} + w^{(F)}·b_t^{ch,(F),NC}  equation (38)

The MMR coefficients for the forward path can be solved as follows:

m_t^{ch,(F),opt} = (A_t^{(F)})^{−1} (b_t^{ch,(F)})  equation (39)
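A hedged NumPy sketch of the forward-path bias (equations 28 through 39), assuming the same illustrative 15-term expansion as before; the function names and the default values of N and w are assumptions for illustration:

```python
import numpy as np

def mmr_expand(y, cb, cr):
    """Illustrative 15-term second-order MMR expansion with cross terms."""
    t = [1.0, y, cb, cr, y * cb, y * cr, cb * cr, y * cb * cr]
    return np.array(t + [x * x for x in t[1:]])

def neutral_color_matrices(N=16):
    """A_t^{(F),NC} and b_t^{ch,(F),NC} from N pseudo neutral patches (equations 28-36):
    luma sweeps i/N while both chroma channels are pinned at 0.5."""
    V_nc = np.stack([mmr_expand(i / N, 0.5, 0.5) for i in range(N)])  # equation (34)
    s_nc = np.full(N, 0.5)                                            # equation (35)
    return V_nc.T @ V_nc, V_nc.T @ s_nc                               # equation (36)

def solve_biased_mmr(A_cc, b_cc, w=5.0, N=16):
    """Blend content and neutral color matrices with weight w, then solve
    (equations 37-39)."""
    A_nc, b_nc = neutral_color_matrices(N)
    return np.linalg.solve(A_cc + w * A_nc, b_cc + w * b_nc)
```

A larger w pulls the fitted surface toward mapping neutral inputs to the 0.5 neutral point, at the cost of some fidelity on the content itself.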

Backward Path Bias


For the backward path, assign the ith color patch, i = 0, …, N−1, as

ŝ_{t,i}^{y,NC} = i/N  equation (40)

ŝ_{t,i}^{cb,NC} = 0.5  equation (41)

ŝ_{t,i}^{cr,NC} = 0.5  equation (42)

and

v_{t,i}^{cb,NC} = 0.5  equation (43)

v_{t,i}^{cr,NC} = 0.5  equation (44)


The backward path follows the same process. Denote the ith predicted SDR neutral color patch as (ŝ_{t,i}^y, ŝ_{t,i}^{cb}, ŝ_{t,i}^{cr}).











s̄_{t,i}^{NC} = [1, y, cb, cr, y·cb, y·cr, cb·cr, y·cb·cr, y², cb², cr², (y·cb)², (y·cr)², (cb·cr)², (y·cb·cr)²]  equation (45)

where y = ŝ_{t,i}^{y,NC}, cb = ŝ_{t,i}^{cb,NC}, and cr = ŝ_{t,i}^{cr,NC}.








All N color patches can be collected together as:

Ŝ_t^{(B),NC} = [s̄_{t,0}^{NC}; s̄_{t,1}^{NC}; …; s̄_{t,N−1}^{NC}]  equation (46)








The observation chroma signal (e.g., ch can be the Cb channel or Cr channel) can be expressed as

v_t^{ch,(B),NC} = [0.5; 0.5; …; 0.5]  equation (47)








Given the following matrices:

A_t^{(B),NC} = (Ŝ_t^{(B),NC})^T (Ŝ_t^{(B),NC}) and b_t^{ch,(B),NC} = (Ŝ_t^{(B),NC})^T v_t^{ch,(B),NC}  equation (48)









the final A matrix and b vector are linearly combined via the weighting factor w^{(B)}:

A_t^{(B)} = A_t^{(B),CC} + w^{(B)}·A_t^{(B),NC}  equation (49)

b_t^{ch,(B)} = b_t^{ch,(B),CC} + w^{(B)}·b_t^{ch,(B),NC}  equation (50)

The MMR coefficients for the backward path can be solved as follows:

m_t^{ch,(B),opt} = (A_t^{(B)})^{−1} (b_t^{ch,(B)})  equation (51)

Determining the Weighting Factor


As discussed in the previous sections, one needs to determine the weighting factor for the forward path (w_t^{(F)}) and/or the backward path (w_t^{(B)}). One can use an empirical method (experimentation) to find the optimal solution, as shown herein.


Content Saturation and Neutral Color Deviation


The content saturation plays a role in determining the weighting factor. Note that the neutral color has a (Cb, Cr) value of (0.5, 0.5) in the YCbCr domain. The distance from each d3DMT color entry (Ψ̄_{t,Cb,q_i}^{Q,v}, Ψ̄_{t,Cr,q_i}^{Q,v}) to the neutral color axis (0.5, 0.5) can be measured as










c_{t,i}^{(F)} = √((Ψ̄_{t,Cb,q_i}^{Q,v} − 0.5)² + (Ψ̄_{t,Cr,q_i}^{Q,v} − 0.5)²)  equation (52)








This is the chroma saturation for one entry. Over all K_t^F valid non-zero bins, the average saturation can be computed as:










β_t^{(F)} = (1/K_t^F) Σ_i c_{t,i}^{(F)}  equation (53)








The average saturation (β(F)) of full-grid uniformly sampled RGB color points in one particular color space (such as R.709/P3/R.2020) can be determined inside a container (e.g., R.2020). A wider color gamut will have a higher average saturation.
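The saturation measure of equations (52) and (53) reduces to a few lines. This sketch assumes the d3DMT chroma averages are available as a (K, 2) array; the naming is illustrative:

```python
import numpy as np

def content_saturation(chroma_avgs):
    """chroma_avgs: (K, 2) array of d3DMT (Cb, Cr) bin averages in [0, 1].
    Returns beta: the average distance to the neutral point (0.5, 0.5)."""
    c = np.hypot(chroma_avgs[:, 0] - 0.5, chroma_avgs[:, 1] - 0.5)  # equation (52)
    return c.mean()                                                 # equation (53)
```

A perfectly neutral table yields zero, while a table concentrated at a gamut corner such as (0, 0) yields √0.5 ≈ 0.707.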


To see the impact of different values of the weighting factor, place a grey-level bar with N_b pixels in a known location in each test image. Since the location is known, the deviation from neutral color for that gray-level bar can be measured between the final reconstructed HDR image and the neutral color point 0.5. Adopt the mean of absolute difference (MAD) for the chroma channels from the neutral color patch.


To simplify the design, set w_t = w_t^{(F)} = w_t^{(B)}. Denote the final reconstructed HDR pixel value as (v̂_{t,i}^{y,(w_t)}, v̂_{t,i}^{cb,(w_t)}, v̂_{t,i}^{cr,(w_t)}) when applying the weighting factor w_t. For the pixels inside the gray-level bar, the MAD can be computed as:











d_t^{ch}(w_t) = (1/N_b) Σ_i |v̂_{t,i}^{ch,(w_t)} − 0.5|  equation (54)








A larger value of d_t^{ch}(w_t) implies a larger deviation from the neutral color and a higher potential to introduce non-neutral color.


Compute the MAD for the entire chroma image between the reconstructed HDR and the source HDR, both with w_t applied (e.g., |v̂_{t,i}^{ch,(w_t)} − v_{t,i}^{ch}|) and without (e.g., |v̂_{t,i}^{ch,(0)} − v_{t,i}^{ch}|), and then compute the difference between these two MADs:











D_t^{ch}(w_t) = ((1/N) Σ_i |v̂_{t,i}^{ch,(w_t)} − v_{t,i}^{ch}|) − ((1/N) Σ_i |v̂_{t,i}^{ch,(0)} − v_{t,i}^{ch}|)  equation (55)








A negative value implies the chroma accuracy is actually improved, with less distortion. A positive value means the chroma accuracy is degraded owing to the introduction of the weighting factor.
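These two distortion measures can be sketched directly; the array arguments are illustrative flattened chroma planes, not the patent's data structures:

```python
import numpy as np

def neutral_deviation(bar_chroma_w):
    """d_t^{ch}(w_t), equation (54): MAD of the reconstructed gray-bar chroma
    pixels from the neutral point 0.5."""
    return np.abs(bar_chroma_w - 0.5).mean()

def mad_change(recon_w, recon_0, source):
    """D_t^{ch}(w_t), equation (55): whole-image MAD with the weighting factor
    minus whole-image MAD without it (negative means accuracy improved)."""
    return np.abs(recon_w - source).mean() - np.abs(recon_0 - source).mean()
```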


Data Analysis—Empirical Method for Determining Weighting Factor


First consider the case without neutral color preservation, namely, setting w_t = 0. An example correlation between content saturation, β_t^{(F)}, and distortion in the gray-level bar, d_t^{ch}(0), for both channels is shown in FIG. 3. The content saturation plays a role in affecting the neutral color deviation: the higher the saturation of the content (β_t^{(F)}), the higher the chroma distortion (d_t^{ch}(0)), giving a higher chance of showing a non-neutral color artifact. Note the image is in 12-bit precision for this example.


For each test image t, search for the minimal value of the weighting factor such that the neutral color deviation is smaller than a threshold δ:










w_t^{opt} = arg min_{w_t} { d_t^{ch}(w_t) < δ }  equation (56)









FIG. 4 shows an example of the required w_t^{opt} for δ = 8 (in 12-bit precision) vs. content saturation β_t^{(F)}. Note that the higher the saturation of the content, the higher the weighting factor needed to preserve the neutral color. Some situations do not need neutral color preservation (i.e., w_t^{opt} = 0), for example in FIG. 4 where the saturation is less than 0.1. Other situations might require different weighting values, such as in FIG. 4 where a saturation value under 0.13 can use a weighting value of 5, whereas higher saturation values might require weighting values of 10 or 15.


The non-neutral color artifact is rarely shown in an image showing a scene set in nature, but it is more frequently found in a very saturated synthetic dataset, especially in certain color spaces (e.g., R.2020).


Adaptive Weighting Factor Selection Method


The plot of saturation vs. minimum weighting factor (e.g., FIG. 4) can be used to find a piecewise linear function to envelop/upper-bound the data dots. After measuring the content saturation β_t^{(F)}, one can determine the required weighting factor w_t. An example is listed below:










w_t = ƒ(β_t^{(F)}) = { 0, if β_t^{(F)} < 0.05; 100·(β_t^{(F)} − 0.05), if 0.05 ≤ β_t^{(F)} ≤ 0.2; 15, if β_t^{(F)} > 0.2 }  equation (57)








Or it can simply be expressed as a clipping function:

wt=ƒ(βt(F))=100(clip3(βt(F),0.05,0.2)−0.05)  equation (58)



FIG. 5 shows this function (510) as a line clipped below 0.05 and above 0.2. Different profiles can produce different curves/functions.
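The piecewise linear mapping of equation (57) can be sketched as follows; the breakpoints (0.05, 0.2) and the cap of 15 are taken directly from the equation above:

```python
def weighting_factor(beta):
    """Map content saturation beta = βt(F) to the weighting factor wt per
    equation (57): zero below 0.05, a linear ramp of slope 100 over
    [0.05, 0.2], and a cap of 15 above 0.2."""
    if beta < 0.05:
        return 0.0
    if beta > 0.2:
        return 15.0
    return 100.0 * (beta - 0.05)
```

For example, a saturation of 0.1 maps to a weighting factor of 5, consistent with the FIG. 4 observation that moderately saturated content needs only a moderate bias.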



FIG. 6 shows an example of the MAD, dtch(wt), using the piecewise linear model wt=ƒ(βt(F)) for each frame. The MAD is within the tolerance range (δ=8) in 12-bit precision. Subjective testing confirms that this small neutral color shift is not visible.



FIG. 7 shows an example of the whole-image MAD, Dtch(wt), using the optimal wt=ƒ(βt(F)) for each frame. As shown in FIG. 7, most frames show negative values, which implies that the chroma is indeed improved with this neutral color preservation method. A few frames have Dtch(wt) larger than 0, but still less than δ=8. The performance degrades slightly in MAD, but remains within the invisible range.
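A hypothetical sketch of the mean-absolute-deviation (MAD) measurement underlying the distortion metrics above: the deviation of reconstructed chroma from reference chroma, expressed in 12-bit code values. The function name and the normalized-to-code-value convention are illustrative assumptions, not the patent's exact formulation.

```python
import numpy as np

def chroma_mad(recon, ref, bitdepth=12):
    """Mean absolute deviation between reconstructed and reference chroma
    samples (given as normalized values in [0, 1]), expressed in code
    values at the given bit depth."""
    scale = (1 << bitdepth) - 1
    recon = np.asarray(recon, dtype=np.float64)
    ref = np.asarray(ref, dtype=np.float64)
    return float(np.mean(np.abs(recon - ref)) * scale)

# Toy example: a uniform chroma shift of 2/4095 yields a MAD of ~2 code
# values, well inside the delta = 8 tolerance discussed above.
mad = chroma_mad([0.5 + 2 / 4095] * 8, [0.5] * 8)
```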


In some embodiments herein, a number of neutral color patches (N) is multiplied by a weighting factor (W). In some embodiments, N is fixed and W can be adjusted to provide neutral color preservation. In other embodiments, W can be fixed and the number of neutral color patches can be increased (for example, to W*N). Typically, fixing N is less computationally intensive. Fixing W bypasses the steps of determining W, and fixing W to 1 would not require multiplying W with N in the calculations.
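The weighted combination described above, where neutral-color statistics scaled by W are added to the content statistics before solving the least-squares system for the MMR coefficients, can be sketched as follows. The matrix contents here are random placeholders standing in for the real d3DMT and pseudo-neutral-patch statistics, and the system size K is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 4                                        # number of MMR terms (illustrative)
A_cc = rng.random((K, K)) + K * np.eye(K)    # content color matrix (placeholder)
b_cc = rng.random((K, 1))                    # content observation vector (placeholder)
A_nc = np.eye(K)                             # neutral-color matrix (placeholder)
b_nc = 0.5 * np.ones((K, 1))                 # neutral observation: chroma at center

w = 5.0                                      # weighting factor from the saturation mapping

# Bias the normal equations with the weighted neutral-color statistics,
# then solve A x = b for the MMR coefficients (x = A^-1 b).
A = A_cc + w * A_nc
b = b_cc + w * b_nc
mmr_coeffs = np.linalg.solve(A, b)
```

Raising w pulls the solved coefficients toward mapping neutral inputs to neutral outputs, at the cost of some fit to the content statistics, which is exactly the trade-off shown in FIG. 7.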


Hardware Embodiments


One embodiment for the system is as part of a codec used in an encoder-decoder system. The MMR coefficients can be solved in an encoder (a processor/machine configured to receive an input of images—individual and/or video—and output a data stream with encoded images and metadata) with metadata sent to a decoder that either includes the coefficients themselves or includes a pointer to one of a set of pre-determined coefficients stored in the decoder. For example, the decoder can include a library of MMR coefficient sets covering various expected bias levels, the metadata informing the decoder which set to use for compatibility.
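A hypothetical sketch of the decoder-side selection just described: the metadata either carries the MMR coefficients directly or carries a pointer into a pre-stored library. All names, keys, and coefficient values here are illustrative assumptions, not a real bitstream syntax.

```python
# Pre-stored library of MMR coefficient sets, indexed by a metadata pointer.
# The sets below are placeholders for pre-computed sets at different bias levels.
coeff_library = {
    0: [0.0, 1.0, 0.0],
    1: [0.1, 0.9, 0.0],
    2: [0.2, 0.8, 0.1],
}

def select_mmr_coeffs(metadata):
    """Return MMR coefficients either carried directly in the metadata
    ("mmr_coeffs") or looked up from the decoder's library via a pointer
    ("mmr_pointer")."""
    if "mmr_coeffs" in metadata:
        return metadata["mmr_coeffs"]
    return coeff_library[metadata["mmr_pointer"]]

coeffs = select_mmr_coeffs({"mmr_pointer": 1})
```

Shipping only a pointer keeps the metadata payload small at the cost of restricting the decoder to the pre-stored bias levels.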


As described herein, an embodiment of the present invention may thus relate to one or more of the example embodiments, which are enumerated below. Accordingly, the invention may be embodied in any of the forms described herein, including, but not limited to, the following Enumerated Example Embodiments (EEEs), which describe the structure, features, and functionality of some portions of the present invention:


EEE1. A method for encoding image data with neutral color preservation, the method comprising: building a dynamic 3D mapping table (d3DMT) from a first reference image and a second reference image, the first reference image having a different dynamic range than the second reference image; computing content color matrices based on the d3DMT; computing a content saturation from the d3DMT; computing at least one weighting factor from the content saturation; building a neutral color set; computing neutral color matrices based on the neutral color set; combining the content color matrices, the at least one weighting factor, and the neutral color matrices to solve multiple-channel multiple regression (MMR) coefficients; and providing metadata containing data related to the MMR coefficients to a decoder to allow backward compatibility for a single-layer bitstream.


EEE2. The method of EEE 1, wherein: the method is configured to provide the MMR coefficients which comprise data related to both forward path MMR coefficients and backward path MMR coefficients, wherein the forward path MMR coefficients are independently determined from the backward path MMR coefficients.


EEE3. The method of EEE 1, wherein: the method is configured to provide the MMR coefficients which comprise data related to only backward path MMR coefficients.


EEE4. The method of any of EEEs 1 to 3, wherein the content color matrices are in the form of A and b, where A is based on a vector of all non-zero bins and b is based on a vector of all non-zero bins and an observation chroma signal.


EEE5. The method of any of EEEs 1 to 4, wherein the MMR coefficients are evaluated using a least squared solution algorithm.


EEE6. The method of EEE 5, wherein the MMR coefficients are in the form of A−1b.


EEE7. The method of any of EEEs 1 to 6, wherein the combining includes multiplying one of the neutral color matrices with one of the at least one weighting factor and adding that product to a corresponding one of the content color matrices.


EEE8. The method of any of EEEs 1 to 7, wherein the building the neutral color set comprises building a plurality of pseudo neutral color patches which do not occur in either the first reference image or the second reference image.


EEE9. The method of any of EEEs 1 to 8, wherein the at least one weighting factor comprises a forward path weighting factor and a backward path weighting factor.


EEE10. The method of any of EEEs 1 to 9, wherein the metadata includes a pointer to a set of MMR coefficients stored in the decoder.


EEE11. The method of any of EEEs 1 to 10, wherein computing at least one weighting factor includes comparing the content saturation to experimental data.


EEE12. The method of EEE 11, further comprising setting a linear function to envelope data points in the experimental data.


EEE13. An encoder, configured to perform the method of any one of EEEs 1 to 12, said encoder comprising: a processor; a signal input configured to receive the image; and a signal output configured to send the metadata.


EEE14. A method for encoding image data with neutral color preservation, the method comprising: building a dynamic 3D mapping table (d3DMT) from a first reference image and a second reference image, the first reference image having a different dynamic range than the second reference image; computing content color matrices based on the d3DMT; building a neutral color set; computing neutral color matrices based on the neutral color set; combining the content color matrices and the neutral color matrices to solve multiple-channel multiple regression (MMR) coefficients; and providing metadata containing data related to the MMR coefficients to a decoder to allow backward compatibility for a single-layer bitstream.


Equivalents, Extensions, Alternatives and Miscellaneous

Example embodiments that relate to neutral color preservation for single-layer backward compatible coding of HDR video are thus described. In the foregoing specification, embodiments of the present invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what is the invention, and what is intended by the applicants to be the invention, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage or attribute that is not expressly recited in a claim should limit the scope of such claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Claims
  • 1. A method for encoding image data with neutral color preservation, the method comprising: building a dynamic 3D mapping table (d3DMT) from a first reference image and a second reference image, the first reference image having a different dynamic range than the second reference image; computing content color matrices based on the d3DMT; computing a content saturation from the d3DMT; computing at least one weighting factor from the content saturation; building a neutral color set comprising a plurality of neutral color image patches; computing neutral color matrices based on the neutral color set; combining the content color matrices, the at least one weighting factor, and the neutral color matrices to solve multiple-channel multiple regression (MMR) coefficients; and providing metadata containing data related to the MMR coefficients to a decoder to allow backward compatibility for a single-layer bitstream.
  • 2. The method of claim 1, wherein the building a dynamic 3D mapping table (d3DMT) from a first reference image and a second reference image comprises determining 3D histogram bins having a non-zero number of pixels.
  • 3. The method of claim 1, wherein the content color matrices are in the form of ACC and bCC, where ACC is based on a vector of all non-zero bins and bCC is based on a vector of all non-zero bins and an observation chroma signal.
  • 4. The method of claim 1, wherein building the neutral color set comprises generating a neutral color patch vector that collects a plurality of neutral color patches, wherein in a neutral color patch, luminance (Y) values are in [0, 1) and chroma values (Cb, Cr) are fixed at 0.5.
  • 5. The method of claim 1, wherein the neutral color matrices are in the form of ANC and bNC, where ANC is based on the neutral color patch vector and bNC is based on the neutral color patch vector and an observation chroma signal of neutral color represented by the center of all color axes.
  • 6. The method of claim 1, wherein the combining includes multiplying one of the neutral color matrices with one of the at least one weighting factor and adding that product to a corresponding one of the content color matrices.
  • 7. The method of claim 1, wherein: the method is configured to provide the MMR coefficients which comprise data related to both forward path MMR coefficients and backward path MMR coefficients, wherein the forward path MMR coefficients are independently determined from the backward path MMR coefficients.
  • 8. The method of claim 1, wherein: the method is configured to provide the MMR coefficients which comprise data related to only backward path MMR coefficients.
  • 9. The method of claim 1, wherein the MMR coefficients are determined using a least squared solution algorithm.
  • 10. The method of claim 9, wherein the MMR coefficients are in the form of A−1b.
  • 11. The method of claim 1, wherein the at least one weighting factor comprises a forward path weighting factor and a backward path weighting factor.
  • 12. The method of claim 1, wherein the metadata includes a pointer to a set of MMR coefficients stored in the decoder.
  • 13. The method of claim 1, wherein computing at least one weighting factor includes comparing the content saturation to experimental data.
  • 14. The method of claim 13, wherein the computing at least one weighting factor from the content saturation comprises finding a piecewise linear function to envelope data points in the experimental data, and wherein the piecewise linear function maps the measured content saturation to the at least one weighting factor.
  • 15. An encoder, configured to perform the method of claim 1, said encoder comprising: a processor; a signal input configured to receive the image; and a signal output configured to send the metadata.
Priority Claims (1)
Number Date Country Kind
22175096 May 2022 EP regional
CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a U.S. National Stage application under 35 U.S.C. § 371 of International Application No. PCT/US2023/022581, filed on May 17, 2023, which claims the benefit of priority to European patent application 22 175 096.1 (reference: D22021EP) and U.S. Provisional patent application Ser. No. 63/345,161 (reference: D22021USP1), both filed on 24 May 2022, each of which is incorporated by reference in its entirety.

PCT Information
Filing Document Filing Date Country Kind
PCT/US2023/022581 5/17/2023 WO
Publishing Document Publishing Date Country Kind
WO2023/229898 11/30/2023 WO A
US Referenced Citations (5)
Number Name Date Kind
8811490 Su Aug 2014 B2
11277627 Song Mar 2022 B2
20180242006 Kerofsky Aug 2018 A1
20210195221 Song Jun 2021 A1
20230171436 Song Jun 2023 A1
Foreign Referenced Citations (1)
Number Date Country
2021076822 Apr 2021 WO
Non-Patent Literature Citations (1)
Entry
SMPTE ST 2084:2014 “High Dynamic Range EOTF of Mastering Reference Displays” Aug. 16, 2014, pp. 1-15, 15 pages.
Provisional Applications (1)
Number Date Country
63345161 May 2022 US