FACE REGION DETECTION AND LOCAL RESHAPING ENHANCEMENT

Information

  • Patent Application
  • Publication Number: 20240428612
  • Date Filed: July 25, 2022
  • Date Published: December 26, 2024
Abstract
Methods and corresponding systems to process face regions are disclosed. The described methods include providing face bounding boxes and confidence levels for the faces, generating a histogram of all pixels and histograms of the faces, generating a probability of face, and generating a face probability map. A face contrast adjustment and a face saturation adjustment can be applied to the face probability map.
Description
TECHNICAL FIELD

The present disclosure relates in general to video image processing. In particular, this disclosure relates to face region detection and local reshaping enhancement.


BACKGROUND

Face detection methods have been used in various applications that identify human faces in images and/or videos. In some of the existing face region detection methods, the face region can be detected by skin tone. Some methods based on graph-cut or graphical models may use the bounding boxes of faces to predict segmentation of faces in images. Based on recently developed techniques, deep convolutional neural networks for semantic and instance segmentation tasks can be used for face region detection.


SUMMARY

The disclosed methods and devices provide an efficient framework to detect face regions in images given bounding boxes of faces and to apply different adjustments to the face regions in local reshaping. The detection of the face region is based on histogram analysis of the faces and can be efficiently extended to continuous frames in video clips. When applying the detected face region to local reshaping, the contrast and saturation of faces can be adjusted separately from other image contents to avoid over-enhancement of details, such as wrinkles or spots, on faces.


An embodiment of the present invention is a method of face region detection in an input image including one or more faces, the method comprising: providing face bounding boxes and confidence levels for each face of the one or more faces; based on the input image, generating a histogram of all pixels; based on the input image and the face bounding boxes, generating histograms of the one or more faces; based on the histogram of all pixels and the histograms of the one or more faces, generating a probability of face; and based on the probability of face, generating a face probability map. Another embodiment of the present invention utilizes the face region detection of the previous embodiment to apply local reshaping by applying a face saturation adjustment and a face contrast adjustment to the face probability map to generate an adjusted face probability map, and generating a reshaped image based on the adjusted face probability map and one or more selected reshaping functions.


A method may be computer-implemented in some embodiments. For example, the method may be implemented, at least in part, via a control system comprising one or more processors and one or more non-transitory storage media.


Some or all of the methods described herein may be performed by one or more devices according to instructions (e.g. software) stored on one or more non-transitory media. Such non-transitory media may include memory devices such as those described herein, including but not limited to random access memory (RAM) devices, read-only memory (ROM) devices, etc. Accordingly, various innovative aspects of the subject matter described in this disclosure may be implemented in a non-transitory medium having software stored thereon. The software may, for example, be executable by one or more components of a control system such as those disclosed herein. The software may, for example, include instructions for performing one or more of the methods disclosed herein.


At least some aspects of the present disclosure may be implemented via an apparatus or apparatuses. For example, one or more devices may be configured for performing, at least in part, the methods disclosed herein. In some implementations, an apparatus may include an interface system and a control system. The interface system may include one or more network interfaces, one or more interfaces between the control system and memory system, one or more interfaces between the control system and another device and/or one or more external device interfaces. The control system may include at least one of a general-purpose single- or multi-chip processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. Accordingly, in some implementations the control system may include one or more processors and one or more non-transitory storage media operatively coupled to one or more processors.


Details of one or more implementations of the subject matter described in this specification are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages will become apparent from the description, the drawings, and the claims. Note that the relative dimensions of the following figures may not be drawn to scale. Like reference numbers and designations in the various drawings generally indicate like elements, but different reference numbers do not necessarily designate different elements between different drawings.





BRIEF DESCRIPTION OF DRAWINGS


FIG. 1 shows an exemplary diagram of the face region detection and local reshaping with face adjustment according to an embodiment of the present disclosure.



FIG. 2 shows an exemplary diagram of the face region detection process according to an embodiment of the present disclosure.



FIG. 3 shows an exemplary diagram of generating global generic histograms according to an embodiment of the present disclosure.



FIG. 4 shows an image with detected faces according to an embodiment of the present disclosure.



FIG. 5 shows an exemplary diagram of generating individual histograms of faces in an image according to an embodiment of the present disclosure.



FIG. 6 shows an exemplary diagram of calculating the initial probability of face according to an embodiment of the present disclosure.



FIG. 7 shows an exemplary diagram of the adaptive sorting and probability propagation process according to an embodiment of the present disclosure.



FIGS. 8A to 8D show example graphs and histograms related to the present disclosure. FIG. 8A shows an exemplary graph of probability of face and FIG. 8B shows an exemplary histogram of non-face according to an embodiment of the present disclosure. FIG. 8C shows an exemplary histogram of truly non-face and FIG. 8D shows an exemplary graph of updated probability of non-face according to an embodiment of the present disclosure.



FIG. 9 shows an exemplary diagram illustrating the details of the local post processing step according to an embodiment of the present disclosure.



FIG. 10 shows an exemplary diagram of the local reshaping according to an embodiment of the present disclosure.





DETAILED DESCRIPTION

The previous methods of facial recognition for image processing have drawbacks for video. For example, skin tone detection does not generalize well, because skin tone varies between different people and different lighting conditions. Predicting segmentation is computationally expensive for video. Neural networks can introduce flickering artifacts in downstream operations due to missed detections and temporal inconsistency. The systems and methods provided herein avoid those deficiencies.


As used herein, “face bounding box” refers to an imaginary (non-drawn) rectangle that serves as a point of reference for a face detected by a face detection algorithm.


As used herein, “histogram of a face” refers to grouped data for a detected face image.


As used herein, “face probability map” refers to a pixel mapping of an image to the probabilities of each pixel individually being part of a face.


As used herein, “basic face shape” or “basic face shape model” refers to a shape (e.g. an ellipse) that represents generally the size and shape of a detected face and a “basic face shape map” refers to a pixel mapping of basic face shapes in an image.


As used herein, “probability of face” and “probability of non-face” refer to the calculated probability of a pixel being in a face or not in a face respectively.


As used herein, “soft morphological operation” refers to non-linear operations related to the shape or morphology of features in an image where the maximum and the minimum operations used in standard gray-scale morphology are replaced by weighted order statistics.


As used herein, “face adjustment” refers to applying reshaping operations on the detected face regions of an image.


As shown in the exemplary embodiment of FIG. 1, the disclosed method comprises face region detection (100). Given an input image (11) and pre-detected face bounding boxes (10), the histogram (12) properties are analyzed and the probability (13) of face is predicted for each bin in the histogram. Local post-processing (14) is then applied, and as a result, smoothness is improved and small noise in the generated face probability map (15) is removed.


Local reshaping (100′) processing can then be applied. With the face probability map (15), different local reshaping (17) operations are applied on the face region. The contrast and saturation in the face region are adjusted (16) so that the face looks natural and visually pleasant in the reshaped image (18). In an embodiment, local reshaping methods like those proposed in U.S. Prov. App. Ser. No. 63/086,699, “Adaptive Local Reshaping For SDR-To-HDR Up-Conversion,” filed on Oct. 2, 2020 by the applicant of the present disclosure and incorporated herein by reference in its entirety, can be used. With this method, the contrast and saturation of each pixel can be easily adjusted.


With continued reference to FIG. 1, the disclosed methods may also be integrated with an existing linear encoding architecture for local reshaping (see e.g. U.S. Prov. App. Ser. No. 63/086,699 mentioned above) to address real-world conversion scenarios. The presented methods take advantage of the sliding window in the linear encoding architecture to enhance the temporal stability of the final video quality.


A. Face Region Detection


FIG. 2 shows an exemplary diagram of the face region detection process according to an embodiment of the present disclosure. Given that the colors of faces are most likely different from the colors of other contents in the same image, such process is based on analyzing the histograms of face in YUV color space. Given an input image (201) and pre-detected bounding boxes of faces (200), as shown in histogram analysis step (220), the generic histograms (202) of face and all pixels and the individual histograms (204) of each detected face in YUV color space are first calculated. A basic face shape model (203) is used to generate the basic shape map, which is an initial guess of face region in the input image (201) for calculating the histograms. As part of step (230), the initial probability (205) of face in YUV color space is calculated from the generic histograms (202). Adaptive sorting (206) may then be used to refine the initial probability of face based on the individual histograms (204) of each face and the generic histograms (202). The probability of face then is iteratively updated and propagated (207) in YUV color space.


With further reference to FIG. 2, as shown in the local post-processing step (240), given the refined probability of face, local smoothing (208) is first performed to avoid artifacts due to abrupt probability changes in the image. This is followed by iterative application of a soft morphological operation (209) to remove small noise from the final face probability map (215) of the input image (201). The pre-detected bounding boxes (200) of faces may come from any face detector that predicts the bounding boxes and corresponding detection scores of faces. In what follows, details of the various steps shown in the embodiment of FIG. 2 will be described.


A.1 Histogram Analysis

According to the teachings of the present disclosure, as part of the histogram analysis, a face shape model is used to generate the initial guess of face region for calculating the generic histogram of face. In order to capture the diversity of colors in different faces in the same image, the individual histogram of each face is also calculated.


A.1.1 Global Generic Histograms


FIG. 3 shows an exemplary diagram of generating global generic histograms according to an embodiment of the present disclosure. The generic histograms refer to the histogram of all faces or the histogram of all pixels. To calculate the generic histogram of face in the input image (31), the region of face is defined first. Based on the already detected bounding boxes (30) of faces, each of them may be filled with the average shape of face, i.e. basic face shape (32), to get an initial guess of face region. Given an input image (31), S, of size W×H and Nface detected faces with bounding boxes (ck,xk,yk,wk,hk), k=0, . . . , Nface−1, where ck is the detection score between 0 and 1, (xk,yk) is the coordinate (either in integer or floating point) of the top-left corner of the bounding box, and (wk,hk) is the size (either in integer or floating point) of the bounding box of the k-th detected face, a basic shape map (33), MQ, can be generated. Such basic shape map is the initial guess of face region, using the pre-defined or pre-trained basic face shape (32) model, denoted as Q. The basic face shape (32) model is the probability map of face inside the detected bounding box. It can also be viewed as the average shape of face. As an example, a basic face shape (32) model, Q, can be a solid inscribed ellipse of the bounding box, i.e. 1 inside the ellipse and 0 outside the ellipse. As another example, the basic face shape (32) model can also be learned from training data of segmentation of faces. In general, a basic face shape (32) model can be saved as a probability map of size WQ×HQ and resized for each detected face.


With further reference to FIG. 3, for the k-th detected face, the face shape model may be resized and shifted to fit into the bounding box (30), yielding the probability map of face MQ,k. The probability map is then multiplied by the detection score ck to reduce the effect of false positive detections, which usually have lower detection scores. The probability maps of all detected faces are then added to form the basic shape map (33), MQ. The maximum value of MQ may be clipped to 1 in case there are overlapping bounding boxes. If there are any letterboxes in MQ, such letterboxes are excluded. Given the probability map of non-active region (padded black areas with arbitrary shapes, such as letterboxes, pillar boxes, circles, or any other shapes) ML obtained from a non-active region detector such as the one described in U.S. Prov. App. Ser. No. 63/209,602, “Surround Area Detection And Blending For Image Filtering,” filed on Jun. 11, 2021 by the applicant of the present disclosure and incorporated herein by reference in its entirety, the probability map of region of interest (ROI) can be defined as MROI = 1 − ML and then multiplied by MQ. Therefore, the final MQ can be formulated as:










MQ = MROI .* min( Σk ck·MQ,k , 1 )    (1)







where operator .* is element-wise multiplication. In order to further clarify the above-disclosed teachings, reference is made to FIG. 4 showing an image (400) wherein four faces have been detected. Image (400) includes image main area (401) and letterbox (402). A basic shape map (403) and a face bounding box (404) associated with the four faces are also shown.


Referring back to FIG. 3, since the bounding boxes (30) from the face detector may not always be perfect, the actual face region may extend outside the bounding boxes (30). Therefore, the probability of face outside a bounding box may not be 0. In this case, the bounding box may be enlarged about its fixed center by scaling factors ƒbox,x and ƒbox,y for the x and y directions before fitting the basic face shape model. The pseudocode below shows an example of how the basic shape map is generated from a basic face shape model of an inscribed ellipse, followed by the variant for an arbitrary-shape model:














Generate the basic shape map from basic face shape model of inscribed ellipse

Input: detected face bounding boxes (ck, xk, yk, wk, hk), k = 0, ..., Nface − 1, probability map of ROI MROI, scaling factors fbox,x, fbox,y
Output: basic shape map MQ

// Initialization
for (i = 0; i < H; i++) {
 for (j = 0; j < W; j++) {
  MQ(i, j) = 0
 }
}
// Add probability of each detected face to map
for (k = 0; k < Nface; k++) {
 // Skip invalid detection
 if (wk == 0 or hk == 0 or ck == 0) {
  continue
 }
 // Center and half of the width and height of the rectangle to fit
 cx = xk + wk/2
 cy = yk + hk/2
 sx = fbox,x * wk/2
 sy = fbox,y * hk/2
 // Location and size of the rectangle to fit
 xf = cx − sx
 yf = cy − sy
 wf = 2sx
 hf = 2sy
 // Valid pixel range
 xbegin = max(round(xf), 0)
 xend = min(round(xf + wf), W)
 ybegin = max(round(yf), 0)
 yend = min(round(yf + hf), H)
 // Fill in basic face shape map
 for (i = ybegin; i < yend; i++) {
  for (j = xbegin; j < xend; j++) {
   // Solid ellipse
   if ((i − cy)^2/(sy)^2 + (j − cx)^2/(sx)^2 ≤ 1) {
    MQ(i, j) = MQ(i, j) + ck
   }
  }
 }
}
// Clip maximum to 1
MQ = min(MQ, 1)
// Apply ROI
MQ = MQ .* MROI
return MQ

Generate the basic shape map from basic face shape model of arbitrary shape

Input: detected face bounding boxes (ck, xk, yk, wk, hk), k = 0, ..., Nface − 1, probability map of ROI MROI, basic face shape model Q, scaling factors fbox,x, fbox,y
Output: basic shape map MQ

// Initialization
for (i = 0; i < H; i++) {
 for (j = 0; j < W; j++) {
  MQ(i, j) = 0
 }
}
// Add probability of each detected face to map
for (k = 0; k < Nface; k++) {
 // Skip invalid detection
 if (wk == 0 or hk == 0 or ck == 0) {
  continue
 }
 // Center and half of the width and height of the rectangle to fit
 cx = xk + wk/2
 cy = yk + hk/2
 sx = fbox,x * wk/2
 sy = fbox,y * hk/2
 // Location and size of the rectangle to fit
 xf = cx − sx
 yf = cy − sy
 wf = 2sx
 hf = 2sy
 // Valid pixel range
 xbegin = max(round(xf), 0)
 xend = min(round(xf + wf), W)
 ybegin = max(round(yf), 0)
 yend = min(round(yf + hf), H)
 // Fill in basic face shape map
 for (i = ybegin; i < yend; i++) {
  for (j = xbegin; j < xend; j++) {
   // Coordinate in basic face shape model
   im = clip3(round((i − yf) * HQ/hf), 0, HQ − 1)
   jm = clip3(round((j − xf) * WQ/wf), 0, WQ − 1)
   // Add to basic shape map
   MQ(i, j) = MQ(i, j) + ck * Q(im, jm)
  }
 }
}
// Clip maximum to 1
MQ = min(MQ, 1)
// Apply ROI
MQ = MQ .* MROI
return MQ









With continued reference to FIG. 3, given the face region defined in the basic shape map (33), the generic histogram of face (35) and the generic histogram of all pixels (34) can be calculated. According to an embodiment of the present disclosure, the generic histogram of face (35) is calculated as a weighted count of pixels, where the weight comes from the basic shape map (33). On the other hand, the generic histogram of all pixels (34) is a histogram count over all pixels in the ROI. For computational efficiency, the pixels may be subsampled during counting with a subsample factor shist; as an example, shist may be set as shist = 2. The histograms of the input image (31) of size H×W may be calculated in YUV color space. The YUV channels of the input image are denoted as SY, SU and SV, respectively, and the numbers of bins for each channel as NbinY, NbinU and NbinV, respectively. For the input bit depth BS, the bin width for each channel is calculated as wbinY = 2^BS/NbinY, wbinU = 2^BS/NbinU and wbinV = 2^BS/NbinV. Exemplary values for BS, NbinY, NbinU and NbinV are BS = 10 and NbinY = NbinU = NbinV = 128. For different YUV input formats, the corresponding pixel locations in each channel may be needed. For YUV420 input, the Y channel may be saved as a W×H array and the U and V channels may be saved as Whalf×Hhalf arrays, where Whalf = W/2 and Hhalf = H/2. Therefore, ShalfU and ShalfV are used to represent the down-sampled U and V channels, respectively. For computational efficiency, the pixel location (i, j) in SY may be matched to (└i/2┘, └j/2┘) in ShalfU and ShalfV. For other YUV formats, the adjustment may be made accordingly. The following pseudocode shows an example of how the generic histograms of face and all pixels are calculated for YUV420 input:














// Calculate the generic histograms of face and all pixels for YUV420 input

Input: YUV channels of input image SY, ShalfU and ShalfV, basic shape map MQ, probability map of ROI MROI, subsample factor shist
Output: generic histogram of face histface, generic histogram of all pixels histall

// Initialization
histface = zeros(NbinY, NbinU, NbinV)
histall = zeros(NbinY, NbinU, NbinV)
// Weighted count
for (i = 0; i < H; i += shist) {
 for (j = 0; j < W; j += shist) {
  ihalf = └i/2┘ // convert index for YUV420 input
  jhalf = └j/2┘ // convert index for YUV420 input
  bY = └SY(i, j)/wbinY┘
  bU = └ShalfU(ihalf, jhalf)/wbinU┘
  bV = └ShalfV(ihalf, jhalf)/wbinV┘
  histface(bY, bU, bV) = histface(bY, bU, bV) + MQ(i, j)
  histall(bY, bU, bV) = histall(bY, bU, bV) + MROI(i, j)
 }
}
return histface, histall









A.1.2 Local Individual Histogram of Face

In addition to the global generic histogram of all faces, the local individual histogram of each face is also considered to capture the variation of each face. This is illustrated by an exemplary diagram shown in FIG. 5. For each face, the basic face shape (52) model is used to find the probability inside the face bounding box (50) on the basis of an input image (51), in the same manner as in constructing the basic shape map (33) of FIG. 3, and this is followed by performing a weighted count. However, storing all the individual histograms (54) may take a huge amount of memory if there are many faces in one frame, and the situation may become more severe if the histograms from multiple frames are stored. Therefore, to save memory, the individual histograms (54) of each face are preliminarily trimmed (53) while keeping as much of the pixel count as possible. In what follows, an exemplary trimming process is described in more detail.


With further reference to FIG. 5, for the k-th face, given the original histogram histface,k, the trimmed histogram h̃istface,k is a subarray of size ÑbinY×ÑbinU×ÑbinV starting at bins (bstart,kY, bstart,kU, bstart,kV). This is shown in the following equation:











h̃istface,k = histface,k(bstart,kY : bstart,kY + ÑbinY, bstart,kU : bstart,kU + ÑbinU, bstart,kV : bstart,kV + ÑbinV)    (2)







In addition, the keeping ratio rkeep,k of the trimmed histogram, i.e. the ratio of the total pixel count after trimming to that before trimming, may be recorded for future use. Such ratio can be obtained as follows:










rkeep,k = sum(h̃istface,k) / sum(histface,k)    (3)







To trim the histogram optimally, one would find the contiguous block of bins of size ÑbinY×ÑbinU×ÑbinV inside which the summation of the histogram is maximum. However, searching the 3-D histogram directly is computationally expensive. Therefore, the histogram may be trimmed one channel at a time, in the order of the Y, U and V channels, as sketched below. Exemplary parameters are ÑbinY = 64 and ÑbinU = ÑbinV = 16 for all faces. With these settings, most faces may have a keeping ratio larger than, for example, 90%.
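As an illustration only (not part of the disclosure), the per-channel window search may be sketched in Python as follows; the helper name best_window and the numpy-based layout are assumptions of this sketch:

import numpy as np

def best_window(hist_1d, n_keep):
    # Prefix sums give the count inside every contiguous window of n_keep bins.
    csum = np.concatenate(([0.0], np.cumsum(hist_1d)))
    window_sums = csum[n_keep:] - csum[:-n_keep]
    b_start = int(np.argmax(window_sums))                  # starting bin of the best window
    r_keep = window_sums[b_start] / max(csum[-1], 1e-12)   # per-channel keeping ratio
    return b_start, r_keep

Applying best_window to the Y marginal, then to the U marginal of the Y-trimmed histogram, and then to the V marginal reproduces the channel-at-a-time order described above; the product of the three per-channel ratios gives the overall keeping ratio, consistent with the trim_histogram pseudocode later in this section.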


Continuing with the trimming process disclosed above, and in view of possible memory limitations, a maximum number of faces, Nface,max, may be set for storing individual histograms. As such, when Nface > Nface,max, only the Nface,max most important faces are kept. Because larger faces in an image usually attract more attention, the size of the bounding boxes may be used as a measure of importance. Additionally, the detection score of the bounding boxes may be considered to avoid false detections. Therefore, the importance of each face may be defined based on its area and detection score as shown in the following equation:










ak = ck² · min( wk·hk / (W·H/Nface,max) , 1 )    (4)







where the area is normalized by W*H/Nface,max and clipped to 1 because if a face is large enough, it is deemed important. The term Nface,max is put in the denominator because the more faces can be kept, the smaller the faces that can be considered. The top Nface,max faces with the highest importance are selected. An exemplary value is Nface,max = 16.
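As an illustration only, the ranking of Equation (4) might be sketched in Python as follows; the helper name select_important_faces and the row layout (ck, xk, yk, wk, hk) as a numpy array are assumptions of this sketch:

import numpy as np

def select_important_faces(boxes, W, H, n_face_max=16):
    # boxes: array of shape (N_face, 5) with rows (c_k, x_k, y_k, w_k, h_k)
    c, w, h = boxes[:, 0], boxes[:, 3], boxes[:, 4]
    # a_k = c_k^2 * min(w_k*h_k / (W*H/N_face_max), 1)   -- Equation (4)
    a = c**2 * np.minimum(w * h / (W * H / n_face_max), 1.0)
    keep = np.argsort(a)[::-1][:n_face_max]   # top N_face,max most important faces
    return boxes[keep]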


With reference to FIG. 5, the following pseudocode shows an example of how the individual histograms (54) of each face are calculated using the basic face shape (52) model of an inscribed ellipse for YUV420 input, followed by the variant for an arbitrary-shape model and the histogram trimming routine:















  
// Calculate the individual histograms of each face using basic face shape model of inscribed ellipse for YUV420 input

Input: YUV channels of input image SY, ShalfU and ShalfV, detected face bounding boxes (ck, xk, yk, wk, hk), k = 0, ..., Nface − 1, probability map of ROI MROI, scaling factors fbox,x, fbox,y
Output: trimmed individual histograms h̃istface,k, starting bins bstart,kY, bstart,kU, bstart,kV, keeping ratio rkeep,k of each face, k = 0, ..., Nface − 1

for (k = 0; k < Nface; k++) {
 // Initialize
 histface,k = zeros(NbinY, NbinU, NbinV)
 h̃istface,k = zeros(ÑbinY, ÑbinU, ÑbinV)
 rkeep,k = 0
 // Skip invalid detection
 if (wk == 0 or hk == 0 or ck == 0) {
  continue
 }
 // Center and half of the width and height of the rectangle to fit
 cx = xk + wk/2
 cy = yk + hk/2
 sx = fbox,x * wk/2
 sy = fbox,y * hk/2
 // Location and size of the rectangle to fit
 xf = cx − sx
 yf = cy − sy
 wf = 2sx
 hf = 2sy
 // Valid pixel range
 xbegin = max(round(xf), 0)
 xend = min(round(xf + wf), W)
 ybegin = max(round(yf), 0)
 yend = min(round(yf + hf), H)
 // Weighted count
 for (i = ybegin; i < yend; i++) {
  for (j = xbegin; j < xend; j++) {
   // Solid ellipse
   if ((i − cy)^2/(sy)^2 + (j − cx)^2/(sx)^2 ≤ 1) {
    ihalf = └i/2┘ // convert index for YUV420 input
    jhalf = └j/2┘ // convert index for YUV420 input
    bY = └SY(i, j)/wbinY┘
    bU = └ShalfU(ihalf, jhalf)/wbinU┘
    bV = └ShalfV(ihalf, jhalf)/wbinV┘
    histface,k(bY, bU, bV) = histface,k(bY, bU, bV) + ck * MROI(i, j)
   }
  }
 }
 // Trim histogram
 h̃istface,k, bstart,kY, bstart,kU, bstart,kV, rkeep,k = trim_histogram(histface,k)
}
return h̃istface,k, bstart,kY, bstart,kU, bstart,kV, rkeep,k, k = 0, ..., Nface − 1

// Calculate the individual histograms of each face using basic face shape model of arbitrary shape for YUV420 input

Input: YUV channels of input image SY, ShalfU and ShalfV, detected face bounding boxes (ck, xk, yk, wk, hk), k = 0, ..., Nface − 1, probability map of ROI MROI, basic face shape model Q, scaling factors fbox,x, fbox,y
Output: trimmed individual histograms h̃istface,k, starting bins bstart,kY, bstart,kU, bstart,kV, keeping ratio rkeep,k of each face, k = 0, ..., Nface − 1

for (k = 0; k < Nface; k++) {
 // Initialize
 histface,k = zeros(NbinY, NbinU, NbinV)
 h̃istface,k = zeros(ÑbinY, ÑbinU, ÑbinV)
 rkeep,k = 0
 // Skip invalid detection
 if (wk == 0 or hk == 0 or ck == 0) {
  continue
 }
 // Center and half of the width and height of the rectangle to fit
 cx = xk + wk/2
 cy = yk + hk/2
 sx = fbox,x * wk/2
 sy = fbox,y * hk/2
 // Location and size of the rectangle to fit
 xf = cx − sx
 yf = cy − sy
 wf = 2sx
 hf = 2sy
 // Valid pixel range
 xbegin = max(round(xf), 0)
 xend = min(round(xf + wf), W)
 ybegin = max(round(yf), 0)
 yend = min(round(yf + hf), H)
 // Weighted count
 for (i = ybegin; i < yend; i++) {
  for (j = xbegin; j < xend; j++) {
   ihalf = └i/2┘ // convert index for YUV420 input
   jhalf = └j/2┘ // convert index for YUV420 input
   bY = └SY(i, j)/wbinY┘
   bU = └ShalfU(ihalf, jhalf)/wbinU┘
   bV = └ShalfV(ihalf, jhalf)/wbinV┘
   // Coordinate in basic face shape model
   im = clip3(round((i − yf) * HQ/hf), 0, HQ − 1)
   jm = clip3(round((j − xf) * WQ/wf), 0, WQ − 1)
   histface,k(bY, bU, bV) = histface,k(bY, bU, bV) + ck * Q(im, jm) * MROI(i, j)
  }
 }
 // Trim histogram
 h̃istface,k, bstart,kY, bstart,kU, bstart,kV, rkeep,k = trim_histogram(histface,k)
}
return h̃istface,k, bstart,kY, bstart,kU, bstart,kV, rkeep,k, k = 0, ..., Nface − 1

// Trim histogram

Input: histogram hist, numbers of bins in trimmed histogram ÑbinY, ÑbinU, ÑbinV
Output: trimmed histogram h̃ist, starting bins bstartY, bstartU, bstartV, keeping ratio rkeep

// Trim Y channel
histY = sum(hist, axis = [1, 2]) // summation along U and V axes
bstartY = arg max_b sum(histY(b : b + ÑbinY))
rkeepY = sum(histY(bstartY : bstartY + ÑbinY))/sum(histY)
// Trim U channel
histU = sum(hist(bstartY : bstartY + ÑbinY, :, :), axis = [0, 2]) // summation along Y and V axes
bstartU = arg max_b sum(histU(b : b + ÑbinU))
rkeepU = sum(histU(bstartU : bstartU + ÑbinU))/sum(histU)
// Trim V channel
histV = sum(hist(bstartY : bstartY + ÑbinY, bstartU : bstartU + ÑbinU, :), axis = [0, 1]) // summation along Y and U axes
bstartV = arg max_b sum(histV(b : b + ÑbinV))
rkeepV = sum(histV(bstartV : bstartV + ÑbinV))/sum(histV)
// Final output
h̃ist = hist(bstartY : bstartY + ÑbinY, bstartU : bstartU + ÑbinU, bstartV : bstartV + ÑbinV)
rkeep = rkeepY * rkeepU * rkeepV
return h̃ist, bstartY, bstartU, bstartV, rkeep









A.2 Probability Adaptation

With the generated histograms as previously disclosed, the probability of face for each bin can be defined. Generally, if a color has higher value in a histogram of face, it is more likely to be part of the face. Therefore, the initial probability of face can be estimated directly from the generic histograms of face and all pixels. However, because the histogram of face is estimated from the basic shape map, which is just an initial guess of face region, further refining of the initial probability by adapting it to the histograms locally in YUV color space may be needed. As such, iterative adaptive sorting and probability propagation based on the individual histograms of each face and the generic histogram of non-face may be implemented. Details of initial probability estimation, adaptive sorting, and probability propagation are presented through the exemplary diagrams of FIGS. 6-8 which will be described in the following sections.


A.2.1 Initial Probability


FIG. 6 shows an exemplary diagram of calculating the initial probability of face according to an embodiment of the present disclosure. First, the ratio between the histogram of face (62) and generic histogram of all pixels (61) is calculated as follows:










rface = 𝒢σhist(histface) ./ 𝒢σhist(histall)    (5)







where 𝒢σhist denotes 3-D Gaussian filtering (63) with standard deviation σhist, and operator ./ is element-wise division (64). To avoid dividing by zero, rface(b) may be set to 0 for each bin b where 𝒢σhist(histall) is 0. The purpose of the Gaussian filtering is to reduce the noise in the histograms. The standard deviation σhist may be set to, for example, σhist = 0.25 (in bins). Scaling and thresholding (65) is then applied on the ratio to get the initial probability of face (66): the larger the ratio, the larger the probability. For each bin b, the following applies:











pface,init(b) = clip3( (rface(b) − r0)/(r1 − r0), 0, 1 )    (6)







where r0 and r1 are thresholds on the ratio of histograms. From the above equation, it can be noticed that when rface < r0, pface,init = 0; on the other hand, when rface > r1, pface,init = 1. Thresholds r0 and r1 may be set, for example, to r0 = 0.1 and r1 = 0.5. Moreover, the histogram of non-face (68) may be defined as the difference (67) between the histograms: histnonface = histall − histface. As will be seen later, the histogram of non-face (68) will be used in the adaptive sorting process, which will be detailed in the next section.
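For illustration, a minimal sketch of Equations (5) and (6) and of the non-face histogram, assuming the 3-D YUV histograms are stored as numpy arrays and using scipy's N-D Gaussian filter in place of 𝒢σhist, could read:

import numpy as np
from scipy.ndimage import gaussian_filter

def initial_probability(hist_face, hist_all, sigma_hist=0.25, r0=0.1, r1=0.5):
    g_face = gaussian_filter(hist_face, sigma=sigma_hist)
    g_all = gaussian_filter(hist_all, sigma=sigma_hist)
    # r_face = G(hist_face) ./ G(hist_all), set to 0 where G(hist_all) = 0  -- Eq. (5)
    r_face = np.where(g_all > 0, g_face / np.maximum(g_all, 1e-12), 0.0)
    # p_face,init = clip3((r_face - r0)/(r1 - r0), 0, 1)                    -- Eq. (6)
    p_face_init = np.clip((r_face - r0) / (r1 - r0), 0.0, 1.0)
    hist_nonface = hist_all - hist_face   # used later by the adaptive sorting
    return p_face_init, hist_nonface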


A.2.2 Adaptive Sorting


FIG. 7 shows an exemplary diagram of the adaptive sorting (700), described in this section, and the probability propagation (701) process, described in the next section. Most parts of the basic shape map (33) of FIG. 3 are assumed to be correct, and only minor adjustment may be required. More specifically, it is assumed that at least a θnonface portion of the pixels counted in histnonface are truly non-face, and that at least a θface portion of the pixels counted in histface,k are truly face for each k. As such, the probability of face is first initialized to the initial probability of face, pface ← pface,init. Then, the probabilities of the bins with the lowest probability are updated to 0 until the cumulative pixel count reaches θnonface of the total pixel count of the histogram. In other words, the updated probability from non-face (74), pface(nf), is obtained as follows:











pface(nf)(b) = { 0, if b ∈ ℬ(nf); pface(b), otherwise }    (7)







where ℬ(nf) is the set of bins whose probabilities are to be updated to 0. In other words:












ℬ(nf) = arg minℬ |ℬ|  s.t.  pface(b) ≤ pface(b′) ∀ b ∈ ℬ, b′ ∉ ℬ,  and  Σb∈ℬ histnonface(b) ≥ θnonface · Σb=0..NbinY·NbinU·NbinV−1 histnonface(b)    (8)
)








where ℬ is a candidate set of bins with the lowest probability. The above-disclosed method is illustrated in FIGS. 8A-8D for the case of a 1-D histogram. Given the probability of face (81), pface, and the histogram of non-face (82), histnonface, the probability of the lowest-probability bins ℬ(nf) is updated until the sum of the pixel counts for those bins reaches θnonface of the total pixel count in the histogram. As a result, the histogram of truly non-face (84) and the updated probability from non-face (83), pface(nf), are obtained.


Referring back to FIG. 7, similarly to what was disclosed with regards to updated probability from non-face (74), pface(nf), the probabilities of bins with the highest probability are updated to 1 until the cumulative pixel count reaches θface of the total pixel count of the histogram for each face. In other words, the updated probability from each face (73) is obtained as:











pface,k(f)(b) = { 1, if b ∈ ℬk(f); pface(b), otherwise }    (9)







where ℬk(f) is the set of bins whose probabilities are to be updated to 1:











ℬk(f) = arg minℬ |ℬ|  s.t.  pface(b) ≥ pface(b′) ∀ b ∈ ℬ, b′ ∉ ℬ,  and  Σb∈ℬ histface,k(b) ≥ θface · Σb=0..NbinY·NbinU·NbinV−1 histface,k(b)    (10)








The updated probability from all faces (75) can be acquired by considering the updates from all faces:











pface(f)(b) = maxk pface,k(f)(b)    (11)







In practice, only the trimmed histograms h̃istface,k may be available, and each trimmed histogram keeps only an rkeep,k portion of the pixel counts in histface,k. Therefore, the cumulative pixel count may need to reach θface/rkeep,k of the sum of h̃istface,k instead. Moreover, when θface/rkeep,k > 1, the probabilities of all bins in the trimmed histogram may be set to 1. The values for parameters θnonface and θface may be decided empirically; as an example, θnonface = 0.9 and θface = 0.75.


The pseudocode below shows an example of how the probability from non-face can be calculated:














Update the probability from non-face

Input: initial probability of face pface,init, histogram of non-face histnonface
Output: updated probability from non-face pface(nf)

// Initialize
pface(nf) = pface,init
// Sort
Ia = sort_index(pface,init, 'ascend') // get sort index in ascending order
// Cumulative sum
Cnonface = zeros(NbinY * NbinU * NbinV)
Cnonface(0) = histnonface(Ia(0))
for (i = 1; i < NbinY * NbinU * NbinV; i++) {
 Cnonface(i) = Cnonface(i − 1) + histnonface(Ia(i))
}
// Update bins with the lowest probability until θnonface of the total count is reached
for (i = 0; i < NbinY * NbinU * NbinV; i++) {
 pface(nf)(Ia(i)) = 0
 if (Cnonface(i) ≥ θnonface * Cnonface(NbinY * NbinU * NbinV − 1)) {
  break
 }
}
return pface(nf)









The pseudocode below shows an example of how the probability from face can be calculated:














Update the probability from face

Input: initial probability of face pface,init, trimmed histograms h̃istface,k, starting bins bstart,kY, bstart,kU, bstart,kV, keeping ratio rkeep,k, k = 0, ..., Nface − 1
Output: updated probability from face pface(f)

// Initialize
pface(f) = pface,init
for (k = 0; k < Nface; k++) {
 // Skip faces with empty trimmed histograms
 if (rkeep,k == 0) {
  continue
 }
 // Sort subarray
 p̃face,k = pface,init(bstart,kY : bstart,kY + ÑbinY, bstart,kU : bstart,kU + ÑbinU, bstart,kV : bstart,kV + ÑbinV)
 Ĩd,k = sort_index(p̃face,k, 'descend') // get sort index in descending order
 // Cumulative sum
 C̃face,k = zeros(ÑbinY * ÑbinU * ÑbinV)
 C̃face,k(0) = h̃istface,k(Ĩd,k(0))
 for (i = 1; i < ÑbinY * ÑbinU * ÑbinV; i++) {
  C̃face,k(i) = C̃face,k(i − 1) + h̃istface,k(Ĩd,k(i))
 }
 // Update bins with the highest probability until θface/rkeep,k of the total count is reached
 Id,k = untrim_index(Ĩd,k) // convert index from subarray to complete array
 for (i = 0; i < ÑbinY * ÑbinU * ÑbinV; i++) {
  pface(f)(Id,k(i)) = 1
  if (C̃face,k(i) ≥ (θface/rkeep,k) * C̃face,k(ÑbinY * ÑbinU * ÑbinV − 1)) {
   break
  }
 }
}
return pface(f)









A.2.3 Probability Propagation

With further reference to FIG. 7, because a bin may appear in both face and non-face regions, the updates from non-face and face are performed separately and then combined. Given the updated probability from non-face (74), pface(nf), and the updated probability from face (75), pface(f), the updated probability (77), pface′, is the weighted sum of these two updated probabilities based on histogram counts, as shown below:










pface′ = ( pface(f) .* histface + pface(nf) .* histnonface ) ./ histall    (12)







To avoid division by zero, pface′ may be set to 0 at the bins where histall is 0. Moreover, because the probability is updated based on the sort index, it may undergo sharp changes between neighboring bins. As such, Gaussian filtering (78) may be performed over the 3-D bins to smooth the probability of face (79), pface, and avoid potential artifacts in later stages of processing. The standard deviation of the Gaussian filter, σprop, may be set, for example, to σprop = 0.25.


With continued reference to FIG. 7, in accordance with the teachings of the present disclosure, the adaptive sorting (700) and probability propagation (701) may be performed for nprobada iterations to gradually adapt the probability to the histograms locally in YUV color space. The number of iterations nprobada may be set, for example, to nprobada = 3.
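As an illustration of one iteration, Equation (12) followed by the Gaussian propagation might be sketched as follows, assuming the adaptive-sorting updates pface(f) and pface(nf) have already been computed:

import numpy as np
from scipy.ndimage import gaussian_filter

def propagate(p_face_f, p_face_nf, hist_face, hist_nonface, hist_all, sigma_prop=0.25):
    # Weighted combination of the two updates, Eq. (12); set to 0 where hist_all is 0.
    num = p_face_f * hist_face + p_face_nf * hist_nonface
    p_face = np.where(hist_all > 0, num / np.maximum(hist_all, 1e-12), 0.0)
    # Smooth over the 3-D YUV bins to avoid sharp changes between neighbor bins.
    return gaussian_filter(p_face, sigma=sigma_prop)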


A.3 Local Post-Processing

With reference to FIG. 7, the probability of face was refined in YUV color space, but the spatial relationships between pixels were not considered. According to embodiments of the present disclosure, the probability of face can be further refined in the spatial domain. FIG. 9 is an exemplary diagram showing the details of the local post-processing step (240) as was disclosed with regards to the embodiment of FIG. 2. As shown, such post-processing step comprises a local smoothing (900) to avoid visual artifacts and a soft morphological operation (901) to remove small noise.


A.3.1 Local Smoothing

With further reference to FIG. 9, a combination of input image (91) and the probability of face (90), pface, is first used to obtain the initial probability map (92) of face Mface,init. The following pseudocode is an example of how the probability map (92) for YUV420 input can be acquired:



















Acquire probability map for YUV420 input

Input: YUV channels of input image SY, ShalfU and ShalfV, probability of face pface, probability map of ROI MROI
Output: probability map of face Mface,init

// Initialize
for (i = 0; i < H; i++) {
 for (j = 0; j < W; j++) {
  Mface,init(i, j) = 0
 }
}
// Find probability for all pixels
for (i = 0; i < H; i++) {
 for (j = 0; j < W; j++) {
  ihalf = └i/2┘ // convert index for YUV420 input
  jhalf = └j/2┘ // convert index for YUV420 input
  bY = └SY(i, j)/wbinY┘
  bU = └ShalfU(ihalf, jhalf)/wbinU┘
  bV = └ShalfV(ihalf, jhalf)/wbinV┘
  Mface,init(i, j) = pface(bY, bU, bV)
 }
}
// Apply ROI
Mface,init ← Mface,init .* MROI
return Mface,init










Referring back to FIG. 9, because the probability of face (90) is quantized into bins, if there are very few bins, the probability between bins may be interpolated for each pixel. However, because the probability of face (90) does not yet contain spatial information, there could be sharp changes between neighboring pixels in the initial probability map (92). If this occurs in a smooth region of the input image (91), there will be false edges and banding-like artifacts in the following local reshaping operation. To make the probability map smooth in the regions where the input image is smooth, guided image filtering (93), as described in [Ref. 1], incorporated herein by reference in its entirety, may be implemented using the probability map (92) as the input and the normalized Y-channel of the input image S̃Y as the guidance. The implementation details can be found, e.g., in the above-mentioned U.S. Prov. App. Ser. No. 63/086,699, incorporated herein by reference in its entirety. As a result of the guided image filtering (93), a smooth map (94) is obtained. Exemplary parameter values for the guided image filter (93) are a degree of smoothness of 0.01 and a kernel size of 51 for a normalized input image (91) in the range of [0,1] and of size 1920×1080. For images with different sizes, the kernel size may be scaled proportionally to the image size:










Mface = MROI .* clip3( GIF(Mface,init, S̃Y), 0, 1 )    (14)







The output of the guided image filter (93) may be clipped to [0,1] because the guided image filter (93) is based on ridge regression and may create noise due to outliers. Also, the probability map of ROI may be applied so that the face region stays inside the ROI, i.e. Mface(i) ≤ MROI(i) ∀i.
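As an illustrative sketch of Equation (14), the guided filter can be realized with box filters following [Ref. 1]; the scipy-based helpers below are assumptions of this sketch, not the reference implementation (a kernel size of 51 corresponds to a radius of 25):

import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(I, p, radius=25, eps=0.01):
    # I: guidance (normalized Y channel), p: filter input, both 2-D float arrays.
    size = 2 * radius + 1
    mean_I, mean_p = uniform_filter(I, size), uniform_filter(p, size)
    var_I = uniform_filter(I * I, size) - mean_I * mean_I
    cov_Ip = uniform_filter(I * p, size) - mean_I * mean_p
    a = cov_Ip / (var_I + eps)      # ridge-regression slope per local window
    b = mean_p - a * mean_I
    return uniform_filter(a, size) * I + uniform_filter(b, size)

def smooth_face_map(M_face_init, S_Y_norm, M_ROI):
    # M_face = M_ROI .* clip3(GIF(M_face,init, S~Y), 0, 1)   -- Eq. (14)
    return M_ROI * np.clip(guided_filter(S_Y_norm, M_face_init), 0.0, 1.0)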


A.3.2 Soft Morphological Operation

Referring back to FIG. 9, as face regions usually are continuous and have smooth boundaries, removing the small noise in the probability map (92) may be required. The small noise can be small holes or small unconnected dots in the probability map (92). Traditionally, such noise can be removed through morphological operations such as closing and opening. However, such operations also alter the boundary of the face region, which is undesirable in some applications. In accordance with the teachings of the present disclosure, a soft morphological operation (901) can be used to remove this kind of small noise.


The soft morphological operation (901) of FIG. 9 essentially means weighting the importance of each pixel by its surroundings. Given the input probability map (92), Mface, the soft morphological operation (901) is defined as:











ℳ(Mface) = clip3( amorph · Mface .* 𝒢σmorph(Mface), 0, 1 )    (15)







Parameters controlling the soft morphological operation (901) include σmorph, the standard deviation for the Gaussian filtering (95), and amorph, the scaling factor that decides whether or not to expand the face region. Operator .* is element-wise multiplication. From the above definition, it can be seen that each pixel is multiplied by the weighted average of its surrounding pixels, 𝒢σmorph(Mface). As part of the scaling and thresholding (97) step, for a pixel at which Mface > 0, if 𝒢σmorph(Mface) > 1/amorph, the pixel value will be amplified after the operation; on the other hand, if 𝒢σmorph(Mface) < 1/amorph, the pixel value will be decreased. In other words, a pixel is preserved only if its surroundings have high values. Additionally, the operation may be repeated for nsoftmorph iterations to gradually refine the probability map (92), as shown in the following:










Mface ← MROI .* ℳ^nsoftmorph(Mface)    (16)







where ℳ^nsoftmorph(.) means repeating ℳ(.) nsoftmorph times. Also, the probability map of ROI may be applied so that the face region stays inside the ROI, i.e. Mface(i) ≤ MROI(i) ∀i. Parameters σmorph, amorph, and nsoftmorph may be set as, for example, σmorph = 25, amorph = 3 and nsoftmorph = 2.
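A compact sketch of Equations (15) and (16), with the example parameter values given above, could read as follows; the function name is an assumption of this sketch:

import numpy as np
from scipy.ndimage import gaussian_filter

def soft_morph(M_face, M_ROI, sigma_morph=25, a_morph=3, n_softmorph=2):
    for _ in range(n_softmorph):
        surround = gaussian_filter(M_face, sigma=sigma_morph)    # G_sigma_morph(M_face)
        M_face = np.clip(a_morph * M_face * surround, 0.0, 1.0)  # Eq. (15)
    return M_ROI * M_face                                        # Eq. (16)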


B. Local Reshaping with Face Adjustment


When local reshaping is performed, different reshaping functions can be applied on different pixels locally. The reshaping functions can control and enhance image properties such as contrast, saturation, or other visual features, see e.g. the above-mentioned U.S. Prov. App. Ser. No. 63/086,699, incorporated herein by reference in its entirety. For most image contents, higher contrast and saturation bring a better viewing experience. However, for faces in images, higher contrast and saturation are not always better. Viewers may not want details, such as wrinkles or spots, on faces to be enhanced. Moreover, less saturated faces may be preferred over faces with over-saturated skin color, which look unnatural, i.e. with a changed skin tone. Local reshaping with face adjustment in accordance with the teachings of the present disclosure can be applied to address such problems. With reference to FIG. 9, after the face probability map (98) is acquired as disclosed previously, different reshaping functions can be applied to face regions than to other, non-face regions in images to adjust the contrast and saturation.



FIG. 10 shows an exemplary diagram of the local reshaping (110) according to an embodiment of the present disclosure. Based on the face probability map (102), the amount of contrast adjustment (103) for each pixel in the input image (101) is decided. In addition, the amount of saturation adjustment (104) for each pixel in the input image (101) is also decided. The adjustments for contrast and saturation are then applied to reshaping function (105) selection. The reshaping operation (106) is performed based on the selected reshaping function (105) to generate a reshaped image (107). In what follows, details of the elements of local reshaping (110) will be described.


B.1. Local Reshaping Function Selection

With further reference to FIG. 10, the local reshaping method (110) may be based on local reshaping function selection as detailed in the above-mentioned U.S. Prov. App. Ser. No. 63/086,699, incorporated herein by reference in its entirety. In other words, for each pixel, an individual reshaping function (105) selected from a family of reshaping functions is applied in the reshaping operation (106) for each channel. Denoting the input image S with YUV channels SY, SU and SV, and the reshaped image V with YUV channels VY, VU and VV, the local reshaping operation for the i-th pixel can be defined as:










viY = BY[LiY](siY)    (17)

viU = MMRU[LiU](siY, siU, siV)    (18)

viV = MMRV[LiV](siY, siU, siV)    (19)







where siY, siU, siV, viY, viU, and viV are the i-th pixels in SY, SU, SV, VY, VU and VV, respectively. BY, MMRU, and MMRV are the families of reshaping functions for the Y, U, and V channels, respectively, and LiY, LiU, and LiV are the corresponding indices of the selected reshaping functions for the i-th pixel. For simplicity, the indices for all pixels are denoted as index maps LY, LU and LV. Therefore, given an input image and corresponding index maps, the local reshaping operation for each pixel can be performed accordingly.


With carefully designed families of reshaping functions, the brightness, contrast, saturation, or other visual features in the reshaped images can be changed by adjusting the index maps. For example, as described, e.g. in the above-mentioned U.S. Prov. App. Ser. 63/086,699 incorporated herein by reference in its entirety, the local detail and contrast enhancement can be achieved by using:










LY = LU = LV = L(g) + fSL( α .* (S̃Y − S̃Y,(l)) )    (20)







Or equivalently










LY = LU = LV = L(g) + fSL( ΔL(l) )    (21)

and

ΔL(l) = α .* (S̃Y − S̃Y,(l))






where S̃Y is the Y channel of the normalized input image in the range of, for example, [0,1], and S̃Y,(l) is the corresponding edge-preserving filtered image. α is the map of enhancement strength for each pixel: the larger the α, the stronger the enhancement. ƒSL(.) is a pixelwise non-linear function that further adjusts the enhancement based on pixel brightness. L(g) is a constant global index for the whole image, which controls the overall look, such as brightness and saturation, of the reshaped images. Moreover, when α = 0, all pixels use the same reshaping function; this is called global reshaping, which means no local contrast and detail enhancement. As an example, 4096 reshaping functions in the family of reshaping functions can be considered for each channel. The parameter α may use a default setting such as α = 3.8*c1 for all pixels, where c1 is the model parameter and can be set as, for example, c1 = 2687.1.
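For illustration only, the index-map construction of Equations (20)-(21) might be sketched as below; the Gaussian filter standing in for the edge-preserving filter and the identity default for ƒSL are assumptions of this sketch, not the reference implementation:

import numpy as np
from scipy.ndimage import gaussian_filter

def enhancement_index_map(S_Y_norm, L_g, alpha, f_SL=lambda d: d):
    # Stand-in for the edge-preserving filtered image S~Y,(l).
    S_Y_l = gaussian_filter(S_Y_norm, sigma=3.0)
    delta_L = alpha * (S_Y_norm - S_Y_l)   # Delta L^(l), the local enhancement term
    return L_g + f_SL(delta_L)             # Eq. (21); used for L^Y, L^U and L^V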


With continued reference to FIG. 10, and in view of what was disclosed above, given the face probability map (102), the look of the faces in the reshaped images (107) can be changed by adjusting the indices of the face region in the index maps. In the following sections, the face contrast adjustment (103) and saturation adjustment (104) will be described in more detail.


B.2 Face Contrast Adjustment

In some applications the enhancement the details, such as wrinkles or spots, on faces like other image contents may not be desired. As such, there may be a need to reduce the enhancement strength in face region when performing detail and contrast enhancement. The adjusted index map LY may be defined as:










LY = L(g) + fSL( ΔL(l) + ΔLface,c )    (22)

where

ΔLface,c = −rface · Mface .* α .* (S̃Y − S̃Y,(l))






where rface is the face contrast reduction ratio. It can be seen that, for pixel i, if Mface(i) = 1, ΔLface,c(i) becomes −rface*α(i)*(S̃Y(i) − S̃Y,(l)(i)), and the term ΔL(l)(i) + ΔLface,c(i) in Equation (22) can be written as (1 − rface)*α(i)*(S̃Y(i) − S̃Y,(l)(i)). By comparing with Equations (20) and (21), the enhancement strength drops from α(i) to (1 − rface)*α(i). Therefore, ΔLface,c reduces the contrast on faces for 0 < rface ≤ 1. When rface = 0, there is no adjustment; when rface = 1, the enhancement strength on the face becomes 0. Empirically, if the enhancement strength on a face is 0, the face may look over-smoothed compared to the surrounding image contents, which are enhanced at the original strength. As an example, rface may be set as rface = 0.5.
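A short sketch of Equation (22), under the same stand-in assumptions as the previous sketch, could read:

import numpy as np

def contrast_adjusted_index(L_g, alpha, S_Y_norm, S_Y_l, M_face,
                            r_face=0.5, f_SL=lambda d: d):
    delta_L = alpha * (S_Y_norm - S_Y_l)       # Delta L^(l)
    delta_face_c = -r_face * M_face * delta_L  # Delta L_face,c
    # Inside the face (M_face = 1) the strength drops to (1 - r_face)*alpha.
    return L_g + f_SL(delta_L + delta_face_c)  # Eq. (22), the adjusted L^Y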


B.3 Face Saturation Adjustment

In general, increasing the color saturation in images improves the viewing experience. However, when it comes to faces in images, increasing the color saturation in the same way as for other image contents may be undesired: over-saturated skin color makes faces look unnatural or unhealthy. With reference to FIG. 10, the disclosed face saturation adjustment (104) addresses this problem.


As described in U.S. Prov. App. Ser. 63/086,699 incorporated herein by reference in its entirety, in general, the smaller the index of a reshaping function, the less saturated the reshaped image. In addition, the darker the input pixel, the more sensitive the reshaped pixel to the index.


In view of the above, based on the LY acquired as disclosed in the previous section, the adjusted index maps LU and LV can be further defined as:










LU = LV = LY + ΔLface,s    (23)

where

ΔLface,s = −dface · Mface .* min( S̃Y/θsat , 1 )






In Equation (23), dface is the face desaturation offset and θsat is the threshold that controls the desaturation. Therefore, ΔLface,s reduces the saturation on the face when dface > 0 and θsat > 0. The larger the dface, the stronger the desaturation; when dface = 0, there is no desaturation. Empirically, parameters dface and θsat may be set as, for example, dface = 1024 and θsat = 0.5.
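Similarly, Equation (23) might be sketched as follows, under the same assumptions:

import numpy as np

def saturation_adjusted_indices(L_Y, M_face, S_Y_norm, d_face=1024, theta_sat=0.5):
    # Darker face pixels (small S~Y) are desaturated less aggressively.
    delta_face_s = -d_face * M_face * np.minimum(S_Y_norm / theta_sat, 1.0)
    L_UV = L_Y + delta_face_s   # Eq. (23): the common chroma index map L^U = L^V
    return L_UV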


A number of embodiments of the disclosure have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the present disclosure. Accordingly, the invention may be embodied in any of the forms described herein, including, but not limited to the following Enumerated Example Embodiments (EEEs) which described structure, features, and functionality of some portions of the present invention:

    • EEE1: A method of face region detection in an input image including one or more faces, the method comprising: providing face bounding boxes and confidence levels for each face of the one or more faces; based on the input image, generating a histogram of all pixels; based on the input image and the face bounding boxes, generating histograms of the one or more faces; based on the histogram of all pixels and the histograms of the one or more faces, generating a probability of face, and based on the probability of face, generating a face probability map.
    • EEE2: The method of EEE1, wherein the generating the histograms of the one or more faces comprises, based on a combination of face bounding boxes with a basic face shape, generating a basic face shape map, and based on the input image and the basic face shape map, generating the histograms of the one or more faces.
    • EEE3: The method of any of EEEs 1 and 2, wherein the generating of the probability of face comprises: filtering the histogram of all pixels to generate a filtered histogram of all pixels, and filtering the histograms of the one or more faces to generate filtered histograms of the one or more faces.
    • EEE4: The method of EEE3, wherein the generating of the probability of face further comprises scaling and thresholding a combination of the filtered histogram of all pixels and the filtered histograms of the one or more faces to generate an initial probability of face.
    • EEE5: The method of EEE4, wherein the initial probability of face comprises an initial probability of face in YUV channel.
    • EEE6: The method of any of EEEs 4 and 5, wherein the generating of the probability of face further comprises subtracting the generated histograms of the one or more faces from the generated histogram of all pixels to generate a histogram of non-face.
    • EEE7: The method of EEE6, wherein generating of the probability of face further comprises, based on the initial probability of face and the histogram of non-face, generating an updated probability of non-face, and based on the initial probability of face and the histograms of the one or more faces, generating an updated probability of face.
    • EEE8: The method of EEE7, wherein the generating of the probability of face further comprises combining the updated probability from non-face and the updated probability from face to generate an updated probability, and filtering the updated probability to generate the probability of face.
    • EEE9: The method of EEE8, wherein the filtering is performed using a Gaussian filter.
    • EEE10: The method of any of EEEs 1-9, further comprising, after generating the probability of face and before generating the face probability map, local smoothing the probability of face to generate a smoothened probability of face, and applying a soft morphological operation to the smoothened probability of face to generate the face probability map.
    • EEE11: The method of EEE8, further comprising, after generating the probability of face and before generating the face probability map, local smoothing the probability of face to generate a smoothened probability of face, and applying a soft morphological operation to the smoothened probability of face to generate the face probability map.
    • EEE12: The method of any of EEEs 10 and 11, further comprising applying local reshaping by: applying face saturation adjustment and face contrast adjustment to the face probability map to generate an adjusted face probability map; and generating a reshaped image based on the adjusted face probability map and one or more selected reshaping functions.
    • EEE13: The method of any of EEEs 1-12, further comprising trimming the histograms of the one or more faces to reduce a memory space required to store the histograms of the one or more faces.
    • EEE14: The method of any of EEEs 3-9, wherein: the filtering the histogram of all pixels is performed using a Gaussian filter, and the filtering the histograms of the one or more faces is performed using a Gaussian filter.
    • EEE15: The method of any of EEEs 4-9, wherein the combination of the filtered histogram of all pixels and filtered histograms of the one or more faces comprises a ratio of the filtered histograms of the one or more faces to the filtered histogram of all pixels.
    • EEE16: The method of EEE8, wherein the combining the updated probability from non-face and the updated probability from face comprises generating a weighted sum of the updated probability from non-face and the updated probability from face.
    • EEE17: The method of EEE12, wherein the applying the face contrast adjustment is performed by adjusting a contrast of the one or more faces based on a face contrast reduction ratio.
    • EEE18: The method of EEE12, wherein the applying the face saturation adjustment is performed by adjusting a saturation of the one or more faces based on a face desaturation offset and a face desaturation threshold.
    • EEE19: A video decoder comprising hardware, software, or both configured to carry out the method of any one of EEEs 1-18.
    • EEE20: A non-transitory computer-readable medium containing program instructions for causing a computer to perform the method of any of EEEs 1-18.
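As a non-limiting illustration of EEE1 and EEE3-EEE4 (with the Gaussian filtering of EEE14 and the histogram ratio of EEE15), the following is a minimal sketch in Python with NumPy and SciPy. All function and parameter names are hypothetical, and the single-channel quantization is a simplifying assumption relative to the YUV formulation of EEE5.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def probability_of_face(bin_idx, face_shape_map, n_bins=256,
                        sigma=2.0, scale=2.0):
    """Sketch of the histogram-based probability of face (EEE1, EEE3-EEE4).

    bin_idx        -- integer histogram bin index per pixel (H x W),
                      e.g. a quantized luma channel
    face_shape_map -- basic face shape map in [0, 1] (H x W), cf. EEE2
    """
    # Histogram of all pixels, and histograms of the one or more faces
    # (pixels weighted by the basic face shape map).
    hist_all = np.bincount(bin_idx.ravel(), minlength=n_bins).astype(float)
    hist_face = np.bincount(bin_idx.ravel(),
                            weights=face_shape_map.ravel(),
                            minlength=n_bins)
    # Filter both histograms with a Gaussian filter (EEE3, EEE14).
    hist_all_f = gaussian_filter(hist_all, sigma)
    hist_face_f = gaussian_filter(hist_face, sigma)
    # Combine as a ratio (EEE15), then scale and threshold to [0, 1]
    # to obtain an initial per-bin probability of face (EEE4).
    ratio = hist_face_f / np.maximum(hist_all_f, 1e-6)
    p_face_per_bin = np.minimum(scale * ratio, 1.0)
    # Look up each pixel's bin to form a per-pixel face probability map.
    return p_face_per_bin[bin_idx]
```

In this sketch, bin_idx might be obtained by quantizing a single channel (e.g. right-shifting an 8-bit luma plane to n_bins levels); the YUV formulation of EEE5 would repeat the computation per channel and combine the results.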


The present disclosure is directed to certain implementations for the purposes of describing some innovative aspects described herein, as well as examples of contexts in which these innovative aspects may be implemented. However, the teachings herein can be applied in various different ways. Moreover, the described embodiments may be implemented in a variety of hardware, software, firmware, etc. For example, aspects of the present application may be embodied, at least in part, in an apparatus, a system that includes more than one device, a method, a computer program product, etc. Accordingly, aspects of the present application may take the form of a hardware embodiment, a software embodiment (including firmware, resident software, microcodes, etc.) and/or an embodiment combining both software and hardware aspects. Such embodiments may be referred to herein as a “circuit,” a “module”, a “device”, an “apparatus” or “engine.” Some aspects of the present application may take the form of a computer program product embodied in one or more non-transitory media having computer readable program code embodied thereon. Such non-transitory media may, for example, include a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. Accordingly, the teachings of this disclosure are not intended to be limited to the implementations shown in the figures and/or described herein, but instead have wide applicability.


The examples set forth above are provided to those of ordinary skill in the art as a complete disclosure and description of how to make and use the embodiments of the disclosure, and are not intended to limit the scope of what the inventor/inventors regard as their disclosure.


Modifications of the above-described modes for carrying out the methods and systems herein disclosed that are obvious to persons of skill in the art are intended to be within the scope of the following claims. All patents and publications mentioned in the specification are indicative of the levels of skill of those skilled in the art to which the disclosure pertains. All references cited in this disclosure are incorporated by reference to the same extent as if each reference had been incorporated by reference in its entirety individually.


It is to be understood that the disclosure is not limited to particular methods or systems, which can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting. As used in this specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the content clearly dictates otherwise. The term “plurality” includes two or more referents unless the content clearly dictates otherwise. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the disclosure pertains.




Claims
  • 1. A method of performing local reshaping on an input image including one or more faces, the method comprising: generating a histogram of all pixels in the input image; based on a combination of face bounding boxes for the one or more faces with a basic face shape model, generating a basic face shape map comprising a pixel mapping of basic face shapes for the one or more faces; based on the input image and the basic face shape map, generating histograms of the one or more faces; based on the histograms of the one or more faces, generating for each bin of the histogram of all pixels a probability of face comprising a probability of a pixel being in a face; based on the probability of face, generating a face probability map comprising a pixel mapping of the input image to the probabilities of each pixel individually being part of a face; and generating a reshaped image from the input image based on the face probability map and one or more selected reshaping functions.
  • 2. The method of claim 1, wherein the basic face shape model comprises an inscribed ellipse of a bounding box (a non-limiting sketch of this inscribed-ellipse model is given after the claims).
  • 3. The method of claim 1, wherein the generating of the probability of face comprises: filtering the histogram of all pixels to generate a filtered histogram of all pixels, and filtering the histograms of the one or more faces to generate filtered histograms of the one or more faces.
  • 4. The method of claim 3, wherein the generating of the probability of face further comprises: scaling and thresholding a combination of the filtered histogram of all pixels and the filtered histograms of the one or more faces to generate an initial probability of face.
  • 5. The method of claim 4, wherein the initial probability of face comprises an initial probability of face in YUV channel.
  • 6. The method of claim 1, wherein the generating of the probability of face further comprises: subtracting the generated histograms of the one or more faces from the generated histogram of all pixels to generate a histogram of non-face.
  • 7. The method of claim 6, wherein the generating of the probability of face further comprises: based on the initial probability of face and the histogram of non-face, generating an updated probability of non-face; and based on the initial probability of face and the histograms of the one or more faces, generating an updated probability of face.
  • 8. The method of claim 7, wherein the generating of the probability of face further comprises: combining the updated probability from non-face and the updated probability from face to generate an updated probability, and filtering the updated probability to generate the probability of face.
  • 9. The method of claim 8, wherein the filtering is performed using a Gaussian filter.
  • 10. The method of claim 1, further comprising: after generating the probability of face and before generating the face probability map, local smoothing the probability of face to generate a smoothened probability of face, and applying a soft morphological operation to the smoothened probability of face to generate the face probability map.
  • 11. The method of claim 8, further comprising: after generating the probability of face and before generating the face probability map, local smoothing the probability of face to generate a smoothened probability of face, and applying a soft morphological operation to the smoothened probability of face to generate the face probability map.
  • 12. The method of claim 10, further comprising applying local reshaping by: applying face saturation adjustment and face contrast adjustment to the face probability map to generate an adjusted face probability map; and generating a reshaped image based on the adjusted face probability map and one or more selected reshaping functions.
  • 13. The method of claim 1, further comprising: trimming the histograms of the one or more faces to reduce a memory space required to store the histograms of the one or more faces.
  • 14. The method of claim 3, wherein: the filtering the histogram of all pixels is performed using a Gaussian filter, and the filtering the histograms of the one or more faces is performed using a Gaussian filter.
  • 15. The method of claim 4, wherein the combination of the filtered histogram of all pixels and filtered histograms of the one or more faces comprises a ratio of the filtered histograms of the one or more faces to the filtered histogram of all pixels.
  • 16. The method of claim 8, wherein the combining the updated probability from non-face and the updated probability from face comprises: generating a weighted sum of the updated probability from non-face and the updated probability from face.
  • 17. The method of claim 12, wherein the applying the face contrast adjustment is performed by adjusting a contrast of the one or more faces based on a face contrast reduction ratio.
  • 18. The method of claim 12, wherein the applying the face saturation adjustment is performed by adjusting a saturation of the one or more faces based on a face desaturation offset and a face desaturation threshold.
  • 19. A video decoder comprising hardware, software, or both configured to carry out the method of claim 1.
  • 20. (canceled)
  • 21. A non-transitory computer-readable storage medium having stored thereon computer-executable instructions for executing a method with one or more processors in accordance with claim 1.
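For concreteness, the following is a minimal NumPy sketch of the inscribed-ellipse basic face shape model recited in claim 2, producing a basic face shape map as in claim 1. The confidence weighting and all names are illustrative assumptions rather than a definitive implementation.

```python
import numpy as np

def basic_face_shape_map(height, width, boxes, confidences):
    """Sketch of the basic face shape map: each face's basic shape is
    the inscribed ellipse of its bounding box (claim 2), weighted by
    the detector's confidence level.

    boxes       -- list of (x0, y0, x1, y1) face bounding boxes
    confidences -- list of detector confidence levels in [0, 1]
    """
    shape_map = np.zeros((height, width), dtype=float)
    yy, xx = np.mgrid[0:height, 0:width]
    for (x0, y0, x1, y1), conf in zip(boxes, confidences):
        cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0  # ellipse center
        ax, ay = (x1 - x0) / 2.0, (y1 - y0) / 2.0  # semi-axes
        # Pixels inside the inscribed ellipse of the bounding box.
        inside = ((xx - cx) / ax) ** 2 + ((yy - cy) / ay) ** 2 <= 1.0
        # Overlapping faces keep the highest confidence value.
        shape_map = np.maximum(shape_map, conf * inside)
    return shape_map
```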
Priority Claims (1)
Number Date Country Kind
21188517.3 Jul 2021 EP regional
CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of priority from U.S. Provisional patent application Ser. No. 63/226,938, filed on 29 Jul. 2021, and EP application Ser. No. 21/188,517.3, filed on 29 Jul. 2021, which are hereby incorporated by reference.

PCT Information
Filing Document Filing Date Country Kind
PCT/US2022/038249 7/25/2022 WO
Provisional Applications (1)
Number Date Country
63226938 Jul 2021 US