Disclosed are embodiments related to video compression and filtering.
As has previously been identified, bilateral filtering of image data directly after forming the reconstructed image block can be beneficial for video compression. As described by Wennersten et al., it is possible to reach a bit rate reduction of 0.5% with maintained visual quality for a complexity increase of 3% (encode) and 0% (decode) for random access. See P. Wennersten, J. Ström, Y. Wang, K. Andersson, R. Sjöberg, J. Enhorn, "Bilateral Filtering for Video Coding", IEEE Visual Communications and Image Processing (VCIP), December 2017, downloadable from jacobstrom.com/publications/Wennersten_et_al_VCIP2017.pdf (hereinafter referred to as "[1]"). However, it was shown in subsequent work that it is possible to increase the performance of the filter by averaging the argument used to calculate the weight over a small area of 3×3 samples. See Y. Chen, et al., "Description of SDR, HDR and 360° video coding technology proposal by Qualcomm and Technicolor—low and high complexity versions", Document JVET-J0021 version 5, available at phenix.int-evry.fr/jvet/doc_end_user/documents/10_San%20Diego/wg11/JVET-J0021-v5.zip (hereinafter referred to as "[2]"). This way, more than 0.5% bit rate reduction can be achieved.
Even though the gain in terms of Bjøntegaard delta rate (BD rate) is improved when averaging the argument used to calculate the weight (as in [2]), this comes at a cost. In [1], the argument was calculated from two intensity values I1 and I2 as |ΔI| = |I1 − I2| = abs(I1 − I2), where abs or |⋅| denotes taking the absolute value. We will call each such operation an absdiff operation, since it requires both an absolute value calculation and a difference calculation (subtraction). In [2], instead of just one absdiff operation, nine such values are calculated and averaged. Even though nine such operations do not cost much in terms of the number of CPU clock cycles, when they are in the inner loop of a filtering operation they become expensive. Likewise, even though a set of nine such operations does not cost much in terms of silicon surface area when implemented in a full-custom ASIC, when many such sets must be instantiated in order to process many samples in parallel, it can become expensive. In [1] and [2], two weights are determined per filtered pixel, resulting in 18 absdiff operations per pixel.
In previous art, which has been sent to an outside company as part of a cross-check of a standardization contribution (JVET-K0274), we managed to reduce the number of absdiff operations to three per weight, or six per filtered sample. See JVET-K0274: J. Ström, P. Wennersten, J. Enhorn, D. Liu, K. Andersson, R. Sjöberg, "CE2 related: Reduced complexity bilateral filter", JVET-K0274_v2_clean_version.docx, available at phenix.int-evry.fr/jvet/doc_end_user/documents/11_Ljubljana/wg11/JVET-K0274-v3.zip (hereinafter referred to as "[3]"). This saving of a factor of three is substantial, but it is not enough, since six absdiff operations in an inner loop or in parallel instantiations are still expensive.
One aspect of the present invention is to reuse already calculated absdiff values in order to avoid performing the absdiff calculation again. This way we can get the number of absdiff operations needed per weight down to just one, and the number of absdiff operations needed per sample down to just two.
By getting the number of absdiff operations down to just two per filtered sample, it is possible to implement the method from [2] with the same number of absdiff operations per filtered sample as the method from [1]. In terms of absdiff operations, therefore, the proposed method is no more complex than [1]. In terms of BD rate performance, though, it has the same improved performance as the method in [2].
According to a first aspect, a method is provided. The method includes obtaining an M×N array of pixel values of an image; determining a weight selection value for position x,y in the M×N array; and using the weight selection value to obtain a weight value for use in a filter for filtering the image. Determining the weight selection value for position x,y (omegax,y) includes: a) retrieving a previously determined weight selection value for position x,y−1 (omegax,y−1); b) retrieving a previously determined alpha value (a) for position x,y−1; c) calculating a delta value (d); and d) calculating omegax,y=omegax,y−1−a+d. Calculating d includes: i) retrieving a first previously determined value (omega_row); ii) retrieving a second previously determined value (alpha_row); and iii) calculating d=omega_row−alpha_row+abs(Ax+1,y+1−Ax+1,y+2), wherein Ax+1,y+1 is the value stored in position x+1,y+1 of the array and Ax+1,y+2 is the value stored in position x+1,y+2 of the array.
In some embodiments, omega_row is equal to [abs(Ax−2,y+1−Ax−2,y+2)+abs(Ax−1,y+1−Ax−1,y+2)+abs(Ax,y+1−Ax,y+2)], and alpha_row is equal to abs(Ax−2,y+1−Ax−2,y+2).
According to a second aspect, an encoder is provided. The encoder includes an obtaining unit configured to obtain an M×N array of pixel values of an image; a determining unit configured to determine a weight selection value for position x,y in the M×N array; and a using unit configured to use the weight selection value to obtain a weight value for use in a filter for filtering the image. Determining the weight selection value for position x,y (omegax,y) includes: a) retrieving by a retrieving unit a previously determined weight selection value for position x,y−1 (omegax,y−1); b) retrieving by the retrieving unit a previously determined alpha value (a) for position x,y−1; c) calculating by a calculating unit a delta value (d); and d) calculating by the calculating unit omegax,y=omegax,y−1−a+d. Calculating d includes: i) retrieving by the retrieving unit a first previously determined value (omega_row); ii) retrieving by the retrieving unit a second previously determined value (alpha_row); and iii) calculating by the calculating unit d=omega_row−alpha_row+abs(Ax+1,y+1−Ax+1,y+2), wherein Ax+1,y+1 is the value stored in position x+1,y+1 of the array and Ax+1,y+2 is the value stored in position x+1,y+2 of the array.
According to a third aspect, a decoder is provided. The decoder includes an obtaining unit configured to obtain an M×N array of pixel values of an image; a determining unit configured to determine a weight selection value for position x,y in the M×N array; and a using unit configured to use the weight selection value to obtain a weight value for use in a filter for filtering the image. Determining the weight selection value for position x,y (omegax,y) includes: a) retrieving by a retrieving unit a previously determined weight selection value for position x,y−1 (omegax,y−1); b) retrieving by the retrieving unit a previously determined alpha value (a) for position x,y−1; c) calculating by a calculating unit a delta value (d); and d) calculating by the calculating unit omegax,y=omegax,y−1−a+d. Calculating d includes: i) retrieving by the retrieving unit a first previously determined value (omega_row); ii) retrieving by the retrieving unit a second previously determined value (alpha_row); and iii) calculating by the calculating unit d=omega_row−alpha_row+abs(Ax+1,y+1−Ax+1,y+2), wherein Ax+1,y+1 is the value stored in position x+1,y+1 of the array and Ax+1,y+2 is the value stored in position x+1,y+2 of the array.
According to a fourth aspect, a computer program is provided. The computer program includes instructions which, when executed by processing circuitry of a device, cause the device to perform the method of any one of the embodiments of the first aspect.
According to a fifth aspect, a carrier containing the computer program of embodiments of the fourth aspect is provided. The carrier is one of an electronic signal, an optical signal, a radio signal, and a computer readable storage medium.
The accompanying drawings, which are incorporated herein and form part of the specification, illustrate various embodiments.
Throughout this description we will use filtering of intensity values as an example. This traditionally refers to the Y component in YCbCr. However, it should be noted that this filtering can also be used for chroma values such as Cb and Cr, or for components from other color spaces such as ICtCp, Lab, Y′u′v′, etc. We will also use the terms "pixel" and "sample" interchangeably.
The filter from [2] is also described in JVET-K0274, which has been made publicly available in [3].
Assume that we want to filter the sample a3,3 shown below. Sample a3,3 is surrounded by other samples ai,j that form a block of samples.
If this were the filter from [1], the right weight would be calculated from deltaIR, which equals deltaIR = a3,4 − a3,3, and the absolute value of that would be used for the look-up:
deltaIR=a34−a33;
influenceR=weightLUT[min(maxVal,abs(deltaIR))]*deltaIR;
As shown above, the weight calculation is performed by a look-up table (LUT). In the above calculation, note how only one absdiff operation (i.e., an operation where an absolute value is taken of a difference of samples) is performed to get the weight value (influenceR) from the LUT. In contrast, following [2], we instead calculate the average absolute value over the 3×3 neighborhood, deltaIR,avg = (1/9) × the sum over m,n in {−1, 0, 1} of abs(a3+m,4+n − a3+m,3+n).
It is easy to see that this involves nine absdiff calculations. In plain code, it could look like this:
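The listing itself is not reproduced in this text; the following C++ sketch (with assumed names) illustrates the nine absdiff operations for the single sample a[3][3], differencing each sample of the surrounding 3×3 area against its right neighbour:

```cpp
#include <cstdlib>

// Hypothetical sketch (names assumed; not the original listing) of the nine
// absdiff operations from [2] for sample a[3][3]: each sample in the
// surrounding 3x3 area is differenced against its right neighbour.
int nineAbsdiffSumR(const int a[8][8]) {
    int sumR = 0;  // equals 9 times the averaged argument
    for (int m = -1; m <= 1; ++m)
        for (int n = -1; n <= 1; ++n)
            sumR += std::abs(a[3 + m][4 + n] - a[3 + m][3 + n]);
    return sumR;
}
```

The averaged argument sumR/9 then takes the place of abs(deltaIR) in the look-up.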
The following matlab program calculates the sum R (which is equal to 9NLR) for every pixel at least two steps away from the edge. Note that in matlab, y-coordinates come first.
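The matlab listing is not reproduced in this text; a brute-force C++ counterpart (layout assumed) of the program described above could read as follows. Matching the matlab convention, the y-coordinate comes first, i.e., the block is indexed A[y][x]:

```cpp
#include <cstdlib>
#include <vector>

using Block = std::vector<std::vector<int>>;

// Brute-force sketch (assumed layout; not the original matlab listing): for
// every pixel at least two steps away from the edge, sum the nine horizontal
// absdiffs of the surrounding 3x3 area.
Block bruteForceR(const Block& A) {
    int H = A.size(), W = A[0].size();
    Block R(H, std::vector<int>(W, 0));
    for (int y = 2; y < H - 2; ++y)
        for (int x = 2; x < W - 2; ++x)
            for (int m = -1; m <= 1; ++m)
                for (int n = -1; n <= 1; ++n)
                    R[y][x] += std::abs(A[y + m][x + n + 1] - A[y + m][x + n]);
    return R;
}
```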
In the code accompanying the contribution JVET-K0274, it was realized that the value of R(x,y) in one point (x,y) can be calculated from the value of R in the point (x−1,y) immediately to the left of it. In the code, the value R was named omega, or Ω.
In the accompanying figure, the omega value for the light gray pixel 104 may be calculated by summing all the absdiff values represented by the nine rightmost arrows 116-132.
The software associated with JVET-K0274 denotes the sum of the values represented by the leftmost arrows 110-114 by alpha, and the sum of the values represented by the rightmost arrows 128-132 by delta. If the omega for the dark gray pixel 102 is known (omega_old), it is then possible to calculate the omega for the light gray pixel 104 (omega_new) as
omega_new=omega_old−alpha+delta.
Here the previous omega value as well as the alpha value are known from previous calculations, and only the delta value (involving three absdiff calculations) needs to be calculated. Also, rather than throwing away the delta value after using it in the equation above, it is recognized that this value will become the alpha value for a pixel two steps to the right. This is solved by moving the delta value to a gamma value, before that moving the gamma value to a beta value, and before that moving the beta value to the alpha value. This is reflected in the C++ code from JVET-K0274.
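The C++ listing from JVET-K0274 is not reproduced in this text; a sketch of the sliding-window update for one row of R values, with assumed names, might read as follows. Here colSum(x) is the sum of the three absdiffs between columns x and x+1 over the three window rows, so three absdiff operations remain per weight:

```cpp
#include <cstdlib>
#include <vector>

// Sketch (assumed names; not the original JVET-K0274 listing) of the sliding
// window for one row of R values, using omega_new = omega_old - alpha + delta.
std::vector<int> slidingRowR(const std::vector<std::vector<int>>& A, int y) {
    int W = A[0].size();
    auto colSum = [&](int x) {  // three absdiffs between columns x and x+1
        int s = 0;
        for (int m = -1; m <= 1; ++m)
            s += std::abs(A[y + m][x + 1] - A[y + m][x]);
        return s;
    };
    std::vector<int> R(W, 0);
    int alpha = colSum(1), beta = colSum(2), gamma = colSum(3);  // start-up at x = 2
    int omega = alpha + beta + gamma;
    R[2] = omega;
    for (int x = 3; x < W - 2; ++x) {
        int delta = colSum(x + 1);                  // the only new absdiff column
        omega = omega - alpha + delta;              // omega_new = omega_old - alpha + delta
        alpha = beta; beta = gamma; gamma = delta;  // delta later re-emerges as alpha
        R[x] = omega;
    }
    return R;
}
```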
As can be seen in the code, three absdiff operations are carried out. The omega value is then divided by nine (approximated by 114/1024 in the code) to calculate the dNLIR value that is used to obtain the weight from the look-up table.
An analogous code is used to calculate the bottomInfluence, where differences against the pixels below are calculated instead of differences against the pixels to the right. In this case the C++ code from JVET-K0274 looks like this:
In this case, four line buffers omegaRow, alphaRow, betaRow and gammaRow are used to store the omega, alpha, beta and gamma values. A line buffer here means an array that can hold as many values as the block is wide. The equivalent matlab code would look like this:
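The matlab listing is not reproduced in this text; the following C++ sketch (names assumed, modelled on the description above) illustrates the line-buffer approach for the B values. Because samples are visited in raster order, the vertical sliding window keeps its state per column in the line buffers:

```cpp
#include <cstdlib>
#include <vector>

using Block = std::vector<std::vector<int>>;

// Sketch (assumed names; not the original listing) of the vertical
// counterpart: B(y, x) sums the nine absdiffs between each sample of the 3x3
// area around (x, y) and the sample below it.
Block lineBufferB(const Block& A) {
    int H = A.size(), W = A[0].size();
    // rowPairSum(x, y): three absdiffs between rows y and y+1, columns x-1..x+1.
    auto rowPairSum = [&](int x, int y) {
        int s = 0;
        for (int m = -1; m <= 1; ++m)
            s += std::abs(A[y + 1][x + m] - A[y][x + m]);
        return s;
    };
    Block B(H, std::vector<int>(W, 0));
    std::vector<int> alphaRow(W), betaRow(W), gammaRow(W), omegaRow(W);
    for (int x = 2; x < W - 2; ++x) {              // start-up at row y = 2
        alphaRow[x] = rowPairSum(x, 1);
        betaRow[x]  = rowPairSum(x, 2);
        gammaRow[x] = rowPairSum(x, 3);
        omegaRow[x] = alphaRow[x] + betaRow[x] + gammaRow[x];
        B[2][x] = omegaRow[x];
    }
    for (int y = 3; y < H - 2; ++y) {
        for (int x = 2; x < W - 2; ++x) {
            int delta = rowPairSum(x, y + 1);      // three new absdiffs per B value
            omegaRow[x] = omegaRow[x] - alphaRow[x] + delta;
            alphaRow[x] = betaRow[x]; betaRow[x] = gammaRow[x]; gammaRow[x] = delta;
            B[y][x] = omegaRow[x];
        }
    }
    return B;
}
```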
As an example, if the A block contains the values
The matlab code will output the following R2 values
And the following B2 values:
Note that this example only covers pixels two samples away from the border. Samples closer to the border need to be handled in a different manner, for instance by padding the block or accessing samples outside the block.
As can be seen from the matlab code or the C++ code, three absdiff operations are calculated per R-value (i.e., per weight), and three absdiff operations per B-value. In total this amounts to six absdiff operations per filtered pixel. However, embodiments of the present invention further reduce the number of absdiff operations necessary per filtered pixel.
As can be seen, only one value, R_delta_value, needs to be calculated per weight (the other values are stored values). Using R_delta_value, it is possible to update R_omega_row[xx], for example by the equation R_omega_row[xx]=R_omega_row[xx]−R_alpha_row[xx]+R_delta_value. By setting R_delta=R_omega_row[xx], it is then possible to update omega.
This is illustrated in one of the accompanying figures.
Note that R_alpha_row, R_beta_row, R_gamma_row and R_omega_row are line buffers, i.e., they can store a full line of the block. Meanwhile, R_alpha, R_beta, R_gamma, R_delta, omega and R_delta_value are scalar values.
The full update step/calculation step is thus contained in the matlab code below:
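The matlab listing itself is not reproduced in this text; a C++ sketch of the full one-absdiff-per-weight R computation, with assumed names modelled on the description above, could read as follows. Here s(y, x) = abs(A[y][x+1] − A[y][x]) is the only absdiff evaluated per pixel: R_omega_row[x] holds the three-row column sum and is itself updated incrementally, and the horizontal omega update then only reuses stored values:

```cpp
#include <cstdlib>
#include <vector>

using Block = std::vector<std::vector<int>>;

// Sketch (assumed names; not the original matlab listing) of the full R
// computation with a single new absdiff per column and row.
Block oneAbsdiffR(const Block& A) {
    int H = A.size(), W = A[0].size();
    auto s = [&](int y, int x) { return std::abs(A[y][x + 1] - A[y][x]); };
    Block R(H, std::vector<int>(W, 0));
    std::vector<int> R_alpha_row(W), R_beta_row(W), R_gamma_row(W), R_omega_row(W);
    for (int x = 1; x < W - 1; ++x) {              // start-up for row y = 2
        R_alpha_row[x] = s(1, x);
        R_beta_row[x]  = s(2, x);
        R_gamma_row[x] = s(3, x);
        R_omega_row[x] = R_alpha_row[x] + R_beta_row[x] + R_gamma_row[x];
    }
    for (int y = 2; y < H - 2; ++y) {
        if (y > 2) {
            for (int x = 1; x < W - 1; ++x) {      // one new absdiff per column
                int R_delta_value = s(y + 1, x);
                R_omega_row[x] = R_omega_row[x] - R_alpha_row[x] + R_delta_value;
                R_alpha_row[x] = R_beta_row[x];
                R_beta_row[x]  = R_gamma_row[x];
                R_gamma_row[x] = R_delta_value;
            }
        }
        // Horizontal sliding window over the stored column sums: no absdiffs.
        int R_alpha = R_omega_row[1], R_beta = R_omega_row[2], R_gamma = R_omega_row[3];
        int omega = R_alpha + R_beta + R_gamma;
        R[y][2] = omega;
        for (int x = 3; x < W - 2; ++x) {
            int R_delta = R_omega_row[x + 1];
            omega = omega - R_alpha + R_delta;
            R_alpha = R_beta; R_beta = R_gamma; R_gamma = R_delta;
            R[y][x] = omega;
        }
    }
    return R;
}
```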
A similar technique can also be used when calculating the difference against the pixel below, as can be seen in the accompanying figure.
Note how only one absdiff operation, namely the one calculating B_delta_value, is needed to calculate the B value (also referred to as the B_omega value) for a sample weight. The value B_delta_value is used to update the B_omega_row[xx] value (for example, by B_omega_row[xx]=B_omega_row[xx]−B_alpha_row[xx]+B_delta_value), and B_delta is then set to the updated B_omega_row[xx]. With B_delta and B_alpha, the B_omega value can be updated as shown.
The necessary matlab code can be written as:
Combined, the following matlab code calculates both the R and B values for the interior of the block.
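The combined matlab listing is not reproduced in this text; a C++ sketch with assumed names, following the scheme described above, could read as follows. It computes both the R and B sums for the interior of the block at two absdiff operations per filtered sample, one horizontal pair s() feeding R and one vertical pair v() feeding B:

```cpp
#include <cstdlib>
#include <utility>
#include <vector>

using Block = std::vector<std::vector<int>>;

// Combined sketch (assumed names; not the original matlab listing): R and B
// for the interior of the block with two absdiffs per filtered sample.
std::pair<Block, Block> computeRB(const Block& A) {
    int H = A.size(), W = A[0].size();
    auto s = [&](int y, int x) { return std::abs(A[y][x + 1] - A[y][x]); };  // right
    auto v = [&](int y, int x) { return std::abs(A[y + 1][x] - A[y][x]); };  // below
    Block R(H, std::vector<int>(W, 0)), B(H, std::vector<int>(W, 0));
    std::vector<int> Rcol(W, 0), Bcol(W, 0);                    // three-row column sums
    std::vector<int> Ra(W), Rb(W), Rg(W), Ba(W), Bb(W), Bg(W);  // alpha/beta/gamma buffers
    for (int x = 1; x < W - 1; ++x) {                           // start-up for row y = 2
        Ra[x] = s(1, x); Rb[x] = s(2, x); Rg[x] = s(3, x);
        Rcol[x] = Ra[x] + Rb[x] + Rg[x];
        Ba[x] = v(1, x); Bb[x] = v(2, x); Bg[x] = v(3, x);
        Bcol[x] = Ba[x] + Bb[x] + Bg[x];
    }
    for (int y = 2; y < H - 2; ++y) {
        if (y > 2) {
            for (int x = 1; x < W - 1; ++x) {
                int rd = s(y + 1, x);                           // one absdiff for R
                Rcol[x] += rd - Ra[x];
                Ra[x] = Rb[x]; Rb[x] = Rg[x]; Rg[x] = rd;
                int bd = v(y + 1, x);                           // one absdiff for B
                Bcol[x] += bd - Ba[x];
                Ba[x] = Bb[x]; Bb[x] = Bg[x]; Bg[x] = bd;
            }
        }
        for (int x = 2; x < W - 2; ++x) {                       // sums of stored values only
            R[y][x] = Rcol[x - 1] + Rcol[x] + Rcol[x + 1];
            B[y][x] = Bcol[x - 1] + Bcol[x] + Bcol[x + 1];
        }
    }
    return {R, B};
}
```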
Using the same input A matrix as above, the matlab code will produce the matrices R3 and B3 as shown below:
Note that while the first two rows and the first two columns differ from B2/R2, the samples that are two steps away from the border are the same, and those are the ones we want to calculate. Hence we obtain identical results with only two absdiff operations per sample, which is a reduction by a factor of three against JVET-K0274 and a reduction by a factor of nine against the implementation of [2].
It should be noted that the code above starts filtering from the top left sample, even though no filtered output is required for that sample. This simplifies the set-up of the necessary line buffers/arrays R_omega_row( ), R_alpha_row( ), R_beta_row( ), R_gamma_row( ) and the necessary scalar variables R_omega, R_alpha, R_beta and R_gamma. However, it involves unnecessary calculations. Hence it is possible to avoid these calculations with slightly more involved code.
In embodiments, omega_row is equal to [abs(Ax−2,y+1−Ax−2,y+2)+abs(Ax−1,y+1−Ax−1,y+2)+abs(Ax,y+1−Ax,y+2)], and alpha_row is equal to [abs(Ax−2,y+1−Ax−2,y+2)].
While various embodiments of the present disclosure are described herein, it should be understood that they have been presented by way of example only, and not limitation. Thus, the breadth and scope of the present disclosure should not be limited by any of the above-described exemplary embodiments. Moreover, any combination of the above-described elements in all possible variations thereof is encompassed by the disclosure unless otherwise indicated herein or otherwise clearly contradicted by context.
Additionally, while the processes described above and illustrated in the drawings are shown as a sequence of steps, this was done solely for the sake of illustration. Accordingly, it is contemplated that some steps may be added, some steps may be omitted, the order of the steps may be re-arranged, and some steps may be performed in parallel.
This application is a 35 U.S.C. § 371 National Stage of International Patent Application No. PCT/EP2019/068342, filed Jul. 9, 2019, designating the United States and claiming priority to U.S. provisional application No. 62/697,243, filed on Jul. 12, 2018. The above identified applications are incorporated by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2019/068342 | 7/9/2019 | WO | 00 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2020/011756 | 1/16/2020 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
20050025378 | Maurer | Feb 2005 | A1 |
20170185863 | Chandra et al. | Jun 2017 | A1 |
Number | Date | Country |
---|---|---|
2018067051 | Apr 2018 | WO |
2018117938 | Jun 2018 | WO |
Entry |
---|
International Search Report and Written Opinion issued in International Application No. PCT/EP2019/068342 dated Sep. 25, 2019 (12 pages). |
Ström, J., et al., "AHG 2 related: Reduced complexity bilateral filter," Joint Video Experts Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, Document: JVET-K0274-v2, 11th Meeting: Ljubljana, SI, Jul. 2018 (11 pages). |
Ström, J., et al., “JVET-K0274 Reduced complexity bilateral filter,” Ericsson Research, Jul. 2018 (31 pages). |
Wennersten, P., et al., “Bilateral Filtering for Video Coding,” 2017 IEEE Visual Communications and Image Processing (VCIP), IEEE, Dec. 2017 (4 pages). |
Chen, Y., et al., “Description of SDR, HDR and 360° video coding technology proposal by Qualcomm and Technicolor—low and high complexity versions,” Joint Video Exploration Team (JVET) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11, Document: JVET-J0021, 10th Meeting: San Diego, US Apr. 2018 (43 pages). |
Number | Date | Country | |
---|---|---|---|
20210274171 A1 | Sep 2021 | US |
Number | Date | Country | |
---|---|---|---|
62697243 | Jul 2018 | US |