The present disclosure relates to the field of audio and video technologies, and in particular, relates to coding and decoding methods, a coder and decoder, and storage mediums.
With the development of Internet technologies and computer technologies, more and more video applications are being developed, and users' demands for high-definition videos in the video applications are increasing. However, because a high-definition video generally contains a large amount of data, before transmission in a limited network bandwidth, the high-definition video needs to be coded. Data coding generally includes: intra prediction (or inter prediction), transform, quantization, entropy coding, in-loop filtering, and the like. During coding, a residual block, which may be referred to as a transform unit (TU) or a residual signal of a current block, is acquired by intra prediction, and a transform coefficient is acquired by transforming the TU (the transform refers to conversion of an image depicted in the form of pixels in a spatial domain to an image expressed in the form of the transform coefficient in a transform domain). Afterwards, coded data is acquired by performing quantization and entropy coding on the transform coefficient.
Embodiments of the present disclosure provide coding and decoding methods, a coder and decoder, and storage mediums. The technical solutions are as follows.
According to an aspect of embodiments of the present disclosure, a decoding method is provided. The decoding method includes:
In some embodiments, determining the transform pair corresponding to the current block based on the transform pair index includes: determining, based on the transform pair index corresponding to the current block and a preconfigured corresponding relationship between the transform pair index and the transform pair, the transform pair corresponding to the current block, wherein the corresponding relationship includes a mapping relationship between five index values of the transform pair index and five transform pairs.
In some embodiments, in response to the value of the first bit being the second value, the first bit is intended to indicate that the transform pair mapped by one of the remaining index values is selected, the transform pair mapped by one of the remaining index values being one of four transform pairs (DST7, DST7), (DCT8, DST7), (DST7, DCT8), and (DCT8, DCT8); wherein
a value of a second bit of the binarized codeword is intended to identify whether the transform pair selected by the current block is (DST7, DST7) or one of (DCT8, DST7), (DST7, DCT8), and (DCT8, DCT8);
a value of a third bit of the binarized codeword is intended to identify whether the transform pair selected by the current block is (DCT8, DST7) or one of (DST7, DCT8) and (DCT8, DCT8); and
a value of a fourth bit of the binarized codeword is intended to identify whether the transform pair selected by the current block is (DST7, DCT8) or (DCT8, DCT8).
In some embodiments, determining the transform pair corresponding to the current block based on the transform pair index includes:
determining that the transform pair corresponding to the current block is (DCT8, DCT8) in response to the first bit being 1, the second bit being 1, the third bit being 1, and the fourth bit being 1.
In some embodiments, prior to acquiring the transform pair index corresponding to the current block from the coded data, the decoding method further includes:
In some embodiments, acquiring the transform pair index corresponding to the current block from the coded data includes: acquiring the transform pair index corresponding to the current block from the coded data in response to determining that the current block is the luma block and the height and the width of the current block are both less than or equal to 32.
In some embodiments, the decoding method further includes: directly determining, instead of decoding the transform pair index corresponding to the current block, (DCT2, DCT2) as the transform pair selected by the current block in response to determining that the current block is not the luma block or the height or width of the current block is greater than 32.
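The condition above can be sketched as follows. This is a minimal illustration only: the function names and the constant name are hypothetical, and the threshold of 32 is taken from the embodiments described above.

```python
from typing import Optional, Tuple

# Size threshold quoted in the embodiments above (assumed fixed at 32 here).
MAX_EXPLICIT_SIZE = 32


def needs_transform_pair_index(is_luma: bool, width: int, height: int) -> bool:
    """Return True when the transform pair index is read from the coded data."""
    return is_luma and width <= MAX_EXPLICIT_SIZE and height <= MAX_EXPLICIT_SIZE


def implicit_transform_pair(is_luma: bool, width: int, height: int) -> Optional[Tuple[str, str]]:
    """Return (DCT2, DCT2) when no index is decoded, or None when an index must be decoded."""
    if needs_transform_pair_index(is_luma, width, height):
        return None
    return ("DCT2", "DCT2")
```

For example, a chroma block, or a luma block with a width of 64, falls back to (DCT2, DCT2) without any index being parsed.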
In some embodiments, the first value of the first bit of the binarized codeword corresponding to the transform pair index is 0, and the second value of the first bit of the binarized codeword corresponding to the transform pair index is 1.
In some embodiments, the binarized codeword of the first index value is 0; the binarized codeword of the second index value is 10; the binarized codeword of the third index value is 110; the binarized codeword of the fourth index value is 1110; and the binarized codeword of the fifth index value is 1111.
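As a minimal sketch of how a decoder might map these codewords back to a transform pair, the following reads bins one at a time until a 0 is met or until four 1s have been read. The names `TRANSFORM_PAIRS` and `decode_transform_pair` are hypothetical, and the bins are assumed to have already been produced by entropy decoding.

```python
from typing import Iterator, Tuple

# Transform pairs in index order, as listed in the embodiments above.
TRANSFORM_PAIRS = [
    ("DCT2", "DCT2"),  # codeword 0
    ("DST7", "DST7"),  # codeword 10
    ("DCT8", "DST7"),  # codeword 110
    ("DST7", "DCT8"),  # codeword 1110
    ("DCT8", "DCT8"),  # codeword 1111
]


def decode_transform_pair(bins: Iterator[int]) -> Tuple[str, str]:
    """Count leading 1s (at most four) and return the mapped transform pair."""
    max_bins = len(TRANSFORM_PAIRS) - 1
    ones = 0
    # Stop at the first 0, or after the fourth 1, which needs no terminator.
    while ones < max_bins and next(bins) == 1:
        ones += 1
    return TRANSFORM_PAIRS[ones]
```

Because the loop stops as soon as the codeword is complete, any bins that follow in the stream are left unread for the next syntax element.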
In some embodiments, the decoding method further includes: decoding other bits by context-based adaptive binary arithmetic coding (CABAC) in response to the binarized codeword corresponding to the transform pair index further including the other bits in addition to the first bit.
In some embodiments, acquiring the coded data of the current block includes: acquiring inversely quantized data of the current block by performing entropy decoding on the coded data and performing inverse quantization on an entropy decoding result.
In some embodiments, in response to determining the transform pair corresponding to the current block, the decoding method further includes: acquiring a residual signal corresponding to the current block by performing reverse transform processing on inversely quantized data of the current block with the transform pair, and then acquiring reconstruction information corresponding to the current block by adding the residual signal and a prediction signal.
In some embodiments, the current block is a transform unit, and the current block is a coding unit acquired by partitioning a coding tree unit using one or more of quad-tree partitioning, horizontal binary-tree partitioning, vertical binary-tree partitioning, horizontal triple-tree partitioning and vertical triple-tree partitioning.
In some embodiments, the current block is a transform unit whose width is greater than its height, or the current block is a transform unit whose width is equal to its height, or the current block is a transform unit whose width is less than its height.
According to another aspect of embodiments of the present disclosure, a coding method is provided. The coding method includes:
In some embodiments, determining the transform pair corresponding to the current block and the transform pair index corresponding to the current block includes: determining the transform pair corresponding to the current block; and determining, based on the transform pair and a preconfigured corresponding relationship between the transform pair index and the transform pair, the transform pair index corresponding to the current block, wherein the corresponding relationship includes a mapping relationship between five index values of the transform pair index and five transform pairs.
In some embodiments, in response to the value of the first bit being the second value, the first bit is intended to indicate that the transform pair mapped by one of the remaining index values is selected, the transform pair mapped by one of the remaining index values being one of four transform pairs (DST7, DST7), (DCT8, DST7), (DST7, DCT8), and (DCT8, DCT8); wherein
In some embodiments, the first bit of the binarized codeword is 0 in response to the transform pair corresponding to the current block being (DCT2, DCT2); the first bit of the binarized codeword is 1 and the second bit is 0 in response to the transform pair corresponding to the current block being (DST7, DST7); the first bit of the binarized codeword is 1, the second bit is 1, and the third bit is 0 in response to the transform pair corresponding to the current block being (DCT8, DST7); the first bit of the binarized codeword is 1, the second bit is 1, the third bit is 1, and the fourth bit is 0 in response to the transform pair corresponding to the current block being (DST7, DCT8); and the first bit of the binarized codeword is 1, the second bit is 1, the third bit is 1, and the fourth bit is 1 in response to the transform pair corresponding to the current block being (DCT8, DCT8).
In some embodiments, the coding method further includes: adding a target flag to the coded data, wherein the target flag is intended to indicate that explicit multi-kernel transform is enabled.
In some embodiments, determining the transform pair corresponding to the current block and the transform pair index corresponding to the current block includes: determining the transform pair corresponding to the current block and the transform pair index corresponding to the current block in response to determining that the current block is the luma block and the height and the width of the current block are both less than or equal to 32.
In some embodiments, the coding method further includes: directly determining, instead of coding the transform pair index corresponding to the current block, (DCT2, DCT2) as the transform pair selected by the current block, in response to determining that the current block is not the luma block or the height or width of the current block is greater than 32.
In some embodiments, the first value of the first bit of the binarized codeword corresponding to the transform pair index is 0, and the second value of the first bit of the binarized codeword corresponding to the transform pair index is 1.
In some embodiments, the binarized codeword of the first index value is 0; the binarized codeword of the second index value is 10; the binarized codeword of the third index value is 110; the binarized codeword of the fourth index value is 1110; and the binarized codeword of the fifth index value is 1111.
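The codewords listed above form a prefix-free set, which is why no separate length field is needed for the decoder to find the end of a codeword. A minimal sketch of the coder-side lookup and of this property follows; the dictionary keys 1 to 5 stand for the first to fifth index values, and the function names are hypothetical.

```python
from itertools import permutations
from typing import Iterable

# Binarized codewords for the first to fifth index values, as listed above.
CODEWORDS = {1: "0", 2: "10", 3: "110", 4: "1110", 5: "1111"}


def binarize(index_value: int) -> str:
    """Return the binarized codeword for an index value in the range 1 to 5."""
    return CODEWORDS[index_value]


def is_prefix_free(words: Iterable[str]) -> bool:
    """Return True when no codeword is a prefix of a different codeword."""
    return not any(b.startswith(a) for a, b in permutations(list(words), 2))
```

Calling `is_prefix_free(CODEWORDS.values())` confirms the property for the five codewords, whereas a set such as `["1", "10"]` fails the check.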
In some embodiments, the coding method further includes: coding other bits by context-based adaptive binary arithmetic coding (CABAC) in response to the binarized codeword corresponding to the transform pair index further including the other bits in addition to the first bit.
In some embodiments, the coding method further includes: acquiring a transform coefficient by transforming the current block based on the transform pair, acquiring a quantization coefficient by quantizing the transform coefficient, and acquiring the coded data corresponding to the current block by performing entropy coding on the quantization coefficient.
In some embodiments, in response to determining the transform pair corresponding to the current block, the coding method further includes: acquiring a transform coefficient by transforming a residual signal of the current block based on the transform pair, acquiring a quantization coefficient by quantizing the transform coefficient, and acquiring the coded data corresponding to the current block by performing entropy coding on the quantization coefficient; and adding the transform pair index as coded to the coded data.
In some embodiments, the current block is a transform unit, and the current block is a coding unit acquired by partitioning a coding tree unit using one or more of quad-tree partitioning, horizontal binary-tree partitioning, vertical binary-tree partitioning, horizontal triple-tree partitioning and vertical triple-tree partitioning.
In some embodiments, the current block is a transform unit whose width is greater than its height, or the current block is a transform unit whose width is equal to its height, or the current block is a transform unit whose width is less than its height.
According to another aspect of embodiments of the present disclosure, a decoder is provided. The decoder includes: a processor and a memory storing at least one instruction executable by the processor; wherein the processor, when loading and executing the at least one instruction, is caused to perform the decoding method according to any one of the aforesaid embodiments.
According to another aspect of embodiments of the present disclosure, a coder is provided. The coder includes: a processor and a memory storing at least one instruction executable by the processor; wherein the processor, when loading and executing the at least one instruction, is caused to perform the coding method according to any one of the aforesaid embodiments.
According to another aspect of embodiments of the present disclosure, a decoding device is provided. The decoding device is configured to perform the decoding method according to any one of the aforesaid embodiments.
According to another aspect of embodiments of the present disclosure, a coding device is provided. The coding device is configured to perform the coding method according to any one of the aforesaid embodiments.
According to another aspect of embodiments of the present disclosure, a non-transitory computer-readable storage medium is provided. The storage medium stores at least one instruction executable by a processor; wherein the at least one instruction, when loaded and executed by the processor, causes the processor to perform the decoding method according to any one of the aforesaid embodiments or the coding method according to any one of the aforesaid embodiments.
For clearer descriptions of the objectives, technical solutions and advantages in the present disclosure, the embodiments of the present disclosure are described in further detail hereinafter with reference to the accompanying drawings.
In the related art, during transform, the transform coefficient is generally acquired by transforming the TU using a preset transform pair (the transform pair includes a horizontal transform kernel and a vertical transform kernel). Correspondingly, during decoding, the residual signal is acquired by inverse transforming the TU using the preset transform pair used during coding.
However, because significantly different compression effects are achieved when the same TU is transformed using different transform pairs, coding and decoding performance may be poor when all the TUs are transformed using the same preset transform pair.
The present disclosure provides a coding method and a decoding method. The coding method may be performed by a coding device. The decoding method may be performed by a decoding device. Further, the coding device or the decoding device may be a device capable of coding and/or decoding video data, such as a server, a computer, or a mobile phone.
A processor, a memory, a transceiver, and the like may be disposed in the coding device or the decoding device. The processor may be configured to code and/or decode data. The memory may be configured to store data required for and data generated in a coding and/or decoding process. The transceiver may be configured to receive and transmit data, for example, to acquire the video data.
Concepts possibly involved in the embodiments of the present disclosure are explained first before the embodiments are described.
In video coding, transform is an indispensable phase for video data compression, and enables energy of signals to be more concentrated. In addition, a transform technique based on discrete cosine transform (DCT)/discrete sine transform (DST) has been a mainstream transform technique of video coding. Each of the DCT and the DST specifically includes a plurality of transform kernels based on different basis functions. The basis functions of three commonly-used transform kernels are given in Table 1.
In video coding, a transform process includes a forward transform and an inverse transform, the latter also being referred to as a backward transform or a reverse transform. Forward transform means that a two-dimensional residual signal (a residual coefficient) is converted to a two-dimensional spectrum signal (a transform coefficient) with more concentrated energy, and the transform coefficient is then quantized, such that a high-frequency component is effectively removed and intermediate-frequency and low-frequency components are retained, thereby achieving the effect of compression. It is expressed in matrix form as formula (1):
wherein M represents a width of a residual block, N represents a height of the residual block, f represents an original residual signal of N*M dimensions, and F represents a frequency-domain signal of N*M dimensions. A and B represent an M*M-dimensional transform matrix and an N*N-dimensional transform matrix respectively, both of which satisfy orthogonality.
Inverse transform, also called reverse transform, is an inverse process of forward transform. That is, the frequency-domain signal F is converted to a time-domain residual signal f by orthogonal transform matrices A and B. It is expressed in a matrix as formula (2):
In a transform phase of video coding, a two-dimensional residual signal is input. As shown in formula (3), if $X = A \cdot f^{T}$, then $F = B \cdot X^{T}$.
Therefore, the forward transform of one two-dimensional residual signal is realized by one-dimensional forward transform twice. Upon the first forward transform, an M*N signal X is acquired and a correlation between pixels in a horizontal direction of the two-dimensional residual signal is canceled. Therefore, the first forward transform is referred to as a horizontal transform, and A is referred to as a horizontal transform matrix. Upon the second forward transform, a correlation between pixels in a vertical direction of the two-dimensional residual signal is canceled. Therefore, the second forward transform is referred to as a vertical transform, and B is referred to as a vertical transform matrix.
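Formulas (1) to (3) are not reproduced in this text. Under the definitions given above (f and F of N*M dimensions, A of M*M dimensions, B of N*N dimensions, both A and B orthogonal), they may be written as follows. This is a reconstruction from the surrounding definitions and from the relation X = A·f^T, F = B·X^T, offered for the reader's convenience rather than as a quotation of the original formulas.

```latex
\begin{align}
F &= B \cdot f \cdot A^{T} \tag{1} \\
f &= B^{T} \cdot F \cdot A \tag{2} \\
X &= A \cdot f^{T}, \qquad
F = B \cdot X^{T} = B \cdot \left(A \cdot f^{T}\right)^{T} = B \cdot f \cdot A^{T} \tag{3}
\end{align}
```

Formula (2) follows from formula (1) because orthogonality gives $A^{T} A = I$ and $B^{T} B = I$, so that $B^{T} \cdot F \cdot A = B^{T} B \cdot f \cdot A^{T} A = f$.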
In a next-generation video coding standard, a transform unit (TU) may be a rectangular block. Therefore, M is not necessarily equal to N, and thus the dimensions of A and B are not necessarily equal. In addition, the next-generation video coding standard supports that A and B are not transform matrices produced by the same transform kernel. Thus, there is a transform pair {H, V} composed of transform kernels corresponding to A and B respectively in the transform, where H is referred to as a horizontal transform kernel and V is referred to as a vertical transform kernel.
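The separable transform with a transform pair {H, V} on a rectangular block can be sketched in plain Python as follows. The kernel definitions are the commonly published orthonormal forms of DCT-II, DST-VII and DCT-VIII and are assumed, not confirmed, to correspond to Table 1, which is not reproduced in this text. All function names are hypothetical, and floating-point arithmetic is used instead of the integer approximations found in practical codecs.

```python
import math


def dct2(n):
    """Orthonormal DCT-II kernel; row i is the i-th basis function."""
    return [[(math.sqrt(0.5) if i == 0 else 1.0) * math.sqrt(2.0 / n)
             * math.cos(math.pi * i * (2 * j + 1) / (2 * n))
             for j in range(n)] for i in range(n)]


def dst7(n):
    """Orthonormal DST-VII kernel; row i is the i-th basis function."""
    return [[math.sqrt(4.0 / (2 * n + 1))
             * math.sin(math.pi * (2 * i + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)] for i in range(n)]


def dct8(n):
    """Orthonormal DCT-VIII kernel; row i is the i-th basis function."""
    return [[math.sqrt(4.0 / (2 * n + 1))
             * math.cos(math.pi * (2 * i + 1) * (2 * j + 1) / (4 * n + 2))
             for j in range(n)] for i in range(n)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def forward(f, A, B):
    """Forward transform F = B * f * A^T for an N*M residual block f."""
    return matmul(matmul(B, f), transpose(A))


def inverse(F, A, B):
    """Inverse transform f = B^T * F * A, valid because A and B are orthogonal."""
    return matmul(matmul(transpose(B), F), A)
```

For a block of width 8 and height 4, `A = dst7(8)` would serve as the horizontal transform matrix and `B = dct8(4)` as the vertical transform matrix, and `inverse(forward(f, A, B), A, B)` recovers `f` up to floating-point rounding.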
In high-efficiency video coding (HEVC), a 64*64 coding tree unit (CTU) is recursively partitioned into coding units (CU) using a quadtree. Whether to use intra-frame coding or inter-frame coding is determined at a leaf node CU level. The CU is further partitioned into two or four prediction units (PU), and the same prediction information is used in the same PU. After a residual signal is acquired upon completion of prediction, one CU is further partitioned into a plurality of TUs using the quad-tree.
However, in the newly proposed versatile video coding (VVC), the block partition technology has changed greatly. The original partition mode is replaced with a binary-tree/triple-tree/quad-tree (BT/TT/QT) hybrid partition structure, the original concepts of CU, PU, and TU are canceled, and a more flexible partition mode of the CU is supported. The CU is subjected to square or rectangular partitioning. The CTU is first subjected to quad-tree partitioning, and then each leaf node acquired from quad-tree partitioning is further subjected to binary-tree partitioning and triple-tree partitioning. That is, a total of five partitioning schemes are available: quad-tree partitioning, horizontal binary-tree partitioning, vertical binary-tree partitioning, horizontal triple-tree partitioning and vertical triple-tree partitioning, as shown in
Therefore, based on the partition schemes, the block usually has three shapes as shown in
Intra prediction means that, considering the strong spatial-domain correlation between adjacent blocks in an image, a currently uncoded block is predicted by using surrounding pixels, which have been reconstructed, as reference pixels. Therefore, only a residual signal (the original signal minus the prediction signal) needs to be subjected to subsequent coding, instead of coding the original signal. In this way, the spatial-domain redundancy is effectively removed and the compression efficiency of a video signal is greatly improved. In addition, in intra prediction, more densely arranged angles achieve better prediction effects.
In
It should be noted that in a plane rectangular coordinate system, the horizontal rightward direction is the positive direction of the x-axis, and the vertical upward direction is the positive direction of the y-axis. In this way, an angle formed between the positive direction of the x-axis and a ray that starts at the origin and lies in the first quadrant or the second quadrant is positive, and an angle formed between the positive direction of the x-axis and a ray that starts at the origin and lies in the third quadrant or the fourth quadrant is negative. For example, the angle between the horizontal rightward direction and the axis of symmetry (in the direction distal from the origin) in the fourth quadrant is −45 degrees.
An embodiment of the present disclosure provides a common coding framework. As shown in
In the transform process, for the same TU (which may also be referred to as a residual block or a current block), when different transform pairs are used to compress the residual block, the compression effects are quite different. This is determined by a basis function of the transform kernel itself. As shown in
To illustrate more intuitively the relationships between different transform kernels and residual properties, as shown in
The context model refers to a process of updating a symbol probability based on a context in video coding.
An embodiment of the present disclosure provides a coding method. A flow of this method may be as shown in
In step 801, a coding device acquires a residual signal of a current block.
In practice, when coding video data, the coding device first performs intra prediction to acquire the residual signal (the fashion in which a residual block is acquired is identical to that in the existing video coding standard and is not described again here), and then takes the residual signal as the residual signal of the current block to be processed.
A fashion in which an intra prediction mode is selected may be as follows.
Generally, two major indexes are available for evaluating coding efficiency: the bit rate and the peak signal-to-noise ratio (PSNR). Generally, the smaller the bit stream is, the higher the compression rate is; and the higher the PSNR is, the better the quality of a reconstructed image is. When the mode is selected, a discriminant formula is essentially a comprehensive evaluation of these two indexes.
The rate-distortion cost corresponding to the mode is J(mode) = D + λ*R, wherein D represents distortion, which is usually measured by the sum of squared errors (SSE), that is, the sum of the squared differences between a reconstructed block and the source image; λ is the Lagrangian multiplier; and R is the actual number of bits required for coding the image block in the intra prediction mode, including the sum of bits required for coding mode information, motion information, the residual signal and the like.
The coding device may acquire an explicit multi-kernel transform syntax table. As shown in Table 2, as long as one intra prediction mode is selected, each transform pair in Table 2 is selected for transform processing, quantization, entropy coding, and decoding. In this way, all the intra prediction modes are traversed, then the intra prediction mode and the transform pair, which achieve the lowest rate-distortion cost, are selected, and this intra prediction mode is determined as the intra prediction mode corresponding to the current block. In this way, the intra prediction mode and the transform pair which correspond to the current block may be determined. For example, 67 intra prediction modes are available, and five transform pairs are available for each of these intra prediction modes. Thus, 67*5 combinations are available, and each combination includes one intra prediction mode and one transform pair. The combination with the lowest rate-distortion cost is selected for final intra prediction and transform.
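The exhaustive search described above can be sketched as follows. This is an illustration under stated assumptions: `evaluate` is a hypothetical callback standing in for the transform, quantization, entropy coding and reconstruction of one combination, and is assumed to return the distortion D and the number of bits R for that combination. The function names are likewise hypothetical.

```python
from itertools import product
from typing import Callable, Iterable, Sequence, Tuple

Pair = Tuple[str, str]


def rd_cost(distortion: float, bits: float, lam: float) -> float:
    """Rate-distortion cost J = D + lambda * R."""
    return distortion + lam * bits


def select_mode_and_pair(
    modes: Iterable[int],
    pairs: Sequence[Pair],
    evaluate: Callable[[int, Pair], Tuple[float, float]],
    lam: float,
) -> Tuple[float, int, Pair]:
    """Try every (intra prediction mode, transform pair) combination and keep the cheapest."""
    best = None
    for mode, pair in product(modes, pairs):
        distortion, bits = evaluate(mode, pair)  # hypothetical per-combination trial
        cost = rd_cost(distortion, bits, lam)
        if best is None or cost < best[0]:
            best = (cost, mode, pair)
    return best
```

With 67 intra prediction modes and five transform pairs, `evaluate` is called 67*5 = 335 times, which is the cost that the reduced tables described below are intended to lower.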
It should be noted that in Table 2, alternatively, DCT8 is replaced with DCT4, and DST7 is replaced with DST4, or DCT8 and DST7 are replaced with other transform kernels.
In addition, as shown in Table 3, for shortening a coding time, the transform pair (DCT8, DCT8) is deleted based on Table 2, that is, an RDO decision for (DCT8, DCT8) is not performed. In this way, the number of combinations is reduced from 67*5 to 67*4, and the number of RDO decisions is reduced. Thus, the coding time is shortened. In addition, since the transform pair (DCT8, DCT8) is not available, the number of bits of a binarized codeword corresponding to a transform pair index is also reduced from 4 to 3. Thus, the bit-rate overhead of coding is also reduced.
It should be noted that in Table 3, alternatively, DCT8 is replaced with DCT2, or DCT8 is replaced with DCT2 when a preset shape constraint condition is satisfied.
In addition, as shown in Table 4, for shortening the coding time, the transform pair (DST7, DCT8) is deleted based on Table 3, that is, an RDO decision for (DST7, DCT8) is not performed. In this way, the number of combinations is reduced from 67*4 to 67*3, and the number of RDO decisions is reduced. Thus, the coding time is shortened. In addition, since the transform pairs (DCT8, DCT8) and (DST7, DCT8) are not available, the number of bits of the binarized codeword corresponding to the transform pair index is also reduced from 3 to 2. Thus, the bit-rate overhead of coding is also reduced.
In addition, as shown in Table 5, for shortening the coding time, the transform pair (DCT8, DST7) is deleted based on Table 3, that is, an RDO decision for (DCT8, DST7) is not performed. In this way, the number of combinations is reduced from 67*4 to 67*3, and the number of RDO decisions is reduced. Thus, the coding time is shortened. In addition, since the transform pairs (DCT8, DCT8) and (DCT8, DST7) are not available, the number of bits of the binarized codeword corresponding to the transform pair index is also reduced from 3 to 2. Thus, the bit-rate overhead of coding is also reduced.
In addition, as shown in Table 6, for shortening the coding time, the transform pairs (DCT8, DCT8), (DCT8, DST7), and (DST7, DCT8) are deleted based on Table 2, that is, RDO decisions for (DCT8, DCT8), (DCT8, DST7), and (DST7, DCT8) are not performed. In this way, the number of combinations is reduced from 67*5 to 67*2, and the number of RDO decisions is reduced. Thus, the coding time is shortened. In addition, since the transform pairs (DCT8, DCT8), (DCT8, DST7), and (DST7, DCT8) are not available, the number of bits of the binarized codeword corresponding to the transform pair index is also reduced from 4 to 1. Thus, the bit-rate overhead of coding is also reduced.
It should be noted that in Tables 4 and 5, alternatively, DCT8 is replaced with DCT2, or DCT8 is replaced with DCT2 when a preset shape constraint condition is satisfied.
It should be noted that when Table 4 is used, although the number of RDO decisions is reduced, the coding performance deteriorates.
In addition, it should also be noted that for Table 2, a first bit of the binarized codeword corresponding to the transform pair index is configured to identify whether the current block uses a transform pair corresponding to the transform pair index 1 or a transform pair corresponding to any of the transform pair indexes 2 to 5; a second bit is coded to identify whether the current block uses a transform pair corresponding to the transform pair index 2 or a transform pair corresponding to any of the transform pair indexes 3 to 5; a third bit is coded to identify whether the current block uses a transform pair corresponding to the transform pair index 3 or a transform pair corresponding to any of the transform pair indexes 4 and 5; and a fourth bit is coded to identify whether the current block uses a transform pair corresponding to the transform pair index 4 or a transform pair corresponding to the transform pair index 5. That is, if the first bit is 0, then the transform pair (DCT2, DCT2) is used; if the first bit is 1 and the second bit is 0, then the transform pair (DST7, DST7) is used; if the first bit is 1, the second bit is 1, and the third bit is 0, then the transform pair (DCT8, DST7) is used; if the first bit is 1, the second bit is 1, the third bit is 1, and the fourth bit is 0, then the transform pair (DST7, DCT8) is used; and if the first bit is 1, the second bit is 1, the third bit is 1, and the fourth bit is 1, then the transform pair (DCT8, DCT8) is used.
In addition, it should also be noted that for Table 3, the first bit of the binarized codeword corresponding to the transform pair index is configured to identify whether the current block uses a transform pair corresponding to the transform pair index 1 or a transform pair corresponding to any of the transform pair indexes 2 to 4; the second bit is coded to identify whether the current block uses a transform pair corresponding to the transform pair index 2 or a transform pair corresponding to any of the transform pair indexes 3 and 4; and the third bit is coded to identify whether the current block uses a transform pair corresponding to the transform pair index 3 or a transform pair corresponding to the transform pair index 4. That is, if the first bit is 0, then the transform pair (DCT2, DCT2) is used; if the first bit is 1 and the second bit is 0, then the transform pair (DST7, DST7) is used; if the first bit is 1, the second bit is 1, and the third bit is 0, then the transform pair (DCT8, DST7) is used; and if the first bit is 1, the second bit is 1, and the third bit is 1, then the transform pair (DST7, DCT8) is used.
In addition, it should also be noted that for Table 4, the first bit of the binarized codeword corresponding to the transform pair index is configured to identify whether the current block uses a transform pair corresponding to the transform pair index 1 or a transform pair corresponding to any of the transform pair indexes 2 and 3; the second bit is coded to identify whether the current block uses a transform pair corresponding to the transform pair index 2 or a transform pair corresponding to the transform pair index 3. That is, if the first bit is 0, then the transform pair (DCT2, DCT2) is used; if the first bit is 1 and the second bit is 0, then the transform pair (DST7, DST7) is used; and if the first bit is 1 and the second bit is 1, then the transform pair (DCT8, DST7) is used.
In addition, it should also be noted that for Table 5, the first bit of the binarized codeword corresponding to the transform pair index is configured to identify whether the current block uses a transform pair corresponding to the transform pair index 1 or a transform pair corresponding to any of the transform pair indexes 2 and 3; and the second bit is coded to identify whether the current block uses a transform pair corresponding to the transform pair index 2 or a transform pair corresponding to the transform pair index 3. That is, if the first bit is 0, then the transform pair (DCT2, DCT2) is used; if the first bit is 1 and the second bit is 0, then the transform pair (DST7, DST7) is used; and if the first bit is 1 and the second bit is 1, then the transform pair (DST7, DCT8) is used.
In addition, it should also be noted that for Table 6, the first bit of the binarized codeword corresponding to the transform pair index is configured to identify whether the current block uses a transform pair corresponding to the transform pair index 1 or a transform pair corresponding to the transform pair index 2. That is, if the first bit is 0, then the transform pair (DCT2, DCT2) is used; and if the first bit is 1, then the transform pair (DST7, DST7) is used.
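The bit assignments described for Tables 2 to 6 all follow the same pattern: an entry at position k is coded as k ones followed by a terminating zero, and the terminating zero is omitted for the last entry of the table. A minimal sketch of that pattern follows. The table contents are taken from the descriptions above, while the names `TABLES`, `truncated_unary` and `codebook` are hypothetical.

```python
# Transform pairs of Tables 2 to 6 in index order, per the descriptions above.
TABLES = {
    2: [("DCT2", "DCT2"), ("DST7", "DST7"), ("DCT8", "DST7"), ("DST7", "DCT8"), ("DCT8", "DCT8")],
    3: [("DCT2", "DCT2"), ("DST7", "DST7"), ("DCT8", "DST7"), ("DST7", "DCT8")],
    4: [("DCT2", "DCT2"), ("DST7", "DST7"), ("DCT8", "DST7")],
    5: [("DCT2", "DCT2"), ("DST7", "DST7"), ("DST7", "DCT8")],
    6: [("DCT2", "DCT2"), ("DST7", "DST7")],
}


def truncated_unary(position: int, table_size: int) -> str:
    """Codeword for the entry at a 0-based position in a table of the given size."""
    if not 0 <= position < table_size:
        raise ValueError("position outside the table")
    ones = "1" * position
    # The last entry needs no terminating 0 because nothing can follow it.
    return ones if position == table_size - 1 else ones + "0"


def codebook(table):
    """Map each transform pair of a table to its binarized codeword."""
    return {pair: truncated_unary(i, len(table)) for i, pair in enumerate(table)}
```

Taking the longest codeword of each codebook gives the maximum number of bits per table, which shrinks as transform pairs are removed from Table 2.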
In step 802, the coding device determines the transform pair corresponding to the current block and the transform pair index corresponding to the current block.
The transform pair includes a vertical transform kernel and a horizontal transform kernel.
In practice, the coding device acquires the finally-selected transform pair and then selects the transform pair index corresponding to this transform pair based on a corresponding relationship between the transform pair and the transform pair index, which is described in any of Tables 2-6.
It should also be noted that during coding, generally, which one of Tables 2-6 is used has been determined, and thus only one of Tables 2-6 is acquired.
In an example embodiment of the present disclosure, before step 802, the following determination may also be performed.
The coding device determines that the height and width of the current block are less than or equal to a preset threshold and the current block is a luma block.
The preset threshold is predefined and stored to the coding device and is generally N (N can be 32).
In practice, after step 801 is performed, the coding device determines the number of pixels of the current block in a height direction, i.e., the height of the current block, and determines the number of pixels of the current block in a width direction, i.e., the width of the current block. In addition, the coding device determines whether the current block is the luma block. If the current block is the luma block and the height and width of the current block are less than or equal to the preset threshold, step 802 is performed.
In an example embodiment of the present disclosure, the transform pair index is also determined based on the intra prediction mode or shape information of the current block, as shown below.
Fashion I: The coding device determines the transform pair corresponding to the current block, and determines, based on the intra prediction mode of the current block and the transform pair, the transform pair index corresponding to the current block.
In practice, the coding device determines the transform pair and the intra prediction mode of the current block in the fashion mentioned above, then acquires a preset syntax table of an explicit multi-kernel transform pair, and determines, based on the intra prediction mode and the transform pair in this syntax table, the transform pair index corresponding to the current block.
In an example embodiment of the present disclosure, the transform pair index is determined based on the intra prediction mode in the following fashion and the corresponding processing is as follows.
It is determined that the transform pair index corresponding to the current block is a first index if the transform pair is a first transform pair and a mode number of the intra prediction mode of the current block is less than or equal to a preset value. It is determined that the transform pair index corresponding to the current block is the first index if the transform pair is a second transform pair and the mode number of the intra prediction mode of the current block is greater than the preset value. It is determined that the transform pair index corresponding to the current block is a second index if the transform pair is the second transform pair and the mode number of the intra prediction mode of the current block is less than or equal to the preset value. It is determined that the transform pair index corresponding to the current block is the second index if the transform pair is the first transform pair and the mode number of the intra prediction mode of the current block is greater than the preset value.
In practice, the preset syntax table of the explicit multi-kernel transform pair is as shown in Table 7.
In Table 7, if the first transform pair is (DST7, DCT8) and the mode number of the intra prediction mode is less than or equal to the preset value (the preset value is 34), it is determined that the transform pair index is 3 and thus the corresponding binarized codeword has three bits, which are 1, 1, and 0 in sequence. If the second transform pair is (DCT8, DST7) and the mode number of the intra prediction mode is greater than 34, it is determined that the corresponding transform pair index is 3 and thus the corresponding binarized codeword has three bits, which are 1, 1, and 0 in sequence. If the second transform pair is (DCT8, DST7) and the mode number of the intra prediction mode is less than or equal to 34, it is determined that the corresponding transform pair index is 4 and thus the corresponding binarized codeword has four bits, which are 1, 1, 1, and 0 in sequence. If the first transform pair is (DST7, DCT8) and the mode number of the intra prediction mode is greater than 34, it is determined that the transform pair index is 4 and thus the corresponding binarized codeword has four bits, which are 1, 1, 1, and 0 in sequence.
It should also be noted that, because the transform pair index is determined directly from the transform pair for some transform pairs, fashion I is used only when the determined transform pair is not any one of (DCT2, DCT2), (DST7, DST7), and (DCT8, DCT8).
It should also be noted that in Table 7, where the transform pair index is 3, the Mode=0-34?DST7: DCT8 means that if it is true that the mode number of the intra prediction mode is 0 to 34, the horizontal transform kernel is DST7, and otherwise, the horizontal transform kernel is DCT8, and the Mode=0-34? DCT8: DST7 means that if it is true that the mode number of the intra prediction mode is 0 to 34, the vertical transform kernel is DCT8, and otherwise, the vertical transform kernel is DST7. Where the transform pair index is 4, the Mode=0-34? DCT8: DST7 means that if it is true that the mode number of the intra prediction mode is 0 to 34, the horizontal transform kernel is DCT8, and otherwise, the horizontal transform kernel is DST7, and the Mode=0-34? DST7: DCT8 means that if it is true that the mode number of the intra prediction mode is 0 to 34, the vertical transform kernel is DST7, and otherwise, the vertical transform kernel is DCT8.
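By way of a non-limiting illustration, the index determination of fashion I may be sketched as follows, using the example values given for Table 7 (preset value 34, first index 3, second index 4). The function name `index_from_mode` is an assumption introduced for this sketch.

```python
PRESET_MODE_VALUE = 34           # example preset value given for Table 7
FIRST_PAIR = ("DST7", "DCT8")    # first transform pair in the example
SECOND_PAIR = ("DCT8", "DST7")   # second transform pair in the example
FIRST_INDEX, SECOND_INDEX = 3, 4


def index_from_mode(pair, mode_number):
    """Hypothetical helper: transform pair index under fashion I."""
    if pair not in (FIRST_PAIR, SECOND_PAIR):
        # Fashion I does not apply to (DCT2, DCT2), (DST7, DST7), (DCT8, DCT8).
        raise ValueError("fashion I does not apply to this transform pair")
    low_mode = mode_number <= PRESET_MODE_VALUE
    # First pair with a low mode, or second pair with a high mode: first index.
    if (pair == FIRST_PAIR) == low_mode:
        return FIRST_INDEX
    return SECOND_INDEX
```

The same transform pair thus maps to the shorter codeword (index 3) or the longer codeword (index 4) depending on the mode number.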
Fashion II: The coding device determines the transform pair corresponding to the current block, and determines, based on the shape information of the current block and the transform pair, the transform pair index corresponding to the current block.
In practice, the coding device determines the transform pair and the shape information of the current block in the fashion mentioned above, then acquires the preset syntax table of the explicit multi-kernel transform pair, and determines, based on the transform pair and the shape information of the current block in this table, the transform pair index corresponding to the current block.
In an example embodiment of the present disclosure, the processing of determining the transform pair index with reference to the shape information of the current block may be as follows.
It is determined that the transform pair index corresponding to the current block is a first index if the transform pair is a first transform pair and the shape information of the current block satisfies a preset shape constraint condition. It is determined that the transform pair index corresponding to the current block is the first index if the transform pair is a second transform pair and the shape information of the current block does not satisfy the preset shape constraint condition. It is determined that the transform pair index corresponding to the current block is a second index if the transform pair is the second transform pair and the shape information of the current block satisfies the preset shape constraint condition. It is determined that the transform pair index corresponding to the current block is the second index if the transform pair is the first transform pair and the shape information of the current block does not satisfy the preset shape constraint condition.
The preset shape constraint condition may be preset and stored to the coding device. The preset shape constraint condition is that the width is greater than or equal to the height.
In practice, the preset syntax table of the explicit multi-kernel transform pair is as shown in Table 8.
In Table 8, if the first transform pair is (DST7, DCT8) and the shape information of the current block is that the width is greater than or equal to the height, it is determined that the transform pair index is 3 and thus the corresponding binarized codeword has three bits, which are 1, 1, and 0 in sequence. If the second transform pair is (DCT8, DST7) and the shape information of the current block is that the width is less than the height, it is determined that the corresponding transform pair index is 3 and thus the corresponding binarized codeword has three bits, which are 1, 1, and 0 in sequence. If the second transform pair is (DCT8, DST7) and the shape information of the current block is that the width is greater than or equal to the height, it is determined that the corresponding transform pair index is 4 and thus the corresponding binarized codeword has four bits, which are 1, 1, 1, and 0 in sequence. If the first transform pair is (DST7, DCT8) and the shape information of the current block is that the width is less than the height, it is determined that the transform pair index is 4 and thus the corresponding binarized codeword has four bits, which are 1, 1, 1, and 0 in sequence.
Based on Table 8, in the fashion II, where the shape information of the current block indicates that the width is greater than or equal to the height and the first transform pair is (DST7, DCT8), the first index is 3; where the shape information of the current block indicates that the width is less than the height and the second transform pair is (DCT8, DST7), the first index is 3; where the shape information of the current block indicates that the width is less than the height and the first transform pair is (DST7, DCT8), the second index is 4; and where the shape information of the current block indicates the width is greater than or equal to the height and the second transform pair is (DCT8, DST7), the second index is 4.
It should be noted that, because the transform pair index may be determined directly from the transform pair for some transform pairs, fashion II is used only when the determined transform pair is not any one of (DCT2, DCT2), (DST7, DST7), and (DCT8, DCT8).
It should also be noted that in Table 8, where the transform pair index is 3, the W≥H? DST7: DCT8 means that if it is true that the width of the current block is greater than or equal to its height, the horizontal transform kernel is DST7, and otherwise, the horizontal transform kernel is DCT8; and the W≥H? DCT8: DST7 means that if it is true that the width of the current block is greater than or equal to its height, the vertical transform kernel is DCT8, and otherwise, the vertical transform kernel is DST7. Where the transform pair index is 4, the W≥H? DCT8: DST7 means that if it is true that the width of the current block is greater than or equal to its height, the horizontal transform kernel is DCT8, and otherwise, the horizontal transform kernel is DST7; and the W≥H? DST7: DCT8 means that if it is true that the width of the current block is greater than or equal to the height, the vertical transform kernel is DST7, and otherwise, the vertical transform kernel is DCT8.
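By way of a non-limiting illustration, the index determination of fashion II may be sketched as follows, using the example values given for Table 8 and the preset shape constraint condition that the width is greater than or equal to the height. The function name `index_from_shape` is an assumption introduced for this sketch.

```python
FIRST_PAIR = ("DST7", "DCT8")    # first transform pair in the example
SECOND_PAIR = ("DCT8", "DST7")   # second transform pair in the example
FIRST_INDEX, SECOND_INDEX = 3, 4


def index_from_shape(pair, width, height):
    """Hypothetical helper: transform pair index under fashion II."""
    if pair not in (FIRST_PAIR, SECOND_PAIR):
        raise ValueError("fashion II does not apply to this transform pair")
    satisfies_constraint = width >= height  # preset shape constraint condition
    if (pair == FIRST_PAIR) == satisfies_constraint:
        return FIRST_INDEX
    return SECOND_INDEX
```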
It should also be noted that in Table 7, Mode represents the mode number. In Table 8, W represents the width and H represents the height.
In this way, due to adaptive adjustment of the priority of the transform pair based on the shape information and the intra prediction mode of the current block, the higher the probability of the transform pair is, the shorter the binarized codeword corresponding to the transform pair index that needs to be coded is.
In addition, in response to acquiring the current block, the coding device determines whether the height and width of the current block both are less than or equal to N (N may be 32), and determines whether the current block is the luma block. If the height and width of the current block both are less than or equal to N and the current block is the luma block, the coding device continues to perform step 802. If at least one of the conditions that the height and width of the current block both are less than or equal to N and the current block is the luma block is not satisfied, the coding device directly acquires a preset transform pair, i.e., (DCT2, DCT2).
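By way of a non-limiting illustration, the size and luma check and the fallback to the preset transform pair may be sketched as follows. The callback `select_pair` stands in for whatever selection the coding device performs in step 802; it and the function names are assumptions introduced for this sketch.

```python
PRESET_THRESHOLD = 32             # N; the text gives 32 as an example
DEFAULT_PAIR = ("DCT2", "DCT2")   # preset transform pair


def is_eligible(width, height, is_luma, threshold=PRESET_THRESHOLD):
    """True if both dimensions are at most N and the block is a luma block."""
    return is_luma and width <= threshold and height <= threshold


def choose_pair(width, height, is_luma, select_pair,
                threshold=PRESET_THRESHOLD):
    """Run the (hypothetical) selection only for eligible blocks."""
    if is_eligible(width, height, is_luma, threshold):
        return select_pair()
    return DEFAULT_PAIR
```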
In step 803, the coding device acquires coded data corresponding to the current block by coding the residual signal of the current block based on the transform pair.
In practice, in response to acquiring the transform pair corresponding to the current block, the coding device transforms the residual signal of the current block based on the transform pair to acquire a transform coefficient, then quantizes the transform coefficient to acquire a quantization coefficient, and performs entropy coding on the quantization coefficient to acquire the coded data corresponding to the current block.
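By way of a non-limiting illustration, the transform and quantization of the residual signal may be sketched as follows. The sketch makes several assumptions that are not stated in the text: it uses the commonly cited floating-point, orthonormal definitions of the DCT2, DST7 and DCT8 basis functions, treats the transform pair as (horizontal kernel, vertical kernel), and uses plain uniform rounding with a hypothetical `step` parameter in place of the integer arithmetic of a practical coder. Entropy coding is omitted.

```python
import math


def dct2_matrix(n):
    """Orthonormal DCT-II basis; row i is the i-th basis function."""
    return [[(math.sqrt(0.5) if i == 0 else 1.0) * math.sqrt(2.0 / n)
             * math.cos(math.pi * i * (2 * j + 1) / (2 * n))
             for j in range(n)] for i in range(n)]


def dst7_matrix(n):
    """Orthonormal DST-VII basis."""
    scale = math.sqrt(4.0 / (2 * n + 1))
    return [[scale * math.sin(math.pi * (2 * i + 1) * (j + 1) / (2 * n + 1))
             for j in range(n)] for i in range(n)]


def dct8_matrix(n):
    """Orthonormal DCT-VIII basis."""
    scale = math.sqrt(4.0 / (2 * n + 1))
    return [[scale * math.cos(math.pi * (2 * i + 1) * (2 * j + 1) / (4 * n + 2))
             for j in range(n)] for i in range(n)]


KERNELS = {"DCT2": dct2_matrix, "DST7": dst7_matrix, "DCT8": dct8_matrix}


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_transform(residual, pair):
    """Apply the vertical kernel to columns and the horizontal kernel to rows."""
    horizontal_name, vertical_name = pair  # assumed ordering of the pair
    height, width = len(residual), len(residual[0])
    vertical = KERNELS[vertical_name](height)
    horizontal = KERNELS[horizontal_name](width)
    return matmul(matmul(vertical, residual), transpose(horizontal))


def quantize(coefficients, step):
    """Uniform scalar quantization (a simplification for illustration)."""
    return [[round(c / step) for c in row] for row in coefficients]
```

Because each kernel in this sketch is orthonormal, the transform preserves the energy of the residual signal, which provides a simple sanity check on the basis definitions.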
In step 804, the coding device codes the transform pair index by subjecting a first bit of a binarized codeword corresponding to the transform pair index to context-based adaptive binary arithmetic coding based on one context model, and adds the coded transform pair index to the coded data of the current block.
In practice, the transform pair index is added to the coded data, such that a decoding device can identify the transform pair used by the coding device.
The coding device performs the context-based adaptive binary arithmetic coding based on the one context model on the first bit of the binarized codeword corresponding to the transform pair index. If the binarized codeword further includes other bits, these bits may be coded by context-based adaptive binary arithmetic coding (CABAC) or by bypass binary arithmetic coding. Then the coding device adds the coded transform pair index to the coded data of the current block.
Thus, the coding of the current block is completed. Each current block is processed according to the flowchart shown in
In an example embodiment of the present disclosure, the other bits are coded by the bypass binary arithmetic coding and the corresponding processing is as follows.
If the binarized codeword corresponding to the transform pair index includes a plurality of bits, the first bit is coded by the context-based adaptive binary arithmetic coding based on the one context model and at least one of the bits other than the first bit among the plurality of bits is coded by the bypass binary arithmetic coding, and then the coded transform pair index is added to the coded data of the current block.
In practice, when the binarized codeword corresponding to the transform pair index includes the plurality of bits, the adaptive binary arithmetic coding is performed on the first bit based on the one context model and at least one of the bits other than the first bit among the plurality of bits is coded by the bypass binary arithmetic coding, and then the coded transform pair index is added to the coded data of the current block.
For example, if the used transform pair is (DCT8, DCT8) and there are four bits correspondingly, which are 1, 1, 1, and 1 in sequence, the first bit is coded based on the one context model and the next three bits are coded by the bypass binary arithmetic coding. In this way, there is no need to store context models of the next few bits, such that the memory space can be saved and the coding and decoding complexity is lowered.
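By way of a non-limiting illustration, the binarization of the transform pair index and the assignment of a coding mode to each bit may be sketched as follows. The truncated-unary form is inferred from the codewords given in the examples (1, 1, 0 for index 3; 1, 1, 1, 0 for index 4; 1, 1, 1, 1 for (DCT8, DCT8)) and is an assumption, as are the function names. The sketch bypass-codes every bit after the first, which is one of the options the text permits; it does not perform arithmetic coding itself.

```python
def binarize(index, max_index):
    """Truncated unary: (index - 1) ones, then a 0 unless index is the last."""
    if not 1 <= index <= max_index:
        raise ValueError("index out of range")
    bits = [1] * (index - 1)
    if index < max_index:
        bits.append(0)
    return bits


def assign_coding_modes(bits):
    """Label the first bit as context-coded and the remaining bits as bypass."""
    return [(bit, "context" if position == 0 else "bypass")
            for position, bit in enumerate(bits)]
```

With this arrangement only one context model is kept for the whole codeword, regardless of its length.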
In addition, a target flag is added to the coded data such that the decoding device uses an explicit multi-kernel transform mode, wherein the target flag indicates that the explicit multi-kernel transform mode is enabled.
For the coding mode shown in
In step 901, a decoding device acquires coded data of a current block.
In practice, when there is coded data to be decoded, the decoding device acquires the coded data, then acquires the coded data of the current block by performing entropy decoding on the coded data and performing inverse quantization on an entropy decoding result.
In step 902, the decoding device acquires a transform pair index of the current block from the coded data, wherein a first bit of a binarized codeword corresponding to the transform pair index is decoded by context-based adaptive binary arithmetic coding based on one context model.
In practice, the decoding device acquires the transform pair index corresponding to the current block from the coded data of the current block. When the coding device codes the transform pair index of the current block, adaptive binary arithmetic coding based on the one context model is performed on the first bit of the binarized codeword corresponding to the transform pair index. Thus, when decoding the first bit, the decoding device also performs adaptive binary arithmetic decoding based on the one context model on the first bit.
In an example embodiment of the present disclosure, before step 902, the following determination may also be performed.
The decoding device determines that the height and width of the current block both are less than or equal to a preset threshold and the current block is a luma block.
The preset threshold is predefined and stored to the decoding device and is generally N (N can be 32).
In practice, after step 901 is performed, the decoding device determines the number of pixels of the current block in a height direction, i.e., the height of the current block, and the number of pixels of the current block in a width direction, i.e., the width of the current block. In addition, the decoding device determines whether the current block is the luma block. If the current block is the luma block and the height and width of the current block both are less than or equal to the preset threshold, step 902 is performed.
In addition, before step 902 is performed, whether a target flag is carried in the coded data is also determined, and the target flag indicates that explicit multi-kernel transform processing is performed. If the coded data includes the target flag, then the explicit multi-kernel transform processing is enabled, and then step 902 can be performed.
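By way of a non-limiting illustration, the checks performed before step 902 may be sketched as a single predicate. The function and parameter names are assumptions introduced for this sketch.

```python
def should_parse_transform_pair_index(width, height, is_luma,
                                      target_flag_present, threshold=32):
    """Hypothetical predicate combining the checks made before step 902.

    The index is parsed only if the target flag is carried in the coded
    data, the block is a luma block, and both dimensions are at most the
    preset threshold (32 in the example given in the text).
    """
    return (target_flag_present and is_luma
            and width <= threshold and height <= threshold)
```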
In step 903, the decoding device determines, based on the transform pair index, a transform pair corresponding to the current block, wherein the transform pair includes a horizontal transform kernel and a vertical transform kernel.
In practice, in response to determining the transform pair index of the current block, the decoding device determines, based on a corresponding relationship between the transform pair index and the transform pair, the transform pair corresponding to the current block.
In step 904, the decoding device acquires reconstruction information corresponding to the current block by decoding the current block based on the transform pair.
In practice, in response to determining the transform pair corresponding to the current block, the decoding device acquires a residual signal corresponding to the current block by performing reverse transform processing on inversely quantized data of the current block with the transform pair, and then acquires the reconstruction information corresponding to the current block by adding the residual signal and a prediction signal of the current block.
Thus, the decoding of the current block is completed. Each current block is processed according to the flowchart shown in
In step 903, the transform pair of the current block may be determined in a variety of fashions and the variety of feasible fashions is given as follows.
Fashion I: As shown in Table 2, where the transform pair index is 1, the used transform pair is (DCT2, DCT2); where the transform pair index is 2, the used transform pair is (DST7, DST7); where the transform pair index is 3, the used transform pair is (DCT8, DST7); where the transform pair index is 4, the used transform pair is (DST7, DCT8); and where the transform pair index is 5, the used transform pair is (DCT8, DCT8).
Fashion II: The transform pair of the current block is determined based on the corresponding relationship between the transform pair index and the transform pair list in Table 3 and the transform pair index of the current block.
Fashion III: The transform pair of the current block is determined based on the corresponding relationship between the transform pair index and the transform pair list in Table 4 and the transform pair index of the current block.
Fashion IV: The transform pair of the current block is determined based on the corresponding relationship between the transform pair index and the transform pair list in Table 5 and the transform pair index of the current block.
Fashion V: The transform pair of the current block is determined based on the corresponding relationship between the transform pair index and the transform pair list in Table 6 and the transform pair index of the current block.
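By way of a non-limiting illustration, the table-based determination in step 903 may be sketched as a lookup. Only the assignments of Table 2, as enumerated above, are reproduced; the other tables would be passed in the same way. The function name `pair_from_index` is an assumption introduced for this sketch.

```python
# Assignments of Table 2 as enumerated in the text.
TABLE_2 = {
    1: ("DCT2", "DCT2"),
    2: ("DST7", "DST7"),
    3: ("DCT8", "DST7"),
    4: ("DST7", "DCT8"),
    5: ("DCT8", "DCT8"),
}


def pair_from_index(index, table=TABLE_2):
    """Hypothetical helper: look up the transform pair for a parsed index."""
    try:
        return table[index]
    except KeyError:
        raise ValueError(f"transform pair index {index} is not in the table")
```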
In an example embodiment of the present disclosure, the transform pair is also determined based on intra-frame mode information of the current block or shape information of the current block, and the corresponding processing may be as follows.
The decoding device determines, based on an intra prediction mode and the transform pair index of the current block, the transform pair corresponding to the current block; or the decoding device determines, based on the transform pair index, and the width and height of the current block, the transform pair corresponding to the current block.
In practice, it is determined that the transform pair corresponding to the current block is a first transform pair if the transform pair index is a first index and a mode number of the intra prediction mode of the current block is less than or equal to a preset value; it is determined that the transform pair corresponding to the current block is a second transform pair if the transform pair index is the first index and the mode number of the intra prediction mode of the current block is greater than the preset value; it is determined that the transform pair corresponding to the current block is the second transform pair if the transform pair index is a second index and the mode number of the intra prediction mode of the current block is less than or equal to the preset value; and it is determined that the transform pair corresponding to the current block is the first transform pair if the transform pair index is the second index and the mode number of the intra prediction mode of the current block is greater than the preset value.
Alternatively, it is determined that the transform pair corresponding to the current block is a first transform pair if the transform pair index is a first index and the shape information of the current block satisfies a preset shape constraint condition; it is determined that the transform pair corresponding to the current block is a second transform pair if the transform pair index is the first index and the shape information of the current block does not satisfy the preset shape constraint condition; it is determined that the transform pair corresponding to the current block is the second transform pair if the transform pair index is a second index and the shape information of the current block satisfies the preset shape constraint condition; and it is determined that the transform pair corresponding to the current block is the first transform pair if the transform pair index is the second index and the shape information of the current block does not satisfy the preset shape constraint condition (this process corresponds to the process of step 803 and is not repeatedly described here).
The preset value is 34 in Table 7, the first index is 3, the first transform pair is (DST7, DCT8), the second index is 4, and the second transform pair is (DCT8, DST7).
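By way of a non-limiting illustration, the decoder-side determination may be sketched as follows, using the example values given above (preset value 34, first index 3, second index 4) and the shape constraint condition that the width is greater than or equal to the height. The function names are assumptions introduced for this sketch.

```python
PRESET_MODE_VALUE = 34
FIRST_INDEX, SECOND_INDEX = 3, 4
FIRST_PAIR = ("DST7", "DCT8")
SECOND_PAIR = ("DCT8", "DST7")


def pair_from_condition(index, condition_holds):
    """Resolve index 3 or 4 to a pair given a mode or shape condition."""
    if index == FIRST_INDEX:
        return FIRST_PAIR if condition_holds else SECOND_PAIR
    if index == SECOND_INDEX:
        return SECOND_PAIR if condition_holds else FIRST_PAIR
    raise ValueError("this index is resolved directly from the table")


def pair_from_mode(index, mode_number):
    """Variant based on the intra prediction mode."""
    return pair_from_condition(index, mode_number <= PRESET_MODE_VALUE)


def pair_from_shape(index, width, height):
    """Variant based on the shape information."""
    return pair_from_condition(index, width >= height)
```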
In an example embodiment of the present disclosure, for saving the memory space of the coding device, when the coding device codes the binarized codeword corresponding to the transform pair index, if the binarized codeword corresponding to the transform pair index includes a plurality of bits, at least one of the bits other than the first bit among the plurality of bits is coded by bypass binary arithmetic coding. In this way, if a bit is coded by the bypass binary arithmetic coding mode, there is no need to store a context model for this bit. Thus, the memory space can be saved. Likewise, when performing decoding, the decoding device performs decoding in the corresponding fashion.
In the embodiment of the present disclosure, when coding the current block, the coding device acquires the transform pair corresponding to the current block for coding, rather than directly acquiring a preset transform pair, such that the coding and decoding performance can be improved. When coding the binarized codeword corresponding to the transform pair index, the coding device codes the first bit of the binarized codeword corresponding to the transform pair index by the one context model rather than a plurality of context models, such that the memory space can be saved. In addition, because the plurality of context models are not needed, the context does not need to be updated, which lowers the coding and decoding complexity.
Considering saving the memory space when a binarized codeword corresponding to a transform pair index includes a plurality of bits, another embodiment of the present disclosure provides the following coding and decoding process, as shown in
In step 1001, a coding device acquires a residual signal of a current block.
In practice, when coding video data, the coding device firstly acquires the residual signal by performing intra prediction (a fashion in which the residual signal is acquired is identical with that in the existing video coding standard and is not repeatedly described here), and then takes the residual signal as a residual signal of the current block to be processed currently.
The fashion in which an intra prediction mode is selected is the same as the fashion in which the intra prediction mode is selected in step 801. For details about this fashion, reference may be made to step 801, which is not repeatedly described here.
In step 1002, the coding device determines a transform pair corresponding to the current block and a transform pair index corresponding to the current block.
The transform pair includes a vertical transform kernel and a horizontal transform kernel.
In practice, the coding device acquires the finally-selected transform pair and then selects the transform pair index corresponding to this transform pair based on any of Tables 2-6 (for details about this process, reference may be made to step 802).
It should be noted that during coding, generally, which one of Tables 2-6 is used has been determined, and thus only one of Tables 2-6 is acquired.
In an example embodiment of the present disclosure, the transform pair index may be determined in a variety of fashions, and two feasible implementation fashions are given as follows.
Fashion I. The coding device determines the transform pair corresponding to the current block, and determines, based on the intra prediction mode of the current block and the transform pair, the transform pair index corresponding to the current block.
This process is identical with fashion I shown in Table 7 in step 802 and is not repeatedly described here.
Fashion II. The coding device determines the transform pair corresponding to the current block, and determines, based on shape information of the current block and the transform pair, the transform pair index corresponding to the current block.
This process is identical with fashion II shown in Table 8 in step 802 and is not repeatedly described here.
In addition, in response to acquiring the current block, the coding device firstly determines whether the height and width of the current block both are less than or equal to N (N may be 32) and determines whether the current block is a luma block. If the height and width of the current block both are less than or equal to 32 and the current block is the luma block, the coding device continues to execute step 1002. If at least one of the conditions that the height and width of the current block both are less than or equal to 32 and the current block is the luma block is not satisfied, the coding device directly acquires a preset transform pair (DCT2, DCT2).
In step 1003, the coding device acquires coded data corresponding to the current block by coding the residual signal of the current block based on the transform pair.
In practice, in response to acquiring the transform pair corresponding to the current block, the coding device acquires a transform coefficient by transforming the residual signal of the current block based on the transform pair, then acquires a quantization coefficient by quantizing the transform coefficient, and acquires the coded data corresponding to the current block by performing entropy coding on the quantization coefficient.
In step 1004, if the binarized codeword corresponding to the transform pair index includes a plurality of bits, the coding device codes at least one of bits, except a first bit, among the plurality of bits by bypass binary arithmetic coding, and adds the coded transform pair index to the coded data of the current block.
In practice, the transform pair index is added to the coded data, such that a decoding device can identify the transform pair used by the coding device.
If the binarized codeword corresponding to the transform pair index includes the plurality of bits, the coding device codes the first bit of the binarized codeword corresponding to the transform pair index by CABAC. If the binarized codeword further includes other bits, at least one of the other bits is coded by the bypass binary arithmetic coding, and then the coded transform pair index is added to the coded data. In this way, since a context model does not need to be stored for the bypass binary arithmetic coding mode, when this bit is coded by the bypass binary arithmetic coding, the context model does not need to be stored for this bit.
Thus, the coding of the current block is completed. Each current block is processed according to the flowchart shown in
In an example embodiment of the present disclosure, the first bit is coded by context-based adaptive binary arithmetic coding based on one context model.
In practice, when the first bit is coded by the CABAC, adaptive binary arithmetic coding is performed based on one context model rather than a plurality of context models, such that there is no need to store the plurality of context models. Thus, the memory space can also be saved.
In addition, a target flag is added to the coded data such that the decoding device uses an explicit multi-kernel transform mode, wherein the target flag indicates that the explicit multi-kernel transform mode is enabled.
Based on the coding mode shown in
In step 1101, a decoding device acquires coded data of a current block.
In practice, where coded data needs to be decoded, the decoding device acquires the coded data, then acquires an entropy decoding result by performing entropy decoding on the coded data, and acquires the coded data of the current block by performing inverse quantization on the entropy decoding result.
In step 1102, the decoding device acquires a transform pair index from the coded data, wherein at least one of bits, except a first bit, among a plurality of bits is decoded by bypass binary arithmetic coding if a binarized codeword corresponding to the transform pair index includes the plurality of bits.
In practice, the decoding device acquires the transform pair index corresponding to the current block from the coded data. When coding the transform pair index of the current block, the coding device codes at least one of bits, except the first bit, among the plurality of bits by the bypass binary arithmetic coding if the binarized codeword corresponding to the transform pair index includes the plurality of bits. In this way, when decoding bits coded by the bypass binary arithmetic coding, the decoding device also decodes the bits by bypass binary arithmetic coding. Thus, the memory space of the decoding device can also be saved.
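By way of a non-limiting illustration, the parsing of the transform pair index in step 1102 may be sketched as follows. The two callables `decode_context_bin` and `decode_bypass_bin` stand in for the arithmetic decoding engine, which is not modeled here; they, the function name, and the truncated-unary codeword form are assumptions introduced for this sketch. The first bit is read with the context-coded path, as in the example embodiment, and every later bit is read with the bypass path.

```python
def parse_transform_pair_index(decode_context_bin, decode_bypass_bin,
                               max_index):
    """Read a truncated-unary codeword and return the transform pair index.

    decode_context_bin / decode_bypass_bin are hypothetical callables that
    each return the next decoded bit (0 or 1).
    """
    index = 1
    position = 0
    while index < max_index:
        if position == 0:
            bit = decode_context_bin()   # first bit: one context model
        else:
            bit = decode_bypass_bin()    # later bits: no context model
        position += 1
        if bit == 0:
            break
        index += 1
    return index
```

Since the bypass path keeps no adaptive state, the decoding device in this sketch stores a single context model for the whole codeword.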
In an example embodiment of the present disclosure, for saving the memory space, the first bit of the plurality of bits is coded by one context model.
In an example embodiment of the present disclosure, before step 1102 is performed, it is determined that the height and width of the current block both are less than or equal to a target value and the current block is a luma block.
In practice, the target value is preset, stored in the decoding device, and generally denoted N (N may be 32). Step 1102 is performed only when the height and width of the current block are both less than or equal to N and the current block is a luma block. Otherwise, it is determined that the transform pair corresponding to the current block is (DCT2, DCT2), and this transform pair is subsequently used directly for decoding.
In addition, before step 1102 is performed, whether a target flag is carried in the coded data may also be determined, and the target flag indicates that explicit multi-kernel transform processing is performed. If the target flag is carried in the coded data, then the explicit multi-kernel transform processing is enabled, and then step 1102 is performed.
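The two checks performed before step 1102 can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the function name and parameters are hypothetical, N is taken as 32 as suggested above, and the fallback to (DCT2, DCT2) when the target flag is absent is an assumption, since the text specifies that fallback only for the size and luma condition.

```python
from typing import Tuple

MAX_SIDE = 32                      # target value N; the text says N may be 32
DEFAULT_PAIR = ("DCT2", "DCT2")    # (horizontal kernel, vertical kernel)


def should_parse_transform_pair_index(width: int, height: int,
                                      is_luma: bool, target_flag: bool) -> bool:
    """Return True when step 1102 (parsing the transform pair index) is performed."""
    fits = width <= MAX_SIDE and height <= MAX_SIDE
    return target_flag and is_luma and fits


def fallback_pair() -> Tuple[str, str]:
    """Pair used when the index is not parsed.

    Assumption: the same default is also used when the target flag is absent.
    """
    return DEFAULT_PAIR
```

For instance, a 16x16 luma block with the target flag present proceeds to step 1102, whereas a 64x16 luma block or any chroma block is decoded directly with (DCT2, DCT2).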
In step 1103, the decoding device determines a transform pair corresponding to the current block based on the transform pair index, wherein the transform pair includes a horizontal transform kernel and a vertical transform kernel.
This process is identical with the processing process in step 903. For details about this process, reference may be made to step 903, which is not repeatedly described here.
In an example embodiment of the present disclosure, the transform pair is also determined based on the height and width of the current block or an intra prediction mode of the current block, and the corresponding processing may be as follows.
The transform pair corresponding to the current block is determined based on the intra prediction mode of the current block and the transform pair index; or the transform pair corresponding to the current block is determined based on shape information of the current block and the transform pair index. The transform pair includes the horizontal transform kernel and the vertical transform kernel.
In practice, this process is the same as the fashion in which the transform pair is determined based on the height and width of the current block or the intra prediction mode of the current block in step 903. For details about this process, reference may be made to step 903, which is not repeatedly described here.
In step 1104, the decoding device acquires reconstruction information corresponding to the current block by decoding the current block based on the transform pair.
In practice, in response to determining the transform pair corresponding to the current block, the decoding device acquires a residual signal corresponding to the current block by performing inverse transform processing on a quantization coefficient corresponding to the current block with the transform pair. Afterwards, the decoding device constructs a prediction signal in the used intra prediction mode with pixel values of pixels in a region which has been reconstructed around the current block, and then acquires the reconstruction information corresponding to the current block by adding the residual signal and the prediction signal.
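The inverse transform and reconstruction in step 1104 can be illustrated with a separable floating-point transform. The sketch below is an assumption-laden illustration, not the disclosed implementation: practical codecs use integer kernel approximations, dequantization is omitted, only a DCT2 kernel is built, and all function names are hypothetical. The vertical kernel acts on the columns of the block and the horizontal kernel acts on its rows.

```python
import math
from typing import List

Matrix = List[List[float]]


def dct2_matrix(n: int) -> Matrix:
    """Orthonormal DCT-II basis; row k holds the k-th basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def forward_transform(residual: Matrix, vertical: Matrix, horizontal: Matrix) -> Matrix:
    # coefficients = V * residual * H^T
    return matmul(matmul(vertical, residual), transpose(horizontal))


def inverse_transform(coeffs: Matrix, vertical: Matrix, horizontal: Matrix) -> Matrix:
    # residual = V^T * coefficients * H, valid because both kernels are orthonormal
    return matmul(matmul(transpose(vertical), coeffs), horizontal)


def reconstruct(residual: Matrix, prediction: Matrix) -> Matrix:
    """Reconstruction information = residual signal + prediction signal."""
    return [[r + p for r, p in zip(res_row, pred_row)]
            for res_row, pred_row in zip(residual, prediction)]
```

Any other orthonormal kernel matrix of matching size could be passed in place of the DCT2 matrix for either direction, which is how a transform pair with different horizontal and vertical kernels would be applied under these assumptions.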
Thus, the decoding of the current block is completed. Each current block is processed according to the flowchart shown in
In the embodiment of the present disclosure, when coding the current block, the coding device acquires the transform pair corresponding to the current block for coding, rather than directly acquiring a preset transform pair, such that the coding and decoding performance can be improved. When coding the transform pair index, the coding device codes the at least one of bits, except the first bit, in the binarized codeword corresponding to the transform pair index by the bypass binary arithmetic coding, and for the at least one bit, there is no need to store the context model. Thus, the memory space can be saved. In addition, for the at least one bit, the bypass binary arithmetic coding rather than CABAC is adopted, such that there is no need to update the context model. Thus, the coding and parsing complexity can also be lowered.
In another embodiment of the present disclosure, as shown in
In step 1201, a coding device acquires a residual signal of a current block.
In practice, when coding video data, the coding device firstly acquires the residual signal by performing intra prediction (a fashion in which the residual signal is acquired is identical with that in the existing video coding standard and is not repeatedly described here), and then takes the residual signal as a residual signal of the current block to be processed currently.
It should be noted that the TU in the embodiment of the present disclosure is the same as the CU mentioned above.
The fashion in which the intra prediction mode is selected is the same as the fashion in which the intra prediction mode is selected in step 801. For details about this fashion, reference may be made to step 801, which is not repeatedly described here.
In step 1202, the coding device acquires the intra prediction mode of the current block and a transform pair corresponding to the current block, or acquires the shape information of the current block and the transform pair corresponding to the current block.
The transform pair includes a vertical transform kernel and a horizontal transform kernel.
In practice, the coding device acquires the intra prediction mode (that is, the intra prediction mode corresponding to the current block) finally used in the intra prediction in step 1201, and the transform pair (that is, the transform pair corresponding to the current block) corresponding to the intra prediction mode when the rate-distortion cost is lowest.
Alternatively, the coding device determines the height and width of the current block (that is, the number of pixels of the current block in the height direction, and the number of pixels of the current block in the width direction). In this way, the coding device acquires the shape information of the current block, and acquires the transform pair used when the rate-distortion cost is lowest (that is, the transform pair corresponding to the current block).
In addition, in response to acquiring the current block, the coding device determines whether the height and width of the current block are both less than or equal to N (N may be 32), and determines whether the current block is a luma block. If the height and width of the current block are both less than or equal to N and the current block is a luma block, the coding device continues to perform step 1202. If at least one of these two conditions is not satisfied, the coding device directly acquires a preset transform pair, i.e., (DCT2, DCT2).
In step 1203, the coding device determines, based on the intra prediction mode of the current block and the transform pair corresponding to the current block, a transform pair index corresponding to the current block; alternatively, the coding device determines, based on the shape information of the current block and the transform pair corresponding to the current block, the transform pair index corresponding to the current block.
In practice, where the transform pair determined in step 1202 is not any one of (DCT2, DCT2), (DST7, DST7), and (DCT8, DCT8), a preset syntax table of an explicit multi-kernel transform pair (as shown in Table 7) is acquired, and the intra prediction mode of the current block and the transform pair index corresponding to the transform pair, i.e., the transform pair index corresponding to the current block, are determined from Table 7. For example, where the transform pair is (DST7, DCT8) and a mode number of the intra prediction mode of the current block is 32, the corresponding transform pair index is 3.
Alternatively, where the transform pair determined in step 1202 is not any one of (DCT2, DCT2), (DST7, DST7), and (DCT8, DCT8), the preset syntax table of the explicit multi-kernel transform pair (as shown in Table 8) is acquired, and the shape information of the current block and the transform pair index corresponding to the transform pair, i.e., the transform pair index corresponding to the current block, are determined from Table 8. For example, if the transform pair is (DST7, DCT8) and the width of the current block is greater than its height, the corresponding transform pair index is 3.
In an example embodiment of the present disclosure, the coding device determines the transform pair index based on the intra prediction mode and the transform pair in the following fashion.
It is determined that the transform pair index corresponding to the current block is a first index if the transform pair is a first transform pair and a mode number of the intra prediction mode of the current block is less than or equal to a preset value. It is determined that the transform pair index corresponding to the current block is the first index if the transform pair is a second transform pair and the mode number of the intra prediction mode of the current block is greater than the preset value. It is determined that the transform pair index corresponding to the current block is a second index if the transform pair is the second transform pair and the mode number of the intra prediction mode of the current block is less than or equal to the preset value. It is determined that the transform pair index corresponding to the current block is the second index if the transform pair is the first transform pair and the mode number of the intra prediction mode of the current block is greater than the preset value.
In practice, this process is identical with the fashion I shown in Table 7 in step 802 and is not repeatedly described here.
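The four mode-based cases above reduce to a single comparison, which can be sketched as follows. The concrete values (first index 3, second index 4, preset value 34, first transform pair (DST7, DCT8), second transform pair (DCT8, DST7)) are the ones given for the decoding side of this embodiment; the function and constant names are hypothetical, and the three fixed pairs are deliberately left outside the sketch.

```python
from typing import Tuple

FIRST_PAIR = ("DST7", "DCT8")
SECOND_PAIR = ("DCT8", "DST7")
FIRST_INDEX, SECOND_INDEX = 3, 4
PRESET_MODE = 34  # preset value compared against the intra mode number


def index_from_mode(pair: Tuple[str, str], mode_number: int) -> int:
    """Map a mode-dependent transform pair to its transform pair index."""
    if pair not in (FIRST_PAIR, SECOND_PAIR):
        raise ValueError("pair has a fixed index that this sketch does not cover")
    low_mode = mode_number <= PRESET_MODE
    # First pair with a low mode, or second pair with a high mode -> first index.
    if (pair == FIRST_PAIR) == low_mode:
        return FIRST_INDEX
    return SECOND_INDEX
```

With these values, the pair (DST7, DCT8) and mode number 32 yield index 3, which agrees with the example given for step 1203.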
In an example embodiment of the present disclosure, the coding device determines the transform pair index based on the shape information of the current block and the transform pair in the following fashion.
It is determined that the transform pair index corresponding to the current block is a first index if the transform pair is a first transform pair and the shape information of the current block satisfies a preset shape constraint condition. It is determined that the transform pair index corresponding to the current block is the first index if the transform pair is a second transform pair and the shape information of the current block does not satisfy the preset shape constraint condition. It is determined that the transform pair index corresponding to the current block is a second index if the transform pair is the second transform pair and the shape information of the current block satisfies the preset shape constraint condition. It is determined that the transform pair index corresponding to the current block is the second index if the transform pair is the first transform pair and the shape information of the current block does not satisfy the preset shape constraint condition.
In practice, this process is identical with the fashion II shown in Table 8 in step 802 and is not repeatedly described here.
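The shape-based variant can be sketched in the same way. The sketch assumes that the preset shape constraint condition is W≥H, that the first and second indices are 3 and 4, and that the first and second transform pairs are (DST7, DCT8) and (DCT8, DST7), which are the values stated for the decoding side of this embodiment; the function name is hypothetical.

```python
from typing import Tuple

FIRST_PAIR = ("DST7", "DCT8")
SECOND_PAIR = ("DCT8", "DST7")
FIRST_INDEX, SECOND_INDEX = 3, 4


def index_from_shape(pair: Tuple[str, str], width: int, height: int) -> int:
    """Map a shape-dependent transform pair to its transform pair index."""
    if pair not in (FIRST_PAIR, SECOND_PAIR):
        raise ValueError("pair has a fixed index that this sketch does not cover")
    satisfies_constraint = width >= height  # assumed constraint: W >= H
    if (pair == FIRST_PAIR) == satisfies_constraint:
        return FIRST_INDEX
    return SECOND_INDEX
```

For a block that is wider than it is tall, the pair (DST7, DCT8) maps to index 3, matching the example given for step 1203.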
In step 1204, the coding device acquires coded data corresponding to the current block by coding the residual signal of the current block based on the transform pair.
In practice, in response to acquiring the transform pair corresponding to the current block, the coding device acquires a transform coefficient by transforming the residual signal of the current block based on the transform pair, then acquires a quantization coefficient by quantizing the transform coefficient, and acquires the coded data corresponding to the current block by performing entropy coding on the quantization coefficient.
In step 1205, the coding device codes the transform pair index and adds the coded transform pair index to the coded data of the current block.
In practice, a binarized codeword of the transform pair index is added to the coded data, such that a decoding device can identify the transform pair used by the coding device.
The coding device codes the transform pair index and then adds the coded transform pair index to the coded data.
Thus, the coding of the current block is completed. Each current block is processed according to the flow shown in
In an example embodiment of the present disclosure, for saving the memory space, the transform pair index is coded in the following fashion, and the corresponding processing of step 1205 is as follows.
If the binarized codeword corresponding to the transform pair index includes a plurality of bits, a first bit is coded by CABAC and at least one of bits, except the first bit, among the plurality of bits is coded by bypass binary arithmetic coding.
In practice, if the binarized codeword corresponding to the transform pair index includes only one bit, this bit is coded by the CABAC directly. If the binarized codeword corresponding to the transform pair index includes the plurality of bits, the first bit is coded by the CABAC and at least one of bits, except the first bit, is coded by the bypass binary arithmetic coding. Then the coding device adds the coded transform pair index to the coded data of the current block. In this way, since some bits are coded by the bypass binary arithmetic coding when the binarized codeword corresponding to the transform pair index includes the plurality of bits, there is no need to store a context model for these bits, and the memory space can be saved.
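The assignment of coding engines to the bits of the binarized codeword can be sketched as below. This is only a planning sketch, not an arithmetic coder: it assumes that every bit after the first is bypass coded (the text requires only at least one), the function name is hypothetical, and the codewords used with it are illustrative inputs rather than the binarization defined by the disclosure.

```python
from typing import List, Tuple


def plan_bin_coding(codeword: str) -> List[Tuple[int, str]]:
    """Pair each bit of a binarized codeword with the engine that codes it."""
    if not codeword or set(codeword) - {"0", "1"}:
        raise ValueError("codeword must be a non-empty string of 0s and 1s")
    plan = []
    for position, bit in enumerate(codeword):
        # The first bit uses the single context model; the rest need no context.
        engine = "context" if position == 0 else "bypass"
        plan.append((int(bit), engine))
    return plan


def context_models_required(codeword: str) -> int:
    """Number of context models that must be stored under this plan."""
    return sum(1 for _, engine in plan_bin_coding(codeword) if engine == "context")
```

Under this plan, only one context model is stored regardless of the codeword length, which is the memory saving described above.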
In an example embodiment of the present disclosure, the coding device performs context-based adaptive binary arithmetic coding based on one context model on the first bit in the binarized codeword corresponding to the transform pair index.
In practice, the coding device codes, based on the one context model, the first bit in the binarized codeword corresponding to the transform pair index. In this way, as only one context model is enabled, only one context model is stored and the occupied memory space is relatively small, thereby saving the memory space of the coding device.
In addition, a target flag is added to the coded data such that the decoding device uses an explicit multi-kernel transform mode, wherein the target flag indicates that the explicit multi-kernel transform mode is enabled.
Based on the coding process shown in
In step 1301, a decoding device acquires coded data of a current block.
In practice, when there is coded data to be decoded, the decoding device acquires the coded data, then performs entropy decoding on the coded data, and performs inverse quantization on an entropy decoding result to acquire the coded data of the current block.
In step 1302, the decoding device acquires a transform pair index from the coded data, and acquires an intra prediction mode of the current block or shape information of the current block.
In practice, the decoding device acquires the transform pair index corresponding to the current block from the coded data, determines the number of pixels included in the current block in a height direction (i.e., the height), and determines the number of pixels included in the current block in a width direction (i.e., the width). The decoding device then compares the height and width of the current block, that is, acquires the shape information of the current block.
Alternatively, the decoding device acquires the transform pair index corresponding to the current block from the coded data, and acquires a mode number of the intra prediction mode by parsing an identification bit of the intra prediction mode in the coded data.
In an example embodiment of the present disclosure, a binarized codeword corresponding to the transform pair index includes a plurality of bits, wherein a first bit is decoded by CABAC and at least one of bits, except the first bit, among the plurality of bits is decoded by bypass binary arithmetic coding. Since the coding device codes some bits by the bypass binary arithmetic coding when the binarized codeword corresponding to the transform pair index includes the plurality of bits, there is no need to store a context model for these bits, and the memory space of the coding device is saved. Correspondingly, the decoding device decodes the bits coded by the bypass binary arithmetic coding by bypass binary arithmetic coding and also does not need to store a context model for these bits. Thus, the memory space of the decoding device can also be saved.
In an example embodiment of the present disclosure, the first bit of the transform pair index is decoded based on one context model. In this way, as the decoding device only adopts the one context model, only one context model is stored and the occupied memory space is relatively small.
In an example embodiment of the present disclosure, before step 1302 is performed, it is determined that the height and width of the current block both are less than or equal to a target value and the current block is a luma block.
In practice, the target value is preset, stored in the decoding device, and generally denoted N (N is 32). Step 1302 is performed only when the height and width of the current block are both less than or equal to N and the current block is a luma block. Otherwise, it is determined that the transform pair corresponding to the current block is (DCT2, DCT2), and this transform pair is subsequently used directly for decoding.
In addition, before step 1302 is performed, whether a target flag is carried in the coded data may also be determined, and the target flag indicates that explicit multi-kernel transform processing is performed. If the target flag is carried in the coded data, then the explicit multi-kernel transform processing is enabled, and then step 1302 is performed.
In step 1303, the decoding device determines, based on the intra prediction mode of the current block and the transform pair index, a transform pair corresponding to the current block; alternatively, the decoding device determines, based on the shape information of the current block and the transform pair index, the transform pair corresponding to the current block, wherein the transform pair includes a horizontal transform kernel and a vertical transform kernel.
In practice, the transform pair includes the horizontal transform kernel and the vertical transform kernel. Where the transform pair index acquired in step 1302 does not correspond to any one of (DCT2, DCT2), (DST7, DST7), and (DCT8, DCT8), a preset syntax table of an explicit multi-kernel transform pair (as shown in Table 7) is acquired, and the transform pair corresponding to both the intra prediction mode of the current block and the transform pair index, i.e., the transform pair corresponding to the current block, is determined from Table 7. For example, where a mode number of the intra prediction mode corresponding to the current block is 32 and the transform pair index is 3, the transform pair is (DST7, DCT8).
Alternatively, where the transform pair index acquired in step 1302 does not correspond to any one of (DCT2, DCT2), (DST7, DST7), and (DCT8, DCT8), the preset syntax table of the explicit multi-kernel transform pair (as shown in Table 8) is acquired, and the transform pair corresponding to both the height and width of the current block and the transform pair index, i.e., the transform pair corresponding to the current block, is determined from Table 8. For example, where the width of the current block is greater than its height and the transform pair index is 3, the transform pair is (DST7, DCT8).
In an example embodiment of the present disclosure, the transform pair is determined based on the intra prediction mode and the transform pair index by the following fashion.
It is determined that the transform pair corresponding to the current block is a first transform pair if the transform pair index is a first index and the mode number of the intra prediction mode of the current block is less than or equal to a preset value. It is determined that the transform pair corresponding to the current block is a second transform pair if the transform pair index is the first index and the mode number of the intra prediction mode of the current block is greater than the preset value. It is determined that the transform pair corresponding to the current block is the second transform pair if the transform pair index is a second index and the mode number of the intra prediction mode of the current block is less than or equal to the preset value. It is determined that the transform pair corresponding to the current block is the first transform pair if the transform pair index is the second index and the mode number of the intra prediction mode of the current block is greater than the preset value.
In practice, where the transform pair index acquired in step 1302 does not correspond to any one of (DCT2, DCT2), (DST7, DST7), and (DCT8, DCT8), the preset syntax table of the explicit multi-kernel transform pair (as shown in Table 7) is acquired, and the transform pair corresponding to the current block is determined from Table 7.
The first index is 3, the second index is 4, the preset value is 34, the first transform pair is (DST7, DCT8), and the second transform pair is (DCT8, DST7).
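On the decoding side, the mode-based determination is the inverse of the encoder mapping and can be sketched as follows, using the values stated above. The function name is hypothetical, and indices that map to the fixed pairs are left outside the sketch because their values are defined by Table 7 rather than by this passage.

```python
from typing import Tuple

FIRST_PAIR = ("DST7", "DCT8")
SECOND_PAIR = ("DCT8", "DST7")
FIRST_INDEX, SECOND_INDEX = 3, 4
PRESET_MODE = 34


def pair_from_mode(index: int, mode_number: int) -> Tuple[str, str]:
    """Resolve a mode-dependent transform pair index to its transform pair."""
    if index not in (FIRST_INDEX, SECOND_INDEX):
        raise ValueError("index maps to a fixed pair that this sketch does not cover")
    low_mode = mode_number <= PRESET_MODE
    # First index with a low mode, or second index with a high mode -> first pair.
    if (index == FIRST_INDEX) == low_mode:
        return FIRST_PAIR
    return SECOND_PAIR
```

With mode number 32 and index 3 the result is (DST7, DCT8), as in the example for step 1303; the same index with a mode number above 34 resolves to (DCT8, DST7).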
In an example embodiment of the present disclosure, the transform pair is determined based on the height and width of the current block and the transform pair index by the following fashion.
It is determined that the transform pair corresponding to the current block is a first transform pair if the transform pair index is a first index and shape information of the current block satisfies a preset shape constraint condition. It is determined that the transform pair corresponding to the current block is a second transform pair if the transform pair index is the first index and the shape information of the current block does not satisfy the preset shape constraint condition. It is determined that the transform pair corresponding to the current block is the second transform pair if the transform pair index is a second index and the shape information of the current block satisfies the preset shape constraint condition. It is determined that the transform pair corresponding to the current block is the first transform pair if the transform pair index is the second index and the shape information of the current block does not satisfy the preset shape constraint condition.
In practice, where the transform pair index acquired in step 1302 does not correspond to any one of (DCT2, DCT2), (DST7, DST7), and (DCT8, DCT8), the preset syntax table of the explicit multi-kernel transform pair (as shown in Table 8) is acquired, and the transform pair corresponding to the current block is determined from Table 8.
The first index is 3, the second index is 4, the preset shape constraint condition is W≥H (W and H being the width and height of the current block), the first transform pair is (DST7, DCT8), and the second transform pair is (DCT8, DST7).
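The shape-based determination on the decoding side can be sketched in the same manner, with W≥H as the preset shape constraint condition. The function name is hypothetical, and only the two shape-dependent indices are handled.

```python
from typing import Tuple

FIRST_PAIR = ("DST7", "DCT8")
SECOND_PAIR = ("DCT8", "DST7")
FIRST_INDEX, SECOND_INDEX = 3, 4


def pair_from_shape(index: int, width: int, height: int) -> Tuple[str, str]:
    """Resolve a shape-dependent transform pair index to its transform pair."""
    if index not in (FIRST_INDEX, SECOND_INDEX):
        raise ValueError("index maps to a fixed pair that this sketch does not cover")
    satisfies_constraint = width >= height  # preset shape constraint W >= H
    if (index == FIRST_INDEX) == satisfies_constraint:
        return FIRST_PAIR
    return SECOND_PAIR
```

Because the decoder derives the width and height from the block itself, no extra signaling is needed to tell the two interpretations of index 3 or index 4 apart.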
In step 1304, the decoding device acquires reconstruction information corresponding to the current block by decoding the current block based on the transform pair.
In practice, in response to determining a target transform pair corresponding to the current block, the decoding device acquires a residual signal corresponding to the current block by performing inverse transform processing on a quantization coefficient corresponding to the current block with the target transform pair. Afterwards, the decoding device constructs a prediction signal in the used intra prediction mode with pixel values of pixels in a region which has been reconstructed around the current block, and then acquires the reconstruction information corresponding to the current block by adding the residual signal and the prediction signal.
Thus, the decoding of the current block is completed. Each current block is processed according to the flowchart shown in
It should be noted that in the embodiment of the present disclosure, the priority of the transform pairs is adaptively adjusted based on the intra prediction mode or the shape information of the current block, such that, as far as possible, a transform pair with a higher probability is assigned a shorter binarized codeword.
In the embodiment of the present disclosure, when performing coding, the coding device selects the transform pair based on the intra prediction mode of the current block or the shape information of the current block rather than directly using a preset transform pair, and the decoding device correspondingly determines the transform pair based on the intra prediction mode of the current block or the shape information of the current block rather than using the preset transform pair. Thus, the coding and decoding performance can be improved.
Based on the same technical concept, an embodiment of the present disclosure further provides a decoding device. As shown in
In an example embodiment of the present disclosure, the binarized codeword corresponding to the transform pair index includes a plurality of bits, wherein at least one of bits, except the first bit, among the plurality of bits is decoded by bypass binary arithmetic coding.
In an example embodiment of the present disclosure, the determining module 1420 is configured to: determine, based on an intra prediction mode of the current block and the transform pair index, the transform pair corresponding to the current block; or determine, based on shape information of the current block and the transform pair index, the transform pair corresponding to the current block.
Based on the same technical concept, an embodiment of the present disclosure further provides a coding device. As shown in
In an example embodiment of the present disclosure, the coding module 1520 is configured to: code the first bit by the context-based adaptive binary arithmetic coding based on one context model and code at least one of bits, except the first bit, among a plurality of bits by bypass binary arithmetic coding if the binarized codeword corresponding to the transform pair index includes the plurality of bits; and add the coded transform pair index to the coded data.
In an example embodiment of the present disclosure, the determining module 1510 is configured to: determine the transform pair corresponding to the current block, and determine, based on an intra prediction mode of the current block and the transform pair, the transform pair index corresponding to the current block; or determine the transform pair corresponding to the current block, and determine, based on shape information of the current block and the transform pair, the transform pair index corresponding to the current block.
In the embodiment of the present disclosure, when coding the current block, the coding device acquires the transform pair corresponding to the current block for coding, rather than directly acquiring a preset transform pair, such that the coding and decoding performance can be improved. When coding the binarized codeword corresponding to the transform pair index, the coding device codes the first bit in the binarized codeword corresponding to the transform pair index by one context model rather than a plurality of context models, such that the memory space can be saved. In addition, because the plurality of context models are not needed, the context does not need to be updated, which lowers the coding and decoding complexity.
Based on the same technical concept, an embodiment of the present disclosure further provides a decoding device. As shown in
In an example embodiment of the present disclosure, a first bit of the binarized codeword corresponding to the transform pair index is decoded by context-based adaptive binary arithmetic coding based on one context model.
In an example embodiment of the present disclosure, the determining module 1620 is configured to: determine, based on an intra prediction mode of the current block and the transform pair index, the transform pair corresponding to the current block; or determine, based on shape information of the current block and the transform pair index, the transform pair corresponding to the current block, wherein the transform pair includes a horizontal transform kernel and a vertical transform kernel.
Based on the same technical concept, an embodiment of the present disclosure further provides a coding device. As shown in
In an example embodiment of the present disclosure, the coding module 1720 is further configured to: code the first bit by context-based adaptive binary arithmetic coding based on one context model.
In an example embodiment of the present disclosure, the determining module 1710 is configured to: determine the transform pair corresponding to the current block, and determine, based on an intra prediction mode of the current block and the transform pair, the transform pair index corresponding to the current block; or determine the transform pair corresponding to the current block, and determine, based on shape information of the current block and the transform pair, the transform pair index corresponding to the current block.
In the embodiment of the present disclosure, when coding the current block, the coding device acquires the transform pair corresponding to the current block for coding, rather than directly acquiring a preset transform pair, such that the coding and decoding performance can be improved. When coding the transform pair index, the coding device codes the at least one of bits, except the first bit, in the binarized codeword corresponding to the transform pair index by the bypass binary arithmetic coding, and for the at least one bit, there is no need to store the context model. Thus, the memory space can be saved. In addition, for the at least one bit, the bypass binary arithmetic coding rather than CABAC is adopted, such that the coding and decoding complexity can also be lowered.
Based on the same technical concept, an embodiment of the present disclosure further provides a decoding device. As shown in
In an example embodiment of the present disclosure, the determining module 1820 is configured to: determine that the transform pair corresponding to the current block is a first transform pair if the transform pair index is a first index and a mode number of the intra prediction mode of the current block is less than or equal to a preset value; determine that the transform pair corresponding to the current block is a second transform pair if the transform pair index is a first index and the mode number of the intra prediction mode of the current block is greater than a preset value; determine that the transform pair corresponding to the current block is a second transform pair if the transform pair index is a second index and the mode number of the intra prediction mode of the current block is less than or equal to a preset value; or determine that the transform pair corresponding to the current block is a first transform pair if the transform pair index is a second index and the mode number of the intra prediction mode of the current block is greater than a preset value.
In an example embodiment of the present disclosure, the determining module 1820 is configured to: determine that the transform pair corresponding to the current block is a first transform pair if the transform pair index is a first index and the shape information of the current block satisfies a preset shape constraint condition; determine that the transform pair corresponding to the current block is a second transform pair if the transform pair index is a first index and the shape information of the current block does not satisfy a preset shape constraint condition; determine that the transform pair corresponding to the current block is a second transform pair if the transform pair index is a second index and the shape information of the current block satisfies a preset shape constraint condition; or determine that the transform pair corresponding to the current block is a first transform pair if the transform pair index is a second index and the shape information of the current block does not satisfy a preset shape constraint condition.
In an example embodiment of the present disclosure, a first bit of a binarized codeword corresponding to the transform pair index is decoded by context-based adaptive binary arithmetic coding based on one context model.
In an example embodiment of the present disclosure, the binarized codeword corresponding to the transform pair index includes a plurality of bits, wherein at least one bit, other than the first bit, among the plurality of bits is decoded by bypass binary arithmetic coding.
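The parsing order described above may be sketched as follows, purely as an illustration. The sketch assumes a truncated unary binarization with a maximum index value of 4 and decodes every bit after the first in bypass mode; neither choice is mandated by the text above, which only requires that at least one such bit be bypass decoded. `RecordingBinSource` is a hypothetical stand-in for an arithmetic decoding engine: it returns already-decoded bins and records which decoding mode was requested for each.

```python
from typing import List, Optional, Tuple


class RecordingBinSource:
    """Hypothetical stand-in for an arithmetic decoder (not a real CABAC engine)."""

    def __init__(self, bins: List[int]) -> None:
        self._bins = list(bins)
        self.log: List[Tuple[str, Optional[int]]] = []

    def read_context_bin(self, ctx_id: int) -> int:
        # A real decoder would use the adaptive probability of context ctx_id.
        self.log.append(("context", ctx_id))
        return self._bins.pop(0)

    def read_bypass_bin(self) -> int:
        # A real decoder would decode with a fixed probability of 1/2.
        self.log.append(("bypass", None))
        return self._bins.pop(0)


def parse_transform_pair_index(src: RecordingBinSource, max_index: int = 4) -> int:
    """Parse a truncated unary codeword (assumed binarization).

    The first bin uses the single context model (context id 0);
    every remaining bin is read in bypass mode.
    """
    if src.read_context_bin(0) == 0:
        return 0
    index = 1
    while index < max_index and src.read_bypass_bin() == 1:
        index += 1
    return index
```

For example, feeding the bins `[1, 1, 0]` yields index 2, with one context-coded read followed by two bypass reads. Using a single context model for the first bit keeps the context memory small, while bypass decoding of the remaining bits avoids probability updates.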
Based on the same technical concept, an embodiment of the present disclosure further provides a coding device. As shown in
In an example embodiment of the present disclosure, the determining module 1920 is configured to: determine that the transform pair index corresponding to the current block is a first index if the transform pair is a first transform pair and a mode number of the intra prediction mode of the current block is less than or equal to a preset value; determine that the transform pair index corresponding to the current block is a first index if the transform pair is a second transform pair and a mode number of the intra prediction mode of the current block is greater than a preset value; determine that the transform pair index corresponding to the current block is a second index if the transform pair is a second transform pair and a mode number of the intra prediction mode of the current block is less than or equal to a preset value; or determine that the transform pair index corresponding to the current block is a second index if the transform pair is a first transform pair and a mode number of the intra prediction mode of the current block is greater than a preset value.
In an example embodiment of the present disclosure, the determining module 1920 is configured to: determine that the transform pair index corresponding to the current block is a first index if the transform pair is a first transform pair and the shape information of the current block satisfies a preset shape constraint condition; determine that the transform pair index corresponding to the current block is a first index if the transform pair is a second transform pair and the shape information of the current block does not satisfy a preset shape constraint condition; determine that the transform pair index corresponding to the current block is a second index if the transform pair is a second transform pair and the shape information of the current block satisfies a preset shape constraint condition; or determine that the transform pair index corresponding to the current block is a second index if the transform pair is a first transform pair and the shape information of the current block does not satisfy a preset shape constraint condition.
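The coder-side determinations above are the inverse of the decoder-side mapping, which may be illustrated by the following sketch. The string labels for the transform pairs, the numeric index values, and the function names are assumptions made for this sketch; the Boolean `condition_met` stands for either "the mode number is less than or equal to the preset value" or "the shape information satisfies the preset shape constraint condition".

```python
# Hypothetical labels and index values used only for this sketch.
FIRST_PAIR = "first transform pair"
SECOND_PAIR = "second transform pair"
FIRST_INDEX = 0
SECOND_INDEX = 1


def index_from_pair(pair: str, condition_met: bool) -> int:
    """Coder side: choose the index to signal for the selected transform pair.

    first pair  + condition met     -> first index
    second pair + condition not met -> first index
    second pair + condition met     -> second index
    first pair  + condition not met -> second index
    """
    if pair not in (FIRST_PAIR, SECOND_PAIR):
        raise ValueError(f"unsupported transform pair: {pair}")
    is_first_pair = pair == FIRST_PAIR
    return FIRST_INDEX if is_first_pair == condition_met else SECOND_INDEX


def pair_from_index(index: int, condition_met: bool) -> str:
    """Decoder side: recover the transform pair from the signaled index."""
    if index not in (FIRST_INDEX, SECOND_INDEX):
        raise ValueError(f"unsupported transform pair index: {index}")
    is_first_index = index == FIRST_INDEX
    return FIRST_PAIR if is_first_index == condition_met else SECOND_PAIR


def round_trip_ok() -> bool:
    # Every (pair, condition) combination must survive coding then decoding.
    return all(
        pair_from_index(index_from_pair(pair, cond), cond) == pair
        for pair in (FIRST_PAIR, SECOND_PAIR)
        for cond in (True, False)
    )
```

Because the coder and the decoder evaluate the same condition on the same block, the decoder recovers the transform pair selected by the coder in all four cases.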
In an example embodiment of the present disclosure, the coding module 1930 is configured to: code the transform pair index by subjecting a first bit of a binarized codeword corresponding to the transform pair index to context-based adaptive binary arithmetic coding based on one context model.
In an example embodiment of the present disclosure, the coding module 1930 is configured to: code at least one bit, other than the first bit, among a plurality of bits by bypass binary arithmetic coding if a binarized codeword corresponding to the transform pair index includes the plurality of bits.
In the embodiments of the present disclosure, when performing coding, the coding device selects the transform kernel based on the intra prediction mode of the current block or the shape information of the current block rather than using a preset transform pair. Correspondingly, when decoding is performed, the intra prediction mode of the current block or the shape information of the current block, rather than the preset transform pair, is also used. Thus, the coding and decoding performance can be improved.
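As a non-limiting sketch of the coder-side bit handling described above, the following assumes a truncated unary binarization with a maximum index value of 4 and tags each bit with the coding mode it would be passed to. The function name, the mode labels, and the choice to bypass code every bit after the first are assumptions for illustration; no actual arithmetic coding engine is implemented here.

```python
from typing import List, Tuple


def binarize_transform_pair_index(index: int,
                                  max_index: int = 4) -> List[Tuple[int, str]]:
    """Return (bit, mode) pairs for a truncated unary codeword (assumed).

    The first bit is tagged "context" (one context model); all later
    bits are tagged "bypass".
    """
    if not 0 <= index <= max_index:
        raise ValueError(f"index out of range: {index}")
    bits = [1] * index
    if index < max_index:
        bits.append(0)  # terminating zero is omitted for the largest index
    return [
        (bit, "context" if position == 0 else "bypass")
        for position, bit in enumerate(bits)
    ]
```

Under these assumptions, index 0 produces the single context-coded bit 0, and index 2 produces the bits 1, 1, 0, of which only the first is context coded.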
It should be noted that the decoding device according to the above embodiment only takes division of all the functional modules as an example for explanation when performing decoding. In practice, the functions can be implemented by different functional modules as required. That is, the decoding device includes different functional modules to implement all or part of the functions described above. In addition, the decoding device according to the above embodiment and the decoding method are based on the same concept, and a specific implementation process of the decoding device is detailed in the method embodiment and is not repeatedly described here.
It should be noted that the coding device according to the above embodiment only takes division of all the functional modules as an example for explanation when performing coding. In practice, the functions can be implemented by different functional modules as required. That is, the coding device includes different functional modules to implement all or part of the functions described above. In addition, the coding device according to the above embodiment and the coding method are based on the same concept, and a specific implementation process of the coding device is detailed in the method embodiment and is not repeatedly described here.
An embodiment of the present disclosure further provides a computer-readable storage medium. The storage medium stores a computer program therein. The computer program, when run by a processor, causes the processor to perform the steps of the coding method and the decoding method described above.
An embodiment of the present disclosure further provides a coding device. The coding device includes a processor and a memory. The memory is configured to store a computer program. The processor, when running the program stored in the memory, is caused to perform the steps of the coding method described above.
An embodiment of the present disclosure further provides a decoding device. The decoding device includes a processor and a memory. The memory is configured to store a computer program. The processor, when running the program stored in the memory, is caused to perform the steps of the decoding method described above.
An embodiment of the present disclosure further provides a coding and decoding system. The system includes a coding device and a decoding device.
The coding device is the coding device as described above.
The decoding device is the decoding device as described above.
A person of ordinary skill in the art may understand that all or part of the steps in the above embodiments may be performed by hardware, or relevant hardware instructed by a program, and the program may be stored in a computer-readable storage medium, such as a read-only memory, a disk, or an optical disc.
Described above are only preferred embodiments of the present disclosure, but are not intended to limit the present disclosure. Any modifications, equivalent replacements, improvements, and the like made within the spirit and principles of the present disclosure should be included within the scope of protection of the present disclosure.
Number | Date | Country | Kind |
---|---|---|---|
201910177580.4 | Mar 2019 | CN | national |
This application is a continuation of U.S. patent application Ser. No. 17/432,887, filed on Aug. 20, 2021, which is a U.S. National Phase application of International Application No. PCT/CN2020/078486, filed on Mar. 9, 2020, which claims priority to Chinese Patent Application No. 201910177580.4, filed on Mar. 9, 2019 and entitled “METHODS FOR PERFORMING ENCODING AND DECODING, DECODING END AND ENCODING END,” the contents of each of which are incorporated herein by reference in their entirety.
Relation | Number | Date | Country |
---|---|---|---|
Parent | 17432887 | Aug 2021 | US |
Child | 18767880 | | US |