The exemplary embodiments relate to encoding and decoding a video.
As hardware for reproducing and storing high resolution or high quality video content is being developed and supplied, a need for a video codec for effectively encoding or decoding the high resolution or high quality video content is increasing. In a related art video codec, a video is encoded according to a limited encoding method based on a macroblock having a predetermined size.
A prediction encoding method based on a macroblock may generate a blocking effect due to discontinuous pixel values at boundaries of blocks. Accordingly, in a video codec, deblocking filtering is performed to improve video compressibility and quality of a restored image.
Aspects of one or more exemplary embodiments provide a method of performing deblocking filtering in a video codec by using a coding unit overcoming limits of a related art macroblock-based encoding method.
Aspects of one or more exemplary embodiments also provide a method and apparatus for performing deblocking filtering, which reduce a blocking effect generated in a boundary region of coding units in a video encoded based on a tree-structured coding unit.
According to an aspect of an exemplary embodiment, there is provided a method of encoding a video, which performs deblocking filtering based on coding units, the method including: splitting one picture into at least one maximum coding unit that is a data unit having a maximum size; determining a plurality of coding units that are hierarchically configured according to depths indicating a number of times the at least one maximum coding unit is spatially split, and a plurality of prediction units and a plurality of transformation units respectively for prediction and transformation of the coding units; determining a filtering boundary on which deblocking filtering is to be performed based on at least one data unit from among the plurality of coding units, the plurality of prediction units, and the plurality of transformation units; determining filtering strength at the filtering boundary based on a prediction mode of a coding unit to which pixels adjacent to the filtering boundary belong from among the plurality of coding units, and transformation coefficient values of the pixels adjacent to the filtering boundary; and performing deblocking filtering on the filtering boundary based on the determined filtering strength.
According to an aspect of another exemplary embodiment, there is provided a method of decoding a video, which performs deblocking filtering based on coding units, the method including: extracting image data encoded according to a plurality of coding units, encoding mode information about the plurality of coding units encoded according to a tree structure, and information about deblocking filtering in a maximum coding unit, according to the coding units encoded according to the tree structure included in each maximum coding unit obtained by splitting a picture, by parsing a received bitstream; determining a plurality of prediction units and a plurality of transformation units for prediction and transformation according to the plurality of coding units and decoding the encoded image data, based on the encoding mode information about the plurality of coding units encoded according to the tree structure; determining a filtering boundary on which deblocking filtering is to be performed from among boundaries of at least one data unit from among the plurality of coding units encoded according to the tree structure, the plurality of prediction units, and the plurality of transformation units, by using the information about the deblocking filtering; determining filtering strength of the filtering boundary based on a prediction mode of a coding unit to which pixels adjacent to the determined filtering boundary belong from among the plurality of coding units and transformation coefficient values of the pixels adjacent to the filtering boundary; and performing deblocking filtering on the decoded image data based on the determined filtering strength.
According to an aspect of another exemplary embodiment, there is provided an apparatus for encoding a video, which performs deblocking filtering based on coding units, the apparatus including: a coding unit determiner for determining a plurality of coding units that are hierarchically configured according to depths indicating a number of times at least one maximum coding unit is spatially split, wherein the maximum coding unit is a data unit having a maximum size that is split to encode one picture, and a plurality of prediction units and a plurality of transformation units respectively for prediction and transformation of the plurality of coding units; a deblocking filtering unit for determining a filtering boundary on which deblocking filtering is to be performed based on at least one data unit from among the plurality of coding units, the plurality of prediction units, and the plurality of transformation units, determining filtering strength at the filtering boundary based on a prediction mode of a coding unit to which pixels adjacent to the filtering boundary belong from among the plurality of coding units, and transformation coefficient values of the pixels adjacent to the filtering boundary, and performing deblocking filtering on the filtering boundary based on the determined filtering strength; and a transmitter for encoding information about the deblocking filtering and transmitting the information with encoding data of the one picture and encoding mode information about the coding units according to the tree structure.
According to an aspect of another exemplary embodiment, there is provided an apparatus for decoding a video, which performs deblocking filtering based on coding units, the apparatus including: a receiving and extracting unit for extracting image data encoded according to a plurality of coding units, encoding mode information about the plurality of coding units encoded according to a tree structure, and information about deblocking filtering in a maximum coding unit, according to the coding units encoded according to the tree structure included in each maximum coding unit obtained by splitting a current picture, by parsing a received bitstream; a decoder for determining a plurality of prediction units and a plurality of transformation units for prediction and transformation according to the plurality of coding units and decoding the encoded image data, based on the encoding mode information about the plurality of coding units encoded according to the tree structure; and a deblocking filtering unit for determining a filtering boundary on which deblocking filtering is to be performed from among boundaries of at least one data unit from among the plurality of coding units encoded according to the tree structure, the plurality of prediction units, and the plurality of transformation units, by using the information about the deblocking filtering, determining filtering strength of the filtering boundary based on a prediction mode of a coding unit to which pixels adjacent to the determined filtering boundary belong from among the plurality of coding units and transformation coefficient values of the pixels adjacent to the filtering boundary, and performing deblocking filtering on the decoded image data based on the determined filtering strength.
According to aspects of one or more exemplary embodiments, the quality of a compressed and restored video may be remarkably improved by removing a blocking effect from the compressed and restored video based on a tree-structured coding unit.
The above and other features and advantages will become more apparent by describing in detail exemplary embodiments with reference to the attached drawings in which:
Hereinafter, exemplary embodiments will be described more fully with reference to the accompanying drawings, in which like reference numerals correspond to like elements throughout.
The video encoding apparatus 100 includes a coding unit determiner 110, a deblocking filtering unit 130, and a transmitter 120.
The coding unit determiner 110 receives image data of one picture of a video and splits the picture into at least one maximum coding unit that is a data unit having a maximum size. The maximum coding unit according to an exemplary embodiment may be a data unit having a size of 32×32, 64×64, 128×128, 256×256, etc., wherein a shape of the data unit is a square whose width and height are each a power of 2 that is greater than 8.
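As a non-limiting illustration of splitting a picture into maximum coding units, the following sketch counts how many maximum coding units are needed to cover a picture. The function name max_coding_unit_grid is hypothetical, and rounding up so that partially covered regions at the right and bottom picture edges still receive a maximum coding unit is an assumption made only for this example.

```python
def max_coding_unit_grid(width, height, lcu_size):
    """Return (columns, rows) of maximum coding units covering a picture.

    Assumption for this sketch: a maximum coding unit that only partly
    overlaps the picture at the right or bottom edge is still counted.
    """
    # The size is expected to be a power of 2 that is greater than 8.
    if lcu_size <= 8 or lcu_size & (lcu_size - 1):
        raise ValueError("size must be a power of 2 greater than 8")
    cols = -(-width // lcu_size)   # ceiling division
    rows = -(-height // lcu_size)
    return cols, rows


if __name__ == "__main__":
    print(max_coding_unit_grid(1920, 1080, 64))
```

For a 1920×1080 picture and a maximum coding unit of 64×64, the width is covered exactly by 30 columns, while the height needs 17 rows, the last of which extends past the picture under the stated assumption.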
The coding unit determiner 110 determines coding units having a hierarchical structure according to regions spatially split per maximum coding unit. The coding units may be expressed based on a depth indicating a number of times the maximum coding unit is spatially split. In detail, coding units according to a tree structure include coding units corresponding to a depth determined to be a coded depth, from among all deeper coding units according to depths included in the maximum coding unit. A coding unit of a coded depth may be hierarchically determined according to depths in the same region of the maximum coding unit, and may be independently determined in different regions.
The coding unit determiner 110 may encode each deeper coding unit included in a current maximum coding unit, and determine a coding unit for outputting an optimum encoding result and a coded depth that is a corresponding depth by comparing encoding results of coding units of an upper depth and a lower depth according to regions. Also, a coded depth of a current region may be independently determined from a coded depth of another region.
Accordingly, the coding unit determiner 110 may determine coding units according to a tree structure in a coded depth independently determined according to regions per maximum coding unit. Also, the coding unit determiner 110 performs prediction encoding while determining the coding units of the coded depth. The coding unit determiner 110 may determine a prediction unit or a partition that is a data unit for performing prediction encoding to output an optimum encoding result in the coding unit of the coded depth. For example, partition types with respect to a coding unit having a size of 2N×2N (where N is a positive integer) may include partitions having sizes of 2N×2N, 2N×N, N×2N, or N×N. Examples of the partition type include symmetrical partitions that are obtained by symmetrically splitting a height or width of the coding unit, partitions obtained by asymmetrically splitting the height or width of the coding unit, such as 1:n or n:1, partitions that are obtained by geometrically splitting the prediction unit, and partitions having arbitrary shapes. Also, a prediction mode of the partition type may be an inter mode, an intra mode, a skip mode, or the like.
A coding unit according to an exemplary embodiment may be characterized by a maximum size and a depth. The depth denotes a number of times the coding unit is hierarchically split from the maximum coding unit, and as the depth increases, deeper coding units according to depths may be split from the maximum coding unit to a minimum coding unit. A depth of the maximum coding unit is an uppermost depth and a depth of the minimum coding unit is a lowermost depth. Since a size of a coding unit corresponding to each depth decreases as the depth of the maximum coding unit increases, a coding unit corresponding to an upper depth may include a plurality of coding units corresponding to lower depths.
A maximum depth according to an exemplary embodiment is an index related to the number of splitting times from a maximum coding unit to a minimum coding unit or the number of times a maximum coding unit is split to arrive at the minimum coding unit. A first maximum depth according to an exemplary embodiment may denote the total number of splitting times from the maximum coding unit to the minimum coding unit. A second maximum depth according to an exemplary embodiment may denote the total number of depth levels from the maximum coding unit to the minimum coding unit. For example, when a depth of the maximum coding unit is 0, a depth of a coding unit, in which the maximum coding unit is split once, may be set to 1, and a depth of a coding unit, in which the maximum coding unit is split twice, may be set to 2. Here, if the minimum coding unit is a coding unit in which the maximum coding unit is split four times, 5 depth levels of depths 0, 1, 2, 3 and 4 exist, and thus the first maximum depth may be set to 4, and the second maximum depth may be set to 5.
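The relationship between the first maximum depth and the second maximum depth may be illustrated by the following minimal sketch. The function name maximum_depths is hypothetical, and the sketch assumes that each split halves both the height and the width of a coding unit.

```python
def maximum_depths(max_size, min_size):
    """Return (first_maximum_depth, second_maximum_depth).

    first  = total number of splits from the maximum coding unit
             to the minimum coding unit
    second = total number of depth levels (first + 1)
    Assumption for this sketch: every split halves the side length.
    """
    splits = 0
    size = max_size
    while size > min_size:
        size //= 2
        splits += 1
    if size != min_size:
        raise ValueError("min_size is not reachable by halving max_size")
    return splits, splits + 1


if __name__ == "__main__":
    # A maximum coding unit that is split four times: depths 0 to 4.
    print(maximum_depths(64, 4))
```

With a maximum coding unit of 64×64 and a minimum coding unit of 4×4, the unit is split four times, so the sketch yields a first maximum depth of 4 and a second maximum depth of 5, matching the example above.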
Coding units according to a tree structure in a maximum coding unit and a method of determining a partition, according to aspects of exemplary embodiments, will be described in detail later with reference to
The deblocking filtering unit 130 determines a filtering boundary to which deblocking filtering is to be performed based on at least one data unit from among coding units, prediction units, and transformation units, and determines filtering strength in the filtering boundary based on a prediction mode of a coding unit to which adjacent pixels belong based on the determined filtering boundary and transformation coefficient values of pixels adjacent to the filtering boundary, and performs deblocking filtering based on the filtering strength. For example, when the coding units, prediction units, and transformation units are determined as will be described below, the deblocking filtering unit 130 may determine a boundary of data units having a predetermined size or above as the filtering boundary to which the deblocking filtering is to be performed based on sizes of the coding units, prediction units, and transformation units, and perform the deblocking filtering on the pixels adjacent to the filtering boundary.
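As a non-limiting sketch of selecting filtering boundaries by data unit size, the following example collects the left and top edges of data units whose width and height are both at least a given size. The tuple layout (x, y, width, height), the name filtering_boundaries, the use of both dimensions in the size test, and the exclusion of edges that coincide with the picture border are all assumptions made for illustration and are not taken from the embodiments.

```python
def filtering_boundaries(data_units, min_size):
    """Collect candidate filtering boundaries from a list of data units.

    data_units: iterable of (x, y, width, height) tuples (hypothetical layout)
    min_size:   data units smaller than this in either dimension are ignored
    Returns (vertical, horizontal), each a set of (x, y, length) edges.
    """
    vertical, horizontal = set(), set()
    for x, y, w, h in data_units:
        if w < min_size or h < min_size:
            continue  # boundary of a unit below the predetermined size
        if x > 0:     # left edge, unless it lies on the picture border
            vertical.add((x, y, h))
        if y > 0:     # top edge, unless it lies on the picture border
            horizontal.add((x, y, w))
    return vertical, horizontal


if __name__ == "__main__":
    units = [(0, 0, 16, 16), (16, 0, 16, 16), (0, 16, 4, 4)]
    print(filtering_boundaries(units, 8))
```

In this example only the edge between the two 16×16 data units is selected; the 4×4 data unit falls below the size of 8 and contributes no boundary.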
The transmitter 120 may encode information about the deblocking filtering determined by the deblocking filtering unit 130, and transmit the information along with encoded data of the picture and encoding mode information about the coding units according to the tree structure of the maximum coding unit. The information about the deblocking filtering may include filtering boundary determination information, such as a size of a data unit for determining a data unit for performing the deblocking filtering, from among boundaries of data units, such as the coding units, the prediction units, and the transformation units.
The transmitter 120 may insert and transmit the information about the deblocking filtering into a sequence parameter set (SPS) or a picture parameter set (PPS) of the picture.
A process of determining a filtering boundary for deblocking filtering and a deblocking filtering process according to aspects of exemplary embodiments will be described in detail later with reference to
The coding unit determiner 110 may determine a coding unit having an optimum shape and size per maximum coding unit, based on a size and maximum depth of a maximum coding unit determined considering characteristics of a current picture. Also, since encoding may be performed by using any one of various prediction modes and transformation methods per maximum coding unit, an optimum encoding mode may be determined considering image characteristics of coding units of various image sizes.
If an image having a high resolution or a large data amount is encoded in a macroblock unit having a fixed size of 16×16 or 8×8, a number of macroblocks per picture excessively increases. Accordingly, a number of pieces of compressed information generated for each macroblock increases, and thus it is difficult to transmit the compressed information and data compression efficiency decreases. However, by using the coding unit determiner 110, final compression efficiency of a video may be increased since a coding unit is adjusted while considering characteristics of an image and a maximum size of a coding unit is increased while considering a size of the image.
Also, prediction encoding having a reduced error with an original picture may be performed by using a reference picture that is deblocking filtered, via deblocking filtering based on coding units according to a tree structure.
The video decoding apparatus 200 includes a receiving and extracting unit 210, a decoder 220, and a deblocking filtering unit 230.
The receiving and extracting unit 210 extracts image data encoded using coding units according to a tree structure, encoding mode information about the coding units, and information about deblocking filtering, according to maximum coding units, by receiving and parsing a bitstream about a video. The receiving and extracting unit 210 may extract the information about the deblocking filtering from an SPS or PPS of a picture.
The decoder 220 decodes the image data encoded according to the coding units, based on the encoding mode information about the coding units according to the tree structure extracted by the receiving and extracting unit 210.
The decoder 220 may determine a coding unit of a coded depth included in a maximum coding unit, a partition type, a prediction mode, and a transformation unit of the coding unit, based on the encoding mode information about the coding units according to the tree structure according to the maximum coding units.
The decoder 220 may decode encoded image data of a maximum coding unit by decoding the encoded image data based on the determined partition type, prediction mode, and transformation unit per coding unit from among the coding units according to the tree structure included in the maximum coding unit.
The image data decoded by the decoder 220 and the information about the deblocking filtering extracted by the receiving and extracting unit 210 are input to the deblocking filtering unit 230.
The deblocking filtering unit 230 determines a filtering boundary to which deblocking filtering is to be performed from among boundaries of at least one data unit from among coding units according to a tree structure, prediction units, and transformation units by using the information about the deblocking filtering. The deblocking filtering unit 230 determines filtering strength in the filtering boundary according to a prediction mode of a coding unit to which adjacent pixels belong based on the filtering boundary and transformation coefficient values of pixels adjacent to the filtering boundary, and performs deblocking filtering on the decoded image data based on the filtering strength.
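The dependence of the filtering strength on the prediction mode and on the transformation coefficient values may be sketched as follows. The specific rule shown here (strength 2 when either side is intra predicted, strength 1 when either side has a non-zero transformation coefficient, strength 0 otherwise) is purely hypothetical and is loosely modeled on boundary strength rules of conventional deblocking filters; it is not the rule of the exemplary embodiments, and the names used are assumptions.

```python
def filtering_strength(mode_p, mode_q, coeffs_p, coeffs_q):
    """Hypothetical filtering strength for a boundary between sides P and Q.

    mode_p, mode_q:     prediction mode of the coding unit on each side,
                        e.g. "intra", "inter", or "skip"
    coeffs_p, coeffs_q: transformation coefficient values near the boundary
    Returns 0 (no filtering), 1 (weak), or 2 (strong).
    """
    # Assumed rule: intra-coded neighbors get the strongest filtering.
    if mode_p == "intra" or mode_q == "intra":
        return 2
    # Assumed rule: any non-zero coefficient on either side gives weak filtering.
    if any(c != 0 for c in coeffs_p) or any(c != 0 for c in coeffs_q):
        return 1
    return 0


if __name__ == "__main__":
    print(filtering_strength("intra", "inter", [0, 0], [0, 0]))
    print(filtering_strength("inter", "inter", [0, 3], [0, 0]))
    print(filtering_strength("inter", "skip", [0, 0], [0, 0]))
```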
By using the deblocking filtering unit 230, an error between a restored image and an original image may be reduced since prediction decoding is performed on a following picture by referring to a reference picture to which deblocking filtering is performed.
A size of a coding unit may be expressed in width×height, and may be 64×64, 32×32, 16×16, and 8×8. A coding unit of 64×64 may be split into prediction units of 64×64, 64×32, 32×64, or 32×32, and a coding unit of 32×32 may be split into prediction units of 32×32, 32×16, 16×32, or 16×16, a coding unit of 16×16 may be split into prediction units of 16×16, 16×8, 8×16, or 8×8, and a coding unit of 8×8 may be split into prediction units of 8×8, 8×4, 4×8, or 4×4.
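The prediction unit sizes listed above may be derived from the coding unit size as in the following minimal sketch. The function name symmetric_partitions is hypothetical, and only the four symmetric partition sizes listed above are covered.

```python
def symmetric_partitions(size):
    """Return the (width, height) prediction unit sizes of a size×size coding unit.

    Order: 2N×2N, 2N×N, N×2N, N×N, where size = 2N.
    """
    half = size // 2
    return [(size, size), (size, half), (half, size), (half, half)]


if __name__ == "__main__":
    for cu in (64, 32, 16, 8):
        print(cu, symmetric_partitions(cu))
```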
In video data 310, a resolution is 1920×1080, a maximum size of a coding unit is 64, and a maximum depth is 2. In video data 320, a resolution is 1920×1080, a maximum size of a coding unit is 64, and a maximum depth is 3. In video data 330, a resolution is 352×288, a maximum size of a coding unit is 16, and a maximum depth is 1. The maximum depth shown in
If a resolution is high or a data amount is large, a maximum size of a coding unit may be large so as to not only increase encoding efficiency but also to accurately reflect characteristics of an image. Accordingly, the maximum size of the coding unit of the video data 310 and 320 having the higher resolution than the video data 330 may be 64.
Since the maximum depth of the video data 310 is 2, coding units 315 of the video data 310 may include a maximum coding unit having a long axis size of 64, and coding units having long axis sizes of 32 and 16 since depths are deepened to two layers by splitting the maximum coding unit twice. Meanwhile, since the maximum depth of the video data 330 is 1, coding units 335 of the video data 330 may include a maximum coding unit having a long axis size of 16, and coding units having a long axis size of 8 since depths are deepened to one layer by splitting the maximum coding unit once.
Since the maximum depth of the video data 320 is 3, coding units 325 of the video data 320 may include a maximum coding unit having a long axis size of 64, and coding units having long axis sizes of 32, 16, and 8 since the depths are deepened to 3 layers by splitting the maximum coding unit three times. As a depth increases, detailed information may be precisely expressed.
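The long axis sizes available for the video data 310, 320, and 330 may be reproduced by the following sketch, in which the maximum depth is taken as the number of times the maximum coding unit is split. The function name coding_unit_sizes is hypothetical.

```python
def coding_unit_sizes(max_size, max_depth):
    """Return the long axis sizes of coding units for depths 0..max_depth.

    Here max_depth counts the splits from the maximum coding unit,
    and each split halves the long axis size.
    """
    return [max_size >> depth for depth in range(max_depth + 1)]


if __name__ == "__main__":
    print(coding_unit_sizes(64, 2))  # video data 310
    print(coding_unit_sizes(64, 3))  # video data 320
    print(coding_unit_sizes(16, 1))  # video data 330
```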
The image encoder 400 may correspond to the video encoding apparatus 100. In other words, an intra predictor 410 performs intra prediction on coding units in an intra mode, from among a current frame 405, and a motion estimator 420 and a motion compensator 425 perform inter estimation and motion compensation on coding units in an inter mode from among the current frame 405 by using the current frame 405 and a reference frame 495.
Data output from the intra predictor 410, the motion estimator 420, and the motion compensator 425 is output as a quantized transformation coefficient through a transformer 430 and a quantizer 440. The quantized transformation coefficient is restored as data in a spatial domain through an inverse quantizer 460 and an inverse transformer 470, and the restored data in the spatial domain is output as the reference frame 495 after being post-processed through a deblocking unit 480 and a loop filtering unit 490. The quantized transformation coefficient may be output as a bitstream 455 through an entropy encoder 450.
The intra predictor 410, the motion estimator 420, the motion compensator 425, the transformer 430, the quantizer 440, the entropy encoder 450, the inverse quantizer 460, the inverse transformer 470, the deblocking unit 480, and the loop filtering unit 490 of the image encoder 400 may operate considering the coding units according to the tree structure and maximum coding units.
Specifically, the deblocking unit 480 determines a filtering boundary to which deblocking filtering is to be performed based on a maximum size of a coding unit and coding units according to a tree structure, determines filtering strength in the filtering boundary according to a prediction mode of a coding unit to which adjacent pixels belong based on the filtering boundary and transformation coefficient values of pixels adjacent to the filtering boundary, and performs deblocking filtering based on the filtering strength.
A parser 510 parses encoded image data to be decoded and information about encoding required for decoding from a bitstream 505. The encoded image data is output as inverse quantized data through an entropy decoder 520 and an inverse quantizer 530, and the inverse quantized data is restored to image data in a spatial domain through an inverse transformer 540.
An intra predictor 550 performs intra prediction on coding units in an intra mode with respect to the image data in the spatial domain, and a motion compensator 560 performs motion compensation on coding units in an inter mode by using a reference frame 585.
The image data in the spatial domain, which passed through the intra predictor 550 and the motion compensator 560, may be output as a restored frame 595 after being post-processed through a deblocking unit 570 and a loop filtering unit 580. Also, the image data that is post-processed through the deblocking unit 570 and the loop filtering unit 580 may be output as the reference frame 585.
In order to decode the image data in the decoder 220 of the video decoding apparatus 200, the image decoder 500 may perform operations that are performed after the parser 510.
Since the image decoder 500 corresponds to the video decoding apparatus, the parser 510, the entropy decoder 520, the inverse quantizer 530, the inverse transformer 540, the intra predictor 550, the motion compensator 560, the deblocking unit 570, and the loop filtering unit 580 of the image decoder 500 perform operations based on coding units having a tree structure for each maximum coding unit.
Specifically, the deblocking unit 570 determines a filtering boundary to which deblocking filtering is to be performed from among boundaries of at least one data unit from among coding units according to a tree structure, prediction units, and transformation units by using parsed information about deblocking filtering. The deblocking unit 570 determines filtering strength in the filtering boundary according to a prediction mode of a coding unit to which adjacent pixels belong based on the filtering boundary and transformation coefficient values of pixels adjacent to the filtering boundary, and performs deblocking filtering with respect to image data decoded based on the filtering strength. Detailed operations about the deblocking filtering will be described in detail later with reference to
The video encoding apparatus 100 and the video decoding apparatus 200 use coding units according to a tree structure, which are independently determined according to regions, so as to consider characteristics of an image. A maximum height, a maximum width, and a maximum depth of coding units may be adaptively determined according to the characteristics of the image, or may be differently set by a user. Sizes of deeper coding units according to depths may be determined according to the predetermined maximum size of the coding unit.
In a hierarchical structure 600 of coding units, according to an exemplary embodiment, the maximum height and the maximum width of the coding units are each 64, and the maximum depth is 5. The maximum depth shown in
Since a depth increases along a vertical axis of the hierarchical structure 600, a height and a width of the deeper coding unit are each split. Also, a prediction unit or partitions, which are used for prediction encoding of each deeper coding unit, are shown along a horizontal axis of the hierarchical structure 600.
In other words, a coding unit 610 is a maximum coding unit in the hierarchical structure 600, wherein a depth is 0 and a size, i.e., a height by width, is 64×64. The depth increases along the vertical axis, and a coding unit 620 having a size of 32×32 and a depth of 1, a coding unit 630 having a size of 16×16 and a depth of 2, a coding unit 640 having a size of 8×8 and a depth of 3, and a coding unit 650 having a size of 4×4 and a depth of 4 exist. The coding unit 650 having the size of 4×4 and the depth of 4 is a minimum coding unit.
The partitions are arranged as prediction units of coding units, along the horizontal axis according to each depth. In other words, a prediction unit of the coding unit 610 having the size of 64×64 and the depth of 0 includes a partition 610 having a size of 64×64, partitions 612 having the size of 64×32, partitions 614 having the size of 32×64, or partitions 616 having the size of 32×32. In other words, the coding unit 610 may be a square data unit having a minimum size including the partitions 610, 612, 614, and 616.
Similarly, a prediction unit of the coding unit 620 having the size of 32×32 and the depth of 1 may include a partition 620 having a size of 32×32, partitions 622 having a size of 32×16, partitions 624 having a size of 16×32, and partitions 626 having a size of 16×16.
Similarly, a prediction unit of the coding unit 630 having the size of 16×16 and the depth of 2 may include a partition having a size of 16×16 included in the coding unit 630, partitions 632 having a size of 16×8, partitions 634 having a size of 8×16, and partitions 636 having a size of 8×8.
Similarly, a prediction unit of the coding unit 640 having the size of 8×8 and the depth of 3 may include a partition having a size of 8×8 included in the coding unit 640, partitions 642 having a size of 8×4, partitions 644 having a size of 4×8, and partitions 646 having a size of 4×4.
The coding unit 650 having the size of 4×4 and the depth of 4 is the minimum coding unit and a coding unit of the lowermost depth. A prediction unit of the coding unit 650 may include a partition 650 having a size of 4×4, partitions 652 having a size of 4×2, partitions 654 having a size of 2×4, and partitions 656 having a size of 2×2.
In order to determine the at least one coded depth of the coding units constituting the maximum coding unit 610, the coding unit determiner 110 of the video encoding apparatus 100 performs encoding for coding units corresponding to each depth included in the maximum coding unit 610.
A number of deeper coding units according to depths including data in the same range and the same size increases as the depth increases. For example, four coding units corresponding to a depth of 2 are included in data that is included in one coding unit corresponding to a depth of 1. Accordingly, in order to compare encoding results of the same data according to depths, the coding unit corresponding to the depth of 1 and four coding units corresponding to the depth of 2 are each encoded.
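The growth in the number of deeper coding units may be expressed by the following minimal sketch, which assumes that every split divides a coding unit into four coding units of the next depth. The function name units_covering is hypothetical.

```python
def units_covering(upper_depth, lower_depth):
    """Number of coding units of lower_depth covering one unit of upper_depth.

    Assumption for this sketch: each split yields four coding units.
    """
    if lower_depth < upper_depth:
        raise ValueError("lower_depth must not be smaller than upper_depth")
    return 4 ** (lower_depth - upper_depth)


if __name__ == "__main__":
    print(units_covering(1, 2))  # one depth-1 unit holds four depth-2 units
    print(units_covering(0, 2))
```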
In order to perform encoding for a current depth from among the depths, a least encoding error may be selected for the current depth by performing encoding for each prediction unit in the coding units corresponding to the current depth, along the horizontal axis of the hierarchical structure 600. Alternatively, the minimum encoding error may be searched for by comparing the least encoding errors according to depths, by performing encoding for each depth as the depth increases along the vertical axis of the hierarchical structure 600. A depth and a prediction unit having the minimum encoding error in the coding unit 610 may be selected as the coded depth and a partition type of the coding unit 610.
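The comparison of encoding errors across depths may be sketched as a recursive search, shown below under stated assumptions. The function choose_coded_depth, the cost callback, and the tree representation (an integer depth for an unsplit coding unit, a list of four subtrees for a split one) are hypothetical; the cost callback stands in for the least encoding error found over the prediction units of a coding unit, and ties are resolved in favor of not splitting.

```python
def choose_coded_depth(x, y, size, depth, max_depth, cost):
    """Return (encoding_error, tree) for the region at (x, y) of the given size.

    cost(x, y, size, depth) is a hypothetical callback returning the least
    encoding error of encoding the region as a single coding unit.
    """
    own_error = cost(x, y, size, depth)
    if depth == max_depth:
        return own_error, depth          # minimum coding unit: cannot split

    half = size // 2
    split_error, children = 0, []
    for dy in (0, half):
        for dx in (0, half):
            err, tree = choose_coded_depth(x + dx, y + dy, half,
                                           depth + 1, max_depth, cost)
            split_error += err
            children.append(tree)

    # Keep the current depth unless splitting gives a smaller total error.
    if own_error <= split_error:
        return own_error, depth
    return split_error, children


if __name__ == "__main__":
    # Toy cost: only large units anchored at the top-left corner are costly.
    def toy_cost(x, y, size, depth):
        return 100 if (x == 0 and y == 0 and size > 8) else 1

    print(choose_coded_depth(0, 0, 32, 0, 2, toy_cost))
```

With the toy cost above, the top-left region is split down to depth 2 while the other three regions keep a coded depth of 1, illustrating that the coded depth may be determined independently for each region.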
The video encoding apparatus 100 or the video decoding apparatus 200 encodes or decodes an image according to coding units having sizes smaller than or equal to a maximum coding unit for each maximum coding unit. Sizes of transformation units for transformation during encoding may be selected based on data units that are not larger than a corresponding coding unit.
For example, in the video encoding apparatus 100 or the video decoding apparatus 200, if a size of the coding unit 710 is 64×64, transformation may be performed by using the transformation units 720 having a size of 32×32.
Also, data of the coding unit 710 having the size of 64×64 may be encoded by performing the transformation on each of the transformation units having the size of 32×32, 16×16, 8×8, and 4×4, which are smaller than 64×64, and then a transformation unit having the least coding error may be selected.
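Selecting the transformation unit having the least coding error may be sketched as follows. The function select_transformation_unit and the error_of callback are hypothetical, the candidate sizes follow the sizes named above, and the error values in the example are made up solely for illustration.

```python
def select_transformation_unit(cu_size, error_of, candidates=(32, 16, 8, 4)):
    """Return the candidate transformation unit size with the least coding error.

    error_of(size) is a hypothetical callback giving the coding error obtained
    when the coding unit is transformed with transformation units of that size.
    Only sizes not larger than the coding unit are considered.
    """
    usable = [size for size in candidates if size <= cu_size]
    if not usable:
        raise ValueError("no candidate fits in the coding unit")
    return min(usable, key=error_of)


if __name__ == "__main__":
    # Illustrative error values only; not measured data.
    errors = {32: 5.0, 16: 3.0, 8: 4.0, 4: 6.0}
    print(select_transformation_unit(64, errors.get))
```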
The video encoding apparatus 100 may encode and transmit information 800 about a partition type, information 810 about a prediction mode, and information 820 about a size of a transformation unit for each coding unit corresponding to a coded depth, as encoding mode information about coding units according to a tree structure.
The information 800 indicates a partition type, that is, the shape into which a current coding unit is split to obtain a prediction unit (partition) for prediction encoding the current coding unit. For example, a current coding unit CU_0 having a size of 2N×2N and a depth of 0 may be used as a prediction unit after being split into any one of a partition 802 having a size of 2N×2N, a partition 804 having a size of 2N×N, a partition 806 having a size of N×2N, and a partition 808 having a size of N×N. Here, the information 800 about a partition type is set to indicate one of the partition 804 having a size of 2N×N, the partition 806 having a size of N×2N, and the partition 808 having a size of N×N.
The information 810 indicates a prediction mode of each coding unit. For example, the information 810 may indicate a mode of prediction encoding performed on a prediction unit indicated by the information 800, i.e., an intra mode 812, an inter mode 814, or a skip mode 816.
The information 820 indicates a transformation unit to be based on when transformation is performed on a current coding unit. For example, the transformation unit may have one of a first size 822 and a second size 824 of an intra mode, and a first size 826 and a second size 828 of an inter mode.
The receiving and extracting unit 210 of the video decoding apparatus 200 may extract the information 800, 810, and 820 according to each deeper coding unit, and the decoder 220 may use the information 800, 810, and 820 for decoding.
Split information may be used to indicate a change of a depth. The split information indicates whether a coding unit of a current depth is split into coding units of a lower depth.
A prediction unit 910 for prediction encoding a coding unit 900 having a depth of 0 and a size of 2N_0×2N_0 may include partitions of a partition type 912 having a size of 2N_0×2N_0, a partition type 914 having a size of 2N_0×N_0, a partition type 916 having a size of N_0×2N_0, and a partition type 918 having a size of N_0×N_0.
Prediction encoding is repeatedly performed on one prediction unit having a size of 2N_0×2N_0, two prediction units having a size of 2N_0×N_0, two prediction units having a size of N_0×2N_0, and four prediction units having a size of N_0×N_0, according to each partition type. The prediction encoding in an intra mode and an inter mode may be performed on the prediction units having the sizes of 2N_0×2N_0, N_0×2N_0, 2N_0×N_0, and N_0×N_0. The prediction encoding in a skip mode is performed only on the prediction unit having the size of 2N_0×2N_0.
If an encoding error is smallest in one of the partition types 912 through 916, the prediction unit 910 may not be split into a lower depth.
If the encoding error is the smallest in the partition type 918, a depth is changed from 0 to 1 to split the partition type 918 in operation 920, and encoding is repeatedly performed on coding units 930 having a depth of 1 and a size of N_0×N_0 to search for a minimum encoding error.
A prediction unit 940 for prediction encoding the coding unit 930 having a depth of 1 and a size of 2N_1×2N_1 (=N_0×N_0) may include partitions of a partition type 942 having a size of 2N_1×2N_1, a partition type 944 having a size of 2N_1×N_1, a partition type 946 having a size of N_1×2N_1, and a partition type 948 having a size of N_1×N_1. If an encoding error is the smallest in the partition type 948, a depth is changed from 1 to 2 to split the partition type 948 in operation 950, and encoding is repeatedly performed on coding units 960, which have a depth of 2 and a size of N_2×N_2 to search for a minimum encoding error.
When a maximum depth is d, the split operation according to each depth may be performed until the depth becomes d−1, and split information may be encoded for depths of 0 to d−2. In other words, when encoding is performed up to when the depth is d−1 after a coding unit corresponding to a depth of d−2 is split in operation 970, a prediction unit 990 for prediction encoding a coding unit 980 having a depth of d−1 and a size of 2N_(d−1)×2N_(d−1) may include partitions of a partition type 992 having a size of 2N_(d−1)×2N_(d−1), a partition type 994 having a size of 2N_(d−1)×N_(d−1), a partition type 996 having a size of N_(d−1)×2N_(d−1), and a partition type 998 having a size of N_(d−1)×N_(d−1).
Prediction encoding may be repeatedly performed on one prediction unit having a size of 2N_(d−1)×2N_(d−1), two prediction units having a size of 2N_(d−1)×N_(d−1), two prediction units having a size of N_(d−1)×2N_(d−1), four prediction units having a size of N_(d−1)×N_(d−1) from among the partition types 992 through 998 to search for a partition type having a minimum encoding error.
Even when the partition type 998 has the minimum encoding error, since a maximum depth is d, a coding unit CU_(d−1) having a depth of d−1 is no longer split to a lower depth, and a coded depth for the coding units constituting a current maximum coding unit 900 is determined to be d−1 and a partition type of the current maximum coding unit 900 may be determined to be N_(d−1)×N_(d−1). Also, since the maximum depth is d and a minimum coding unit 980 having a lowermost depth of d−1 is no longer split to a lower depth, split information for the minimum coding unit 980 is not set.
A data unit 999 may be a ‘minimum unit’ for the current maximum coding unit. A minimum unit according to an exemplary embodiment may be a square data unit obtained by splitting a minimum coding unit 980 by 4, i.e., may be a square data unit having a maximum size that may be included in the coding units of all coded depths, the prediction units, and the transformation units included in the maximum coding unit. By performing the encoding repeatedly, the video encoding apparatus 100 may select a depth having the least encoding error by comparing encoding errors according to depths of the coding unit 900 to determine a coded depth, and set a corresponding partition type and a prediction mode as an encoding mode of the coded depth.
As such, the minimum encoding errors according to depths are compared in all of the depths of 1 through d, and a depth having the least encoding error may be determined as a coded depth. The coded depth and a prediction of the coded depth may be encoded and transmitted as information about an encoding mode. Also, since a coding unit is split from a depth of 0 to a coded depth, only split information of the coded depth is set to 0, and split information of depths excluding the coded depth is set to 1.
The receiving and extracting unit 210 of the video decoding apparatus 200 may extract and use the information about the coded depth and the prediction unit of the coding unit 900 to decode the partition 912. The video decoding apparatus 200 may determine a depth, in which split information is 0, as a coded depth by using split information according to depths, and use information about an encoding mode of the corresponding depth to decode encoded data of the corresponding coding unit.
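The relationship between split information and the coded depth can be sketched as below. This is an illustrative reading of the description, assuming one flag per depth along the path from depth 0 to the coded depth and no flag at the lowermost depth d−1; the function names are hypothetical.

```python
def split_information(coded_depth, max_depth):
    """Split flags from depth 0 down to the coded depth.

    Flags are 1 at every depth above the coded depth and 0 at the coded depth.
    No flag is set at the lowermost depth (max_depth - 1), which cannot be split.
    """
    if not 0 <= coded_depth <= max_depth - 1:
        raise ValueError("coded depth must lie in 0 .. max_depth - 1")
    flags = [1] * coded_depth
    if coded_depth < max_depth - 1:
        flags.append(0)
    return flags


def coded_depth_from_flags(flags, max_depth):
    """Decoder side: the coded depth is the first depth whose split flag is 0.

    If every flag is 1, the lowermost depth (max_depth - 1) is assumed to be
    the coded depth, since no flag is set there.
    """
    for depth, flag in enumerate(flags):
        if flag == 0:
            return depth
    return max_depth - 1
```

With a maximum depth of 4, a coded depth of 2 yields the flags `[1, 1, 0]`, and a coded depth of 3 yields `[1, 1, 1]` with no flag for the lowermost depth.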
The coding units 1010 are coding units according to a tree structure determined by the video encoding apparatus 100, in a current maximum coding unit. The prediction units 1060 are prediction units of coding units of each coded depth in the coding units 1010, and the transformation units 1070 are transformation units of each of the coding units 1010.
When a depth of a maximum coding unit is 0 in the coding units 1010, the coding units 1010 include coding units 1012 and 1054 having a depth of 1, coding units 1014, 1016, 1018, 1028, 1050, and 1052 having a depth of 2, coding units 1020, 1022, 1024, 1026, 1030, 1032, and 1048 having a depth of 3, and coding units 1040, 1042, 1044, and 1046 having a depth of 4.
In the prediction units 1060, some coding units 1014, 1016, 1022, 1032, 1048, 1050, 1052, and 1054 are obtained by splitting the coding units in the coding units 1010. In other words, partition types in the coding units 1014, 1022, 1050, and 1054 have a size of 2N×N, partition types in the coding units 1016, 1048, and 1052 have a size of N×2N, and a partition type of the coding unit 1032 has a size of N×N. In other words, prediction units are smaller than or equal to each coding unit.
Transformation or inverse transformation is performed on image data of the coding unit 1052 in the transformation units 1070 in a data unit that is smaller than the coding unit 1052. Also, the coding units 1014, 1016, 1022, 1032, 1048, 1050, and 1052 in the transformation units 1070 are data units having different sizes or shapes from those in the prediction units 1060. In other words, transformation units and prediction units of one coding unit are independently determined. Accordingly, the video encoding and decoding apparatuses 100 and 200 may perform intra prediction, motion estimation, motion compensation, transformation, and inverse transformation individually on a data unit in the same coding unit.
Accordingly, encoding is recursively performed on each of coding units having a hierarchical structure in each region of a maximum coding unit to determine an optimum coding unit, and thus coding units having a recursive tree structure may be obtained.
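The recursive determination of coding units can be sketched as follows. This is a simplified illustration under stated assumptions: `cost` is a hypothetical caller-supplied encoding-error measure for coding a region as a single coding unit, and the tuple-based tree representation is invented for the example.

```python
def best_coding_tree(x, y, size, depth, max_depth, cost):
    """Return (total_cost, tree) for the square region at (x, y).

    tree is ("leaf", x, y, size, depth) or ("split", [four subtrees]).
    cost(x, y, size) is a hypothetical error measure, not defined by the text.
    """
    leaf_cost = cost(x, y, size)
    leaf = ("leaf", x, y, size, depth)
    if depth == max_depth - 1:
        # Lowermost depth: the coding unit is no longer split.
        return leaf_cost, leaf
    half = size // 2
    children = [
        best_coding_tree(x + dx, y + dy, half, depth + 1, max_depth, cost)
        for dy in (0, half)
        for dx in (0, half)
    ]
    split_cost = sum(child_cost for child_cost, _ in children)
    if split_cost < leaf_cost:
        return split_cost, ("split", [subtree for _, subtree in children])
    return leaf_cost, leaf


def toy_cost(x, y, size):
    """Toy error: large blocks covering the top-left corner are expensive."""
    return size * size if (x == 0 and y == 0 and size > 16) else 1


total, tree = best_coding_tree(0, 0, 64, 0, 3, toy_cost)
```

With the toy error measure, only the top-left quadrant is split a second time, so the resulting tree mixes coding units of depths 1 and 2 within one maximum coding unit.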
Encoding information may include split information about a coding unit, information about a partition type, information about a prediction mode, and information about a size of a transformation unit. Table 1 shows the encoding information that may be set by the video encoding and decoding apparatuses 100 and 200.
The transmitter 120 of the video encoding apparatus 100 may output the encoding information about the coding units having a tree structure, and the receiving and extracting unit 210 of the video decoding apparatus 200 may extract the encoding information about the coding units having a tree structure from a received bitstream.
Split information indicates whether a current coding unit is split into coding units of a lower depth. If split information of a current depth d is 0, a depth, in which a current coding unit is no longer split into a lower depth, is a coded depth, and thus information about a partition type, prediction mode, and a size of a transformation unit may be defined for the coded depth. If the current coding unit is further split according to the split information, encoding is independently performed on four split coding units of a lower depth.
A prediction mode may be one of an intra mode, an inter mode, and a skip mode. The intra mode and the inter mode may be defined in all partition types, and the skip mode is defined only in a partition type having a size of 2N×2N.
The information about the partition type may indicate symmetrical partition types having sizes of 2N×2N, 2N×N, N×2N, and N×N, and asymmetrical partition types having sizes of 2N×nU, 2N×nD, nL×2N, and nR×2N, which are obtained by asymmetrically splitting the height or width of the prediction unit. The asymmetrical partition types having the sizes of 2N×nU and 2N×nD may be respectively obtained by splitting the height of the prediction unit in 1:3 and 3:1, and the asymmetrical partition types having the sizes of nL×2N and nR×2N may be respectively obtained by splitting the width of the prediction unit in 1:3 and 3:1.
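As an illustration of the symmetrical and asymmetrical partition types, the sketch below lists the width and height of each partition of a 2N×2N prediction unit. The string labels and the ordering of partitions (top before bottom, left before right) are assumptions made for the example.

```python
def partition_dimensions(two_n, partition_type):
    """Return the (width, height) of each partition of a 2Nx2N prediction unit."""
    n = two_n // 2
    quarter = two_n // 4  # short side of a 1:3 or 3:1 split
    table = {
        "2Nx2N": [(two_n, two_n)],
        "2NxN": [(two_n, n)] * 2,
        "Nx2N": [(n, two_n)] * 2,
        "NxN": [(n, n)] * 4,
        "2NxnU": [(two_n, quarter), (two_n, two_n - quarter)],  # height split 1:3
        "2NxnD": [(two_n, two_n - quarter), (two_n, quarter)],  # height split 3:1
        "nLx2N": [(quarter, two_n), (two_n - quarter, two_n)],  # width split 1:3
        "nRx2N": [(two_n - quarter, two_n), (quarter, two_n)],  # width split 3:1
    }
    return table[partition_type]
```

For a 32×32 prediction unit, the type 2N×nU gives a 32×8 partition and a 32×24 partition, and every type covers the full 32×32 area.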
The size of the transformation unit may be set to be two types according to split information of the transformation unit. In other words, if the split information of the transformation unit is 0, the size of the transformation unit may be 2N×2N, which is the size of the current coding unit. If the split information of the transformation unit is 1, the transformation units may be obtained by splitting the current coding unit. Also, if a partition type of the current coding unit having the size of 2N×2N is a symmetrical partition type, a size of a transformation unit may be N×N, and if the partition type of the current coding unit is an asymmetrical partition type, the size of the transformation unit may be N/2×N/2.
The encoding information about coding units having a tree structure may include at least one of a coding unit corresponding to a coded depth, a prediction unit, and a minimum unit. The coding unit corresponding to the coded depth may include at least one of a prediction unit and a minimum unit containing the same encoding information.
Accordingly, it is determined whether adjacent data units are included in the same coding unit corresponding to the coded depth by comparing encoding information of the adjacent data units. Also, a corresponding coding unit corresponding to a coded depth is determined by using encoding information of a data unit, and thus a distribution of coded depths in a maximum coding unit may be determined.
Thus, if a current coding unit is predicted based on encoding information of adjacent data units, encoding information of data units in deeper coding units adjacent to the current coding unit may be directly referred to and used.
Alternatively, if a current coding unit is predicted based on encoding information of adjacent data units, data units adjacent to the current coding unit are searched using encoded information of the data units, and the searched adjacent coding units may be referred to for predicting the current coding unit.
A maximum coding unit 1300 includes coding units 1302, 1304, 1306, 1312, 1314, 1316, and 1318 of coded depths. Here, since the coding unit 1318 is a coding unit of a coded depth, split information may be set to 0. Information about a partition type of the coding unit 1318 having a size of 2N×2N may be set to be one of a partition type 1322 having a size of 2N×2N, a partition type 1324 having a size of 2N×N, a partition type 1326 having a size of N×2N, a partition type 1328 having a size of N×N, a partition type 1332 having a size of 2N×nU, a partition type 1334 having a size of 2N×nD, a partition type 1336 having a size of nL×2N, and a partition type 1338 having a size of nR×2N.
When the partition type is set to be symmetrical, i.e. the partition type 1322, 1324, 1326, or 1328, a transformation unit 1342 having a size of 2N×2N is set if split information (TU size flag) of a transformation unit is 0, and a transformation unit 1344 having a size of N×N is set if a TU size flag is 1.
When the partition type is set to be asymmetrical, i.e., the partition type 1332, 1334, 1336, or 1338, a transformation unit 1352 having a size of 2N×2N is set if a TU size flag is 0, and a transformation unit 1354 having a size of N/2×N/2 is set if a TU size flag is 1.
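The mapping from the TU size flag and the partition type to the transformation unit size can be written compactly. This is a minimal sketch of the rule stated above; the function and parameter names are illustrative.

```python
def transform_unit_size(two_n, symmetric_partition, tu_size_flag):
    """Side length of the transformation unit for a 2Nx2N coding unit.

    tu_size_flag 0 -> 2N x 2N
    tu_size_flag 1 -> N x N for symmetrical partition types,
                      N/2 x N/2 for asymmetrical partition types
    """
    if tu_size_flag == 0:
        return two_n
    n = two_n // 2
    return n if symmetric_partition else n // 2
```

For a 32×32 coding unit this gives 32 when the flag is 0, and 16 or 8 when the flag is 1, depending on whether the partition type is symmetrical or asymmetrical.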
Referring to
In detail, the deblocking filtering unit 130 may determine a filtering boundary based on boundaries of data units having a predetermined size or above from among the coding units, the prediction units, and the transformation units. In other words, referring to
Referring to
Meanwhile, the deblocking filtering unit 130 does not determine a boundary of data units having a predetermined size or above as a filtering boundary if the boundary is a frame boundary. In other words, deblocking filtering according to an exemplary embodiment is not performed on an outermost boundary corresponding to an edge of a picture.
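The boundary selection described above can be sketched as follows. This is an illustration under assumptions that the text does not fix: a data unit is represented as an (x, y, width, height) tuple, "a predetermined size or above" is read as both dimensions being at least `min_size`, and edges are reported as hypothetical tuples.

```python
def filtering_boundaries(units, min_size, pic_width, pic_height):
    """Collect candidate filtering boundaries from data-unit rectangles.

    units: iterable of (x, y, width, height) for coding, prediction, or
           transformation units (representation assumed for this sketch).
    Returns a set of ("V", x, y_start, y_end) and ("H", y, x_start, x_end).
    Edges lying on the picture boundary are excluded.
    """
    edges = set()
    for x, y, w, h in units:
        if w < min_size or h < min_size:
            continue  # unit is below the predetermined size
        for edge_x in (x, x + w):
            if 0 < edge_x < pic_width:
                edges.add(("V", edge_x, y, y + h))
        for edge_y in (y, y + h):
            if 0 < edge_y < pic_height:
                edges.add(("H", edge_y, x, x + w))
    return edges
```

For a 16×16 picture made of four 8×8 units and a predetermined size of 8, only the interior cross of edges is returned; the outermost edges of the picture are never selected.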
When filtering boundaries to which deblocking filtering is to be performed are determined based on boundaries of data units having a predetermined size or above, the deblocking filtering unit 130 determines filtering strength in the filtering boundaries based on a prediction mode of a coding unit to which adjacent pixels belong based on the filtering boundary and transformation coefficient values of pixels adjacent to the filtering boundary.
Hereinafter, a process of performing deblocking filtering based on filtering boundaries, such as a horizontal direction filtering boundary 1810 and a vertical direction filtering boundary 1820 between data unit 1840 and data unit 1850 in data unit 1800 of
Referring to
The deblocking filtering unit 130 determines filtering strength based on whether the prediction mode of the coding unit to which the adjacent pixels belong based on the filtering boundary is an intra mode or an inter mode, and whether the transformation coefficient values of the pixels adjacent to the filtering boundary are 0. When boundary strength (Bs) denotes filtering strength, Bs may be classified into 5 stages from 0 through 4. A size of Bs is proportional to the filtering strength. In other words, when Bs=4, filtering strength is the strongest and when Bs=0, filtering strength is the weakest. Here, deblocking filtering may not be performed when Bs=0.
In detail, when p0 and q0 denote pixels that are adjacent to the filtering boundary and are divided based on the filtering boundary, the deblocking filtering unit 130 may determine the filtering strength to have a value of Bs=4 when a prediction mode of at least one coding unit to which p0 and q0 belong is an intra mode and the filtering boundary is a boundary of coding units. For example, when deblocking filtering is performed based on the horizontal direction filtering boundary 1810 of
Alternatively, the deblocking filtering unit 130 determines the filtering strength to have a value of Bs=3 when the prediction mode of at least one of the coding units to which p0 and q0 belong is an intra mode and the filtering boundary is not a boundary of coding units.
Alternatively, the deblocking filtering unit 130 determines the filtering strength to have a value of Bs=2 when the prediction modes of the coding units to which p0 and q0 belong are not intra modes and a transformation coefficient value of at least one of transformation units to which p0 and q0 belong is not 0.
Alternatively, the deblocking filtering unit 130 determines the filtering strength to have a value of Bs=1 when the prediction modes of the coding units to which p0 and q0 belong are not intra modes, the transformation coefficient values of the transformation units to which p0 and q0 belong are 0, and at least one of a reference frame and a motion vector used for motion prediction differs between the prediction units to which p0 and q0 belong.
Alternatively, the deblocking filtering unit 130 determines the filtering strength to have a value of Bs=0 when the prediction modes of the coding units to which p0 and q0 belong are not intra modes, the transformation coefficient values of the transformation units to which p0 and q0 belong are 0, and the reference frame and the motion vector used for motion prediction of the prediction units to which p0 and q0 belong are the same.
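The five cases above amount to a short decision cascade. The sketch below is one way to express it; the boolean parameters are hypothetical summaries of the information about the two sides of the boundary.

```python
def boundary_strength(p_intra, q_intra, is_cu_boundary,
                      p_has_coeff, q_has_coeff,
                      same_reference_frame, same_motion_vector):
    """Boundary strength Bs (0..4) for the pixels p0 and q0 across a boundary.

    p_intra / q_intra: the coding unit containing p0 / q0 is intra predicted.
    p_has_coeff / q_has_coeff: the transformation unit containing p0 / q0
        has a non-zero transformation coefficient.
    """
    if p_intra or q_intra:
        return 4 if is_cu_boundary else 3
    if p_has_coeff or q_has_coeff:
        return 2
    if not (same_reference_frame and same_motion_vector):
        return 1
    return 0
```

The order of the tests mirrors the order of the cases in the description: intra prediction is checked first, then non-zero coefficients, then motion information.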
Meanwhile, the deblocking filtering unit 130 may determine whether to perform deblocking filtering on a filtering boundary based on the filtering strength and a result of comparing predetermined threshold values with absolute values of differences between pixel values of a predetermined number of pixels adjacent to the filtering boundary. In detail, the deblocking filtering unit 130 determines to perform deblocking filtering only when an absolute value of a difference between pixel values of pixels adjacent to the filtering boundary and divided based on the filtering boundary and an absolute value of a difference between pixel values of pixels adjacent to the same side based on the filtering boundary are smaller than predetermined threshold values determined according to a quantization parameter of transformation units to which the pixels belong, and the filtering strength is not the weakest. For example, the deblocking filtering unit 130 may perform deblocking filtering on a filtering boundary only when i) the filtering strength Bs is not 0 and ii) a condition of |p0−q0|<α; |p1−p0|<β; |q1−q0|<β is satisfied. Here, the threshold values may be predetermined based on quantization parameters used during quantization of transformation units to which p0 and q0 belong.
Referring to
As such, the deblocking filtering unit 130 determines whether to perform deblocking filtering on a filtering boundary based on quantization parameters of transformation units to which p0 and q0 belong and predetermined offset values.
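The on/off decision can be sketched as a single predicate. In this illustration the thresholds α and β are passed in as parameters, because their derivation from the quantization parameters and offset values is not reproduced here; the same-side comparisons follow the prose description (p1 against p0, and q1 against q0).

```python
def should_filter(bs, p1, p0, q0, q1, alpha, beta):
    """Decide whether a boundary sample line is deblocking filtered.

    bs: boundary strength (0..4).
    alpha, beta: thresholds assumed to be derived elsewhere from the
        quantization parameters of the transformation units.
    """
    return (
        bs != 0
        and abs(p0 - q0) < alpha   # step across the boundary
        and abs(p1 - p0) < beta    # activity on the p side
        and abs(q1 - q0) < beta    # activity on the q side
    )
```

A small step across the boundary with flat neighbours passes the test, whereas a large step (likely a real image edge) or Bs=0 leaves the pixels untouched.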
With respect to a boundary to which deblocking filtering is to be performed, the deblocking filtering unit 130 determines a number and filter tap coefficients of pixels to be filtered adjacent to a filtering boundary, based on filtering strength, an absolute value of a difference between pixel values of pixels adjacent to the filtering boundary and divided based on the filtering boundary, and an absolute value of a difference between pixel values of pixels adjacent to the same side based on the filtering boundary. Also, the deblocking filtering unit 130 performs filtering by changing pixel values of pixels to be filtered via a weighted sum based on the filter tap coefficients.
In detail, when filtering strength of a current filtering boundary is Bs<4, the deblocking filtering unit 130 generates p1′, p0′, q0′, and q1′ by using a 4-tap finite impulse response (FIR) filter using p1, p0, q0, and q1 as inputs. The deblocking filtering unit 130 generates a value of delta Δ according to an equation: Δ=clip3(−tc, tc, ((((q0−p0)<<2)+(p1−q1)*4)>>3)). Here, tc may be determined based on |p2−p0|, |q2−q0| and a threshold value β.
Also, the deblocking filtering unit 130 generates deblocking-filtered pixel values of p0 and q0, which are nearest to a filtering boundary, according to equations p0′=p0+Δ and q0′=q0+Δ. Pixel values of p1 and q1, which are the next nearest to the filtering boundary after p0 and q0, are changed according to equations p1′=p1+Δ/2 and q1′=q1+Δ/2.
Meanwhile, when filtering strength has a value of Bs=4 in a current filtering boundary, the deblocking filtering unit 130 determines a number and filter tap coefficients of pixels that are filtered adjacent to the filtering boundary based on an absolute value of a difference between pixel values of pixels adjacent to the filtering boundary and divided based on the filtering boundary and an absolute value of a difference between pixel values of pixels adjacent to the same side based on the filtering boundary. In detail, when |p2−p0|<β, |p0−q0|<round(α/4), the deblocking filtering unit 130 sets input pixel values to be p2, p1, p0, q0, and q1 with respect to p0 nearest to the filtering boundary, and generates p0′ that is filtered by using a 5-tap filter having a filter tap coefficient of {1,2,2,2,1}.
With respect to p1 nearest to the filtering boundary after p0, the deblocking filtering unit 130 sets input pixel values to be p2, p1, p0, and q1, and generates p1′ filtered by using a 4-tap filter having a filter tap coefficient of {1,1,1,1}.
With respect to p2 nearest to the filtering boundary after p1, the deblocking filtering unit 130 sets input pixel values to be p3, p2, p1, p0, and q1, and generates p2′ filtered by using a 5-tap filter having a filter tap coefficient of {2, 3, 1, 1, 1}.
Similarly, when |q2−q0|<β; |p0−q0|<round(α/4) the deblocking filtering unit 130 sets input pixel values to be q2, q1, q0, p0, and p1 with respect to q0 nearest to the filtering boundary, and generates q0′ filtered by using a 5-tap filter having a filter tap coefficient of {1,2,2,2,1}.
With respect to q1 nearest to the filtering boundary after q0, the deblocking filtering unit 130 sets input pixel values to be q2, q1, q0, and p1, and generates q1′ filtered by using a 4-tap filter having a filter tap coefficient of {1,1,1,1}.
With respect to q2 nearest to the filtering boundary after q1, the deblocking filtering unit 130 sets input pixel values to be q3, q2, q1, q0, and p0, and generates q2′ filtered by using a 5-tap filter having a filter tap coefficient of {2, 3, 1, 1, 1}.
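The Bs=4 filtering of the p side can be sketched as a set of weighted sums using the inputs and tap coefficients listed above. The normalization by the sum of the coefficients and the round-to-nearest integer arithmetic are assumptions of this sketch, since the description gives only the taps; the q side is obtained by mirroring the inputs.

```python
def weighted_tap(pixels, taps):
    """Weighted sum of pixels, normalized by the tap sum with rounding.

    Normalization and rounding are assumptions; the text lists only the taps.
    """
    total = sum(taps)
    return (sum(p * t for p, t in zip(pixels, taps)) + total // 2) // total


def strong_filter_p_side(p3, p2, p1, p0, q0, q1):
    """Return (p2', p1', p0') using the inputs and taps as listed in the text."""
    p0_new = weighted_tap((p2, p1, p0, q0, q1), (1, 2, 2, 2, 1))
    p1_new = weighted_tap((p2, p1, p0, q1), (1, 1, 1, 1))
    p2_new = weighted_tap((p3, p2, p1, p0, q1), (2, 3, 1, 1, 1))
    return p2_new, p1_new, p0_new
```

For a flat region the pixel values are unchanged, and for a step between a p side of 60 and a q side of 68 the p-side pixels ramp gradually toward the boundary.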
The deblocking filtering unit 230 of the video decoding apparatus 200 according to an exemplary embodiment determines a filtering boundary to which deblocking filtering is to be performed from boundaries of at least one data unit from among coding units according to a tree structure, prediction units, and transformation units, determines filtering strength in the filtering boundary based on a prediction mode of coding units to which adjacent pixels belong based on the filtering boundary and transformation coefficient values of pixels adjacent to the filtering boundary, and performs deblocking filtering on decoded image data based on the filtering strength, by using information about deblocking filtering parsed from a bitstream. Since operations of the deblocking filtering unit 230 of the video decoding apparatus 200 are similar to those of the deblocking filtering unit 130 of the video encoding apparatus 100, detailed descriptions thereof will not be repeated.
Referring to
In operation 2330, the deblocking filtering unit 130 determines a filtering boundary on which deblocking filtering is to be performed based on at least one data unit from among the coding units, prediction units, and the transformation units. As described above, the filtering boundary may be determined based on a boundary of data units having a predetermined size or above.
The deblocking filtering unit 130 determines filtering strength at the filtering boundary based on a prediction mode of a coding unit to which adjacent pixels belong based on the filtering boundary, and transformation coefficient values of pixels adjacent to the filtering boundary in operation 2340, and performs deblocking filtering based on the determined filtering strength in operation 2350.
Referring to
In operation 2420, the decoder 220 determines prediction units and transformation units for prediction and transformation according to the coding units and decodes the encoded image data, based on the encoding mode information about the coding units according to the tree structure.
In operation 2430, the deblocking filtering unit 230 determines a filtering boundary to which deblocking filtering is to be performed from among boundaries of at least one data unit from among the coding units according to the tree structure, the prediction units, and the transformation units, by using the information about the deblocking filtering.
In operation 2440, the deblocking filtering unit 230 determines filtering strength of the filtering boundary based on a prediction mode of a coding unit to which adjacent pixels based on the determined filtering boundary belong and transformation coefficient values of pixels adjacent to the filtering boundary.
In operation 2450, the deblocking filtering unit 230 performs deblocking filtering on the decoded image data based on the determined filtering strength.
The exemplary embodiments can be written as computer programs and can be implemented in general-use digital computers that execute the programs using a computer readable recording medium. Examples of the computer readable recording medium include magnetic storage media (e.g., ROM, floppy disks, hard disks, etc.) and optical recording media (e.g., CD-ROMs, or DVDs).
While the exemplary embodiments have been particularly shown and described, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the inventive concept as defined by the appended claims. The exemplary embodiments should be considered in a descriptive sense only and not for purposes of limitation.
This application is a Continuation of U.S. patent application Ser. No. 15/647,720 filed Jul. 12, 2017, which is a Continuation of U.S. patent application Ser. No. 13/641,403 filed Dec. 11, 2012, and issued as U.S. Pat. No. 9,712,822, which is a National Stage application under 35 U.S.C. § 371 of PCT/KR2011/002647 filed on Apr. 13, 2011, which claims priority from U.S. Provisional Application No. 61/323,449, filed on Apr. 13, 2010 in the United States Patent and Trademark Office, the disclosures of which are incorporated herein by reference in their entirety.
Number | Date | Country
---|---|---
61323449 | Apr 2010 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 15647720 | Jul 2017 | US
Child | 16016031 | | US
Parent | 13641403 | Dec 2012 | US
Child | 15647720 | | US