The present invention relates generally to digital video signal processing and, in particular, to a method, apparatus and system for generating intra-predicted samples for a video frame of video data. The present invention also relates to a computer program product including a computer readable medium having recorded thereon a computer program for generating intra-predicted samples for a video frame of video data.
Many applications for video coding currently exist, including applications for transmission and storage of video data. Many video coding standards have also been developed and others are currently in development. Recent developments in video coding standardisation have led to the formation of a group called the “Joint Collaborative Team on Video Coding” (JCT-VC). The Joint Collaborative Team on Video Coding (JCT-VC) includes members of Study Group 16, Question 6 (SG16/Q6) of the Telecommunication Standardisation Sector (ITU-T) of the International Telecommunication Union (ITU), known as the Video Coding Experts Group (VCEG), and members of the International Organisation for Standardisation/International Electrotechnical Commission Joint Technical Committee 1/Subcommittee 29/Working Group 11 (ISO/IEC JTC1/SC29/WG11), also known as the Moving Picture Experts Group (MPEG).
The Joint Collaborative Team on Video Coding (JCT-VC) has the goal of producing a new video coding standard to significantly outperform a presently existing video coding standard, known as “H.264/MPEG-4 AVC”. The performance of a video coding standard is measured in multiple ways. A measure of the complexity of the algorithms present in or proposed for a video coding standard is used to estimate the incremental cost or saving of introducing a particular algorithm into the video coding standard. One simple measure of complexity is the run-time of a software implementation of the video coding standard. A measure of the ability of an implementation of a video coding standard to compactly represent uncompressed video data is known as the ‘coding efficiency’. Implementations of video coding standards typically introduce distortion into the decompressed video data. This is known as ‘lossy’ compression and enables higher coding efficiency to be achieved. As a result, the measure of coding efficiency must consider both a measure of distortion (e.g. PSNR) and a measure of bit-rate for the compressed video data (the ‘bitstream’). The H.264/MPEG-4 AVC standard is itself a large improvement on previous video coding standards, such as MPEG-4 and ITU-T H.263. The new video coding standard under development has been named “high efficiency video coding (HEVC)”. Further development of high efficiency video coding (HEVC) is directed towards introducing support of different representations of chroma information present in video data, known as ‘chroma formats’. The Joint Collaborative Team on Video Coding (JCT-VC) is also considering implementation challenges arising from technology proposed for high efficiency video coding (HEVC) that create difficulties when scaling implementations of the standard to operate at high resolutions in real-time or high frame rates.
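By way of illustration only, the distortion measure mentioned above (PSNR) may be computed as in the following minimal sketch. The function name `psnr`, the default bit depth of 8 and the list-of-rows sample layout are assumptions made for this example and are not taken from any video coding standard.

```python
import math

def psnr(original, decoded, bit_depth=8):
    """Peak signal-to-noise ratio, in dB, between two equally sized 2D sample arrays."""
    total = 0
    count = 0
    for row_o, row_d in zip(original, decoded):
        for o, d in zip(row_o, row_d):
            total += (o - d) ** 2
            count += 1
    mse = total / count  # mean squared error over all samples
    if mse == 0:
        return math.inf  # identical arrays: no distortion
    peak = (1 << bit_depth) - 1  # largest representable sample value
    return 10.0 * math.log10(peak * peak / mse)
```

A coding efficiency comparison would pair such a distortion value with the bit-rate of the bitstream that produced the decoded samples.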
The complexity of algorithms present in high efficiency video coding (HEVC) affects implementations, for example, the circuit size of hardware implementations.
One aspect of the coding efficiency achievable with a particular video coding standard is the characteristics of available prediction methods. For video coding standards intended for compressing sequences of two-dimensional video frames, there are two types of prediction: intra-prediction and inter-prediction. Intra-prediction methods allow content of one part of a video frame to be predicted from other parts of the same video frame. Intra-prediction methods typically produce a block having a directional texture, with an intra-prediction mode specifying the direction of the texture and neighbouring samples within a frame used as a basis to produce the texture. Inter-prediction methods allow the content of a block within a video frame to be predicted from blocks in previous video frames. The previous video frames may be referred to as ‘reference frames’. The first video frame within a sequence of video frames typically uses intra-prediction for all blocks within the frame, as no prior frame is available for reference. Subsequent video frames may use one or more previous video frames from which to predict blocks. To achieve the highest coding efficiency, the prediction method that produces a predicted block that is closest to the original video data is typically used. The remaining difference between the predicted block and the original video data is known as the ‘residue’. A lossy representation of the residue, known as the ‘residual’, may be stored in the bitstream. The amount of lossiness in the residual affects the distortion of video data decoded from the bitstream compared to the original video data and the size of the bitstream.
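The selection of the closest predicted block and the formation of the residue may be sketched as follows. The sum of absolute differences is used here as the closeness measure purely as an assumption for the example (practical encoders may use other criteria), and the names `sad` and `select_prediction` are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def select_prediction(original, candidates):
    """Return (name, residue) for the candidate block closest to the original.

    `candidates` maps a hypothetical prediction-method name to its predicted block.
    """
    best = min(candidates, key=lambda name: sad(original, candidates[name]))
    predicted = candidates[best]
    # The residue is the sample-wise difference left over after prediction.
    residue = [[o - p for o, p in zip(row_o, row_p)]
               for row_o, row_p in zip(original, predicted)]
    return best, residue
```

For example, with an original block `[[10, 12], [14, 16]]` and candidates `{"intra": [[10, 10], [10, 10]], "inter": [[10, 12], [13, 16]]}`, the sketch selects `"inter"` and leaves a residue of `[[0, 0], [1, 0]]`.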
The ‘chroma formats’, used to represent video data, specify the sample aspect ratio between a luma and multiple chroma channels of the video data. The aspect ratio implies a fixed relationship between collocated block sizes for luma and chroma for each chroma format. The fixed relationships also affect the available transform sizes used for the luma channel and chroma channels of a collocated block. When video data is represented using a “4:2:2” chroma format, a non-square relationship exists between the luma samples and the chroma samples.
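The fixed relationship between collocated luma and chroma block sizes may be illustrated with the following minimal sketch. The subsampling factors for the 4:2:0 and 4:4:4 chroma formats are included for comparison and, like the name `chroma_block_size`, are assumptions of the example rather than part of the description above.

```python
# (horizontal divisor, vertical divisor) of each chroma channel relative to luma.
CHROMA_SUBSAMPLING = {
    "4:2:0": (2, 2),  # half width, half height: square relationship
    "4:2:2": (2, 1),  # half width, full height: non-square relationship
    "4:4:4": (1, 1),  # same size as luma
}

def chroma_block_size(luma_width, luma_height, chroma_format):
    """Return the (width, height) of the chroma block collocated with a luma block."""
    sub_w, sub_h = CHROMA_SUBSAMPLING[chroma_format]
    return luma_width // sub_w, luma_height // sub_h
```

Under these assumptions a 16×16 luma block has a collocated 8×16 chroma block in the 4:2:2 chroma format, which is the non-square case discussed here.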
One consequence of the non-square block size used in the chroma channels for a 4:2:2 chroma format is that the directional texture of the intra-prediction operation is distorted in a chroma channel, compared to the luma channel. The distortion reduces the accuracy of the predicted block for a chroma channel. To compensate for the distortion, an increase in the size of the residual for a chroma channel is required. The increase results in an undesirable reduction in the coding efficiency achieved by implementations of the video coding standard.
It is an object of the present invention to substantially overcome, or at least ameliorate, one or more disadvantages of existing arrangements.
According to one aspect of the present disclosure there is provided a method of generating intra-predicted samples for a chroma channel of a video bitstream configured for a 4:2:2 chroma format, the method comprising:
determining an intra-prediction angle from an intra-prediction mode for the chroma channel, the intra-prediction mode being one of a plurality of horizontal intra-prediction modes;
adjusting the intra-prediction angle due to the 4:2:2 chroma format;
modifying a change threshold between the horizontal intra-prediction modes and vertical intra-prediction modes if the adjusted angle exceeds a predetermined value, the modified change threshold being configured for converting the adjusted intra-prediction angle from one of the plurality of horizontal intra-prediction modes to a vertical intra-prediction mode; and
generating intra-predicted samples using a vertical intra-prediction mode according to the adjusted intra-prediction angle, and the change threshold.
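One possible reading of the adjusting and converting steps recited above is sketched below; it is not a definitive implementation. The sketch assumes that angles are expressed in 1/32-sample units, that the adjustment for the 4:2:2 chroma format doubles a horizontal-mode angle (because chroma has half the horizontal and the full vertical resolution of luma), and that the predetermined value is 32. The name `adjust_horizontal_angle_422`, the reciprocal conversion and the clamping of positive angles are all hypothetical choices made for the example.

```python
ANGLE_UNIT = 32  # assumption: angles are expressed in 1/32-sample steps

def adjust_horizontal_angle_422(angle):
    """Map a horizontal-mode angle to a (direction, angle) pair for 4:2:2 chroma."""
    adjusted = 2 * angle  # assumed adjustment: slope doubles in chroma sample units
    if abs(adjusted) <= ANGLE_UNIT:
        return "horizontal", adjusted  # still representable as a horizontal mode
    if adjusted > 0:
        # Assumption of this sketch: steeper bottom-left directions have no
        # vertical equivalent, so the angle is clamped instead of converted.
        return "horizontal", ANGLE_UNIT
    # Beyond the threshold: re-express the direction as a vertical mode whose
    # angle is the rounded reciprocal slope, keeping the negative sign.
    magnitude = -adjusted
    vertical = (ANGLE_UNIT * ANGLE_UNIT + magnitude // 2) // magnitude
    return "vertical", -vertical
```

Under these assumptions a horizontal angle of -13 remains horizontal with an adjusted angle of -26, whereas a horizontal angle of -32 is converted to a vertical angle of -16.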
According to another aspect of the present disclosure there is provided a system for generating intra-predicted samples for a chroma channel of a video bitstream configured for a 4:2:2 chroma format, the system comprising:
a memory for storing data and a computer program;
a processor coupled to the memory for executing said computer program, said computer program comprising instructions for:
According to another aspect of the present disclosure there is provided an apparatus for generating intra-predicted samples for a chroma channel of a video bitstream configured for a 4:2:2 chroma format, the apparatus comprising:
means for determining an intra-prediction angle from an intra-prediction mode for the chroma channel, the intra-prediction mode being one of a plurality of horizontal intra-prediction modes;
means for adjusting the intra-prediction angle due to the 4:2:2 chroma format;
means for modifying a change threshold between the horizontal intra-prediction modes and vertical intra-prediction modes if the adjusted angle exceeds a predetermined value, the modified change threshold being configured for converting the adjusted intra-prediction angle from one of the plurality of horizontal intra-prediction modes to a vertical intra-prediction mode; and
means for generating intra-predicted samples using a vertical intra-prediction mode according to the adjusted intra-prediction angle, and the change threshold.
According to still another aspect of the present disclosure there is provided a computer readable medium comprising a computer program for generating intra-predicted samples for a chroma channel of a video bitstream configured for a 4:2:2 chroma format, the program comprising:
code for determining an intra-prediction angle from an intra-prediction mode for the chroma channel, the intra-prediction mode being one of a plurality of horizontal intra-prediction modes;
code for adjusting the intra-prediction angle due to the 4:2:2 chroma format;
code for modifying a change threshold between the horizontal intra-prediction modes and vertical intra-prediction modes if the adjusted angle exceeds a predetermined value, the modified change threshold being configured for converting the adjusted intra-prediction angle from one of the plurality of horizontal intra-prediction modes to a vertical intra-prediction mode; and
code for generating intra-predicted samples using a vertical intra-prediction mode according to the adjusted intra-prediction angle, and the change threshold.
According to still another aspect of the present disclosure there is provided a method of generating intra-predicted samples for a chroma channel of a video bitstream configured for a 4:2:2 chroma format, the method comprising:
determining an intra-prediction angle from an intra-prediction mode for the chroma channel, the intra-prediction mode being one of a plurality of horizontal intra-prediction modes;
adjusting the intra-prediction angle due to the 4:2:2 chroma format;
modifying a change threshold between the horizontal intra-prediction modes and vertical intra-prediction modes if the bitstream is configured for a 4:2:2 chroma format, the modified change threshold being configured for converting the adjusted intra-prediction angle from one of the plurality of horizontal intra-prediction modes to a vertical intra-prediction mode; and
generating intra-predicted samples using a vertical intra-prediction mode according to the adjusted intra-prediction angle, and the change threshold.
According to still another aspect of the present disclosure there is provided a system for generating intra-predicted samples for a chroma channel of a video bitstream configured for a 4:2:2 chroma format, the system comprising:
a memory for storing data and a computer program;
a processor coupled to the memory for executing said computer program, said computer program comprising instructions for:
According to still another aspect of the present disclosure there is provided an apparatus for generating intra-predicted samples for a chroma channel of a video bitstream configured for a 4:2:2 chroma format, the apparatus comprising:
means for determining an intra-prediction angle from an intra-prediction mode for the chroma channel, the intra-prediction mode being one of a plurality of horizontal intra-prediction modes;
means for adjusting the intra-prediction angle due to the 4:2:2 chroma format;
means for modifying a change threshold between the horizontal intra-prediction modes and vertical intra-prediction modes if the bitstream is configured for a 4:2:2 chroma format, the modified change threshold being configured for converting the adjusted intra-prediction angle from one of the plurality of horizontal intra-prediction modes to a vertical intra-prediction mode; and
means for generating intra-predicted samples using a vertical intra-prediction mode according to the adjusted intra-prediction angle, and the change threshold.
According to still another aspect of the present disclosure there is provided a computer readable medium comprising a computer program for generating intra-predicted samples for a chroma channel of a video bitstream configured for a 4:2:2 chroma format, the program comprising:
code for determining an intra-prediction angle from an intra-prediction mode for the chroma channel, the intra-prediction mode being one of a plurality of horizontal intra-prediction modes;
code for adjusting the intra-prediction angle due to the 4:2:2 chroma format;
code for modifying a change threshold between the horizontal intra-prediction modes and vertical intra-prediction modes if the bitstream is configured for a 4:2:2 chroma format, the modified change threshold being configured for converting the adjusted intra-prediction angle from one of the plurality of horizontal intra-prediction modes to a vertical intra-prediction mode; and
code for generating intra-predicted samples using a vertical intra-prediction mode according to the adjusted intra-prediction angle, and the change threshold.
According to still another aspect of the present disclosure there is provided a method of generating intra-predicted samples for a chroma channel of a video bitstream, the method comprising:
decoding a chroma format of the video bitstream;
adjusting an intra-prediction mode prior to determining an angle parameter, wherein the adjustment is dependent on the decoded chroma format;
determining the angle parameter from the adjusted intra-prediction mode;
generating reference samples using the determined angle parameter; and
generating intra-predicted samples using the determined angle parameter and the generated reference samples.
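The ordering recited above, in which the intra-prediction mode is adjusted before the angle parameter is determined, may be sketched as follows. The angle table lists the angle parameters of angular modes 2 to 34 as used in HEVC-style intra-prediction, which is an assumption relative to this description. The remapping table `remap_422` is supplied by the caller, and the single entry shown in the usage note is purely hypothetical.

```python
# Angle parameters for angular modes 2..34, in 1/32-sample units (assumed table).
INTRA_PRED_ANGLE = [
    32, 26, 21, 17, 13, 9, 5, 2, 0, -2, -5, -9, -13, -17, -21, -26,
    -32, -26, -21, -17, -13, -9, -5, -2, 0, 2, 5, 9, 13, 17, 21, 26, 32,
]

def angle_parameter(mode):
    """Look up the angle parameter of an angular intra-prediction mode (2..34)."""
    return INTRA_PRED_ANGLE[mode - 2]

def chroma_angle_parameter(mode, chroma_format, remap_422):
    """Adjust the mode for the decoded chroma format, then determine the angle."""
    if chroma_format == "4:2:2":
        # Hypothetical remapping table; modes without an entry are unchanged.
        mode = remap_422.get(mode, mode)
    return angle_parameter(mode)
```

For example, with the illustrative table `{18: 21}`, mode 18 yields an angle parameter of -17 for a 4:2:2 bitstream and -32 for any other chroma format. Because the adjustment is applied to the mode, the subsequent angle lookup and sample generation need no chroma-format-specific handling.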
According to still another aspect of the present disclosure there is provided a system for generating intra-predicted samples for a chroma channel of a video bitstream, the system comprising:
a memory for storing data and a computer program;
a processor coupled to the memory for executing said computer program, said computer program comprising instructions for:
According to still another aspect of the present disclosure there is provided an apparatus for generating intra-predicted samples for a chroma channel of a video bitstream, the apparatus comprising:
means for decoding a chroma format of the video bitstream;
means for adjusting an intra-prediction mode prior to determining an angle parameter, wherein the adjustment is dependent on the decoded chroma format;
means for determining the angle parameter from the adjusted intra-prediction mode;
means for generating reference samples using the determined angle parameter; and
means for generating intra-predicted samples using the determined angle parameter and the generated reference samples.
According to still another aspect of the present disclosure there is provided a computer readable medium comprising a computer program for generating intra-predicted samples for a chroma channel of a video bitstream, the computer program comprising:
code for decoding a chroma format of the video bitstream;
code for adjusting an intra-prediction mode prior to determining an angle parameter, wherein the adjustment is dependent on the decoded chroma format;
code for determining the angle parameter from the adjusted intra-prediction mode;
code for generating reference samples using the determined angle parameter; and
code for generating intra-predicted samples using the determined angle parameter and the generated reference samples.
Other aspects are also disclosed.
At least one embodiment of the present invention will now be described with reference to the following drawings and appendices, in which:
Appendix A shows an example of the method of generating intra-predicted samples that accords with
Appendix B shows an example of the method of generating intra-predicted samples that accords with
Appendix C shows an example of the method of generating intra-predicted samples that accords with
Appendix D shows an example of the method of generating intra-predicted samples that accords with
Where reference is made in any one or more of the accompanying drawings to steps and/or features, which have the same reference numerals, those steps and/or features have for the purposes of this description the same function(s) or operation(s), unless the contrary intention appears.
As shown in
The destination device 130 includes a receiver 132, a video decoder 134 and a display device 136. The receiver 132 receives encoded video data from the communication channel 120 and passes received video data to the video decoder 134. The video decoder 134 then outputs decoded frame data to the display device 136. Examples of the display device 136 include a cathode ray tube, a liquid crystal display, such as in smart-phones, tablet computers, computer monitors or in stand-alone television sets. It is also possible for the functionality of each of the source device 110 and the destination device 130 to be embodied in a single device.
Notwithstanding the example devices mentioned above, each of the source device 110 and destination device 130 may be configured within a general purpose computing system, typically through a combination of hardware and software components.
The computer module 201 typically includes at least one processor unit 205, and a memory unit 206. For example, the memory unit 206 may have semiconductor random access memory (RAM) and semiconductor read only memory (ROM). The computer module 201 also includes a number of input/output (I/O) interfaces including: an audio-video interface 207 that couples to the video display 214, loudspeakers 217 and microphone 280; an I/O interface 213 that couples to the keyboard 202, mouse 203, scanner 226, camera 227 and optionally a joystick or other human interface device (not illustrated); and an interface 208 for the external modem 216 and printer 215. In some implementations, the modem 216 may be incorporated within the computer module 201, for example within the interface 208. The computer module 201 also has a local network interface 211, which permits coupling of the computer system 200 via a connection 223 to a local-area communications network 222, known as a Local Area Network (LAN). As illustrated in
The I/O interfaces 208 and 213 may afford either or both of serial and parallel connectivity, the former typically being implemented according to the Universal Serial Bus (USB) standards and having corresponding USB connectors (not illustrated). Storage devices 209 are provided and typically include a hard disk drive (HDD) 210. Other storage devices such as a floppy disk drive and a magnetic tape drive (not illustrated) may also be used. An optical disk drive 212 is typically provided to act as a non-volatile source of data. Portable memory devices, such as optical disks (e.g. CD-ROM, DVD, Blu-ray Disc™), USB-RAM, portable, external hard drives, and floppy disks, for example, may be used as appropriate sources of data to the computer system 200. Typically, any of the HDD 210, optical drive 212, networks 220 and 222 may also be configured to operate as the video source 112, or as a destination for decoded video data to be stored for reproduction via the display 214.
The components 205 to 213 of the computer module 201 typically communicate via an interconnected bus 204 and in a manner that results in a conventional mode of operation of the computer system 200 known to those in the relevant art. For example, the processor 205 is coupled to the system bus 204 using a connection 218. Likewise, the memory 206 and optical disk drive 212 are coupled to the system bus 204 by connections 219. Examples of computers on which the described arrangements can be practised include IBM-PC's and compatibles, Sun SPARCstations, Apple Mac™ or alike computer systems.
Where appropriate or desired, the video encoder 114 and the video decoder 134, as well as methods described below, may be implemented using the computer system 200 wherein the video encoder 114, the video decoder 134 and the process of
The software may be stored in a computer readable medium, including the storage devices described below, for example. The software is loaded into the computer system 200 from the computer readable medium, and then executed by the computer system 200. A computer readable medium having such software or computer program recorded on the computer readable medium is a computer program product. The use of the computer program product in the computer system 200 preferably effects an advantageous apparatus for implementing the video encoder 114, the video decoder 134 and the described methods.
The software 233 is typically stored in the HDD 210 or the memory 206. The software is loaded into the computer system 200 from a computer readable medium, and executed by the computer system 200. Thus, for example, the software 233 may be stored on an optically readable disk storage medium (e.g., CD-ROM) 225 that is read by the optical disk drive 212.
In some instances, the application programs 233 may be supplied to the user encoded on one or more CD-ROMs 225 and read via the corresponding drive 212, or alternatively may be read by the user from the networks 220 or 222. Still further, the software can also be loaded into the computer system 200 from other computer readable media. Computer readable storage media refers to any non-transitory tangible storage medium that provides recorded instructions and/or data to the computer system 200 for execution and/or processing. Examples of such storage media include floppy disks, magnetic tape, CD-ROM, DVD, Blu-ray Disc, a hard disk drive, a ROM or integrated circuit, USB memory, a magneto-optical disk, or a computer readable card such as a PCMCIA card and the like, whether or not such devices are internal or external of the computer module 201. Examples of transitory or non-tangible computer readable transmission media that may also participate in the provision of the software, application programs, instructions and/or video data or encoded video data to the computer module 201 include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the Internet or Intranets including e-mail transmissions and information recorded on Websites and the like.
The second part of the application programs 233 and the corresponding code modules mentioned above may be executed to implement one or more graphical user interfaces (GUIs) to be rendered or otherwise represented upon the display 214. Through manipulation of typically the keyboard 202 and the mouse 203, a user of the computer system 200 and the application may manipulate the interface in a functionally adaptable manner to provide controlling commands and/or input to the applications associated with the GUI(s). Other forms of functionally adaptable user interfaces may also be implemented, such as an audio interface utilizing speech prompts output via the loudspeakers 217 and user voice commands input via the microphone 280.
When the computer module 201 is initially powered up, a power-on self-test (POST) program 250 executes. The POST program 250 is typically stored in a ROM 249 of the semiconductor memory 206 of
The operating system 253 manages the memory 234 (209, 206) to ensure that each process or application running on the computer module 201 has sufficient memory in which to execute without colliding with memory allocated to another process. Furthermore, the different types of memory available in the computer system 200 of
As shown in
The application program 233 includes a sequence of instructions 231 that may include conditional branch and loop instructions. The program 233 may also include data 232 which is used in execution of the program 233. The instructions 231 and the data 232 are stored in memory locations 228, 229, 230 and 235, 236, 237, respectively. Depending upon the relative size of the instructions 231 and the memory locations 228-230, a particular instruction may be stored in a single memory location as depicted by the instruction shown in the memory location 230. Alternately, an instruction may be segmented into a number of parts each of which is stored in a separate memory location, as depicted by the instruction segments shown in the memory locations 228 and 229.
In general, the processor 205 is given a set of instructions which are executed therein. The processor 205 waits for a subsequent input, to which the processor 205 reacts by executing another set of instructions. Each input may be provided from one or more of a number of sources, including data generated by one or more of the input devices 202, 203, data received from an external source across one of the networks 220, 222, data retrieved from one of the storage devices 206, 209 or data retrieved from a storage medium 225 inserted into the corresponding reader 212, all depicted in
The video encoder 114, the video decoder 134 and the described methods may use input variables 254, which are stored in the memory 234 in corresponding memory locations 255, 256, 257. The video encoder 114, the video decoder 134 and the described methods produce output variables 261, which are stored in the memory 234 in corresponding memory locations 262, 263, 264. Intermediate variables 258 may be stored in memory locations 259, 260, 266 and 267.
Referring to the processor 205 of
(a) a fetch operation, which fetches or reads an instruction 231 from a memory location 228, 229, 230;
(b) a decode operation in which the control unit 239 determines which instruction has been fetched; and
(c) an execute operation in which the control unit 239 and/or the ALU 240 execute the instruction.
Thereafter, a further fetch, decode, and execute cycle for the next instruction may be executed. Similarly, a store cycle may be performed by which the control unit 239 stores or writes a value to a memory location 232.
Each step or sub-process in the process of
Although the video encoder 114 of
The video encoder 114 divides each frame of the captured frame data, such as frame data 310, into regions generally referred to as ‘coding tree blocks’ (CTBs). Each coding tree block (CTB) includes a hierarchical quad-tree subdivision of a portion of the frame into a collection of ‘coding units’ (CUs). The coding tree block (CTB) generally occupies an area of 64×64 luma samples, although other sizes are possible, such as 16×16 or 32×32. In some cases even larger sizes for the coding tree block (CTB), such as 128×128 luma samples, may be used. The coding tree block (CTB) may be sub-divided via a split into four equal sized regions to create a new hierarchy level. Splitting may be applied recursively, resulting in a quad-tree hierarchy. As the coding tree block (CTB) side dimensions are always powers of two and the quad-tree splitting always results in a halving of the width and height, the region side dimensions are also always powers of two. When no further split of a region is performed, a ‘coding unit’ (CU) is said to exist within the region. When no split is performed at the top level (or typically the “highest level”) of the coding tree block, the region occupying the entire coding tree block contains one coding unit (CU) that is generally referred to as a ‘largest coding unit’ (LCU). A minimum size also exists for each coding unit (CU), such as the area occupied by 8×8 luma samples, although other minimum sizes are also possible. Coding units of the minimum size are generally referred to as ‘smallest coding units’ (SCUs). As a result of the quad-tree hierarchy, the entirety of the coding tree block (CTB) is occupied by one or more coding units (CUs).
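The recursive quad-tree subdivision of a coding tree block (CTB) into coding units (CUs) may be sketched as follows. The split decision is supplied as a caller-defined function; the names `split_ctb` and `should_split`, and the default minimum coding unit size of 8, are assumptions made for the example.

```python
def split_ctb(x, y, size, should_split, min_cu_size=8):
    """Recursively split a square region; return the coding units as (x, y, size)."""
    if size > min_cu_size and should_split(x, y, size):
        half = size // 2  # a split halves both the width and the height
        coding_units = []
        for dy in (0, half):
            for dx in (0, half):
                coding_units.extend(
                    split_ctb(x + dx, y + dy, half, should_split, min_cu_size))
        return coding_units
    # No further split: a coding unit exists within this region.
    return [(x, y, size)]
```

With a 64×64 coding tree block, a decision function that never splits returns the single coding unit `(0, 0, 64)`, whereas one that always splits returns sixty-four 8×8 coding units. In both cases the coding units cover the entire coding tree block without overlap.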
The video encoder 114 produces one or more arrays of data samples, generally referred to as ‘prediction units’ (PUs) for each coding unit (CU). Various arrangements of prediction units (PUs) in each coding unit (CU) are possible, with a requirement that the prediction units (PUs) do not overlap and that the entirety of the coding unit (CU) is occupied by the one or more prediction units (PUs). The requirement that the prediction units (PUs) do not overlap and that the entirety of the coding unit (CU) is occupied by the one or more prediction units (PUs) ensures that the prediction units (PUs) cover the entire frame area.
The video encoder 114 operates by outputting, from a multiplexer module 340, a prediction unit (PU) 382. A difference module 344 outputs the difference between the prediction unit (PU) 382 and a corresponding 2D array of data samples, in the spatial domain, from a coding unit (CU) of the coding tree block (CTB) of the frame data 310, the difference being known as a ‘residual sample array’ 360. The residual sample array 360 may be transformed into the frequency domain in a transform module 320. The residual sample array 360 from the difference module 344 is received by the transform module 320, which converts (or ‘encodes’) the residual sample array 360 from a spatial representation to a frequency domain representation by applying a ‘forward transform’. The transform module 320 creates transform coefficients. The transform coefficients are configured as the residual transform array 362 for each transform in a transform unit (TU) in a hierarchical sub-division of the coding unit (CU). The coding unit (CU) is sub-divided into one or more transform units (TUs). The sub-divided coding unit (CU) may be referred to as a ‘residual quad-tree’ (RQT). The sub-division of the residual data of the coding unit (CU) into a residual quad-tree (RQT) is performed under control of a transform control module 346.
The transform control module 346 may test the bit-rate required in the encoded bitstream 312 for various possible arrangements of transform units (TUs) in the residual quad-tree of a present coding unit (CU) according to a ‘rate-distortion criterion’. The rate-distortion criterion is a measure of the acceptable trade-off between the bit-rate of the encoded bitstream 312, or a local region thereof, and the distortion, or difference between frames present in the frame buffer 332 and the captured frame data. In some arrangements, the rate-distortion criterion considers only the rate and distortion for luma and thus the encoding decision is made based only on characteristics of the luma channel. Generally, the residual quad-tree (RQT) is shared between luma and chroma, and the amount of chroma information is relatively small compared to luma, so considering luma only in the rate-distortion criterion is appropriate. Where decisions specific to chroma only need to be made, the rate-distortion criterion may be expanded to consider chroma bit costs and rate costs, or alternatively, a rule or ‘heuristic’ may be introduced in order to make a reasonable decision from chroma, based on the rate-distortion criterion decisions for luma. The transform control module 346 may thus select an arrangement of transform units (TUs) as the residual quad-tree. The selected arrangement is configured for encoding the residual sample array 360 of the present coding unit (CU) from a set of possible transform units (TUs). The configuration of the residual quad-tree (RQT) of the coding unit (CU) is specified by a set of split transform flags 386. The residual quad-tree (RQT) will be further discussed below, with reference to
The set of possible transform units (TUs) for a residual quad-tree is dependent on the available transform sizes and coding unit (CU) size. The residual quad-tree may result in a lower bit-rate in the encoded bitstream 312, thus achieving higher coding efficiency. A larger sized transform unit (TU) results in use of larger transforms for both luma and chroma. Generally, larger transforms provide a more compact representation of a residual sample array with sample data (or ‘residual energy’) spread across the residual sample array. Smaller transform units (TUs) provide a more compact representation of a residual sample array with residual energy localised to specific regions of the residual sample array. Thus, the many possible configurations of the residual quad-tree provide a useful means for achieving high coding efficiency of the residual sample array 360 in the high efficiency video coding (HEVC) standard under development.
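Selection among candidate arrangements of transform units (TUs) according to a rate-distortion criterion may be sketched as follows. The sketch assumes the commonly used Lagrangian cost, being the distortion plus a multiplier times the number of bits; the form of the cost, the names `rd_cost` and `choose_tu_arrangement`, and the numeric values in the usage note are assumptions of the example rather than requirements of the described arrangements.

```python
def rd_cost(distortion, bits, lagrange_multiplier):
    """Assumed Lagrangian rate-distortion cost: distortion plus weighted rate."""
    return distortion + lagrange_multiplier * bits

def choose_tu_arrangement(candidates, lagrange_multiplier):
    """Return the label of the candidate with the lowest rate-distortion cost.

    `candidates` is a list of (label, distortion, bits) tuples.
    """
    best = min(candidates,
               key=lambda c: rd_cost(c[1], c[2], lagrange_multiplier))
    return best[0]
```

For example, given the hypothetical candidates `("one 16x16", 400, 20)` and `("four 8x8", 250, 60)`, a multiplier of 2 favours the four smaller transform units (cost 370 against 440), whereas a multiplier of 5 favours the single larger transform unit (cost 500 against 550).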
For the high efficiency video coding (HEVC) standard under development, conversion of the residual sample array 360 to the frequency domain representation is implemented using a modified discrete cosine transform (DCT), in which a DCT is modified to be implemented using shifts and additions. Various sizes of the residual sample array 360 and the transform coefficients 362 are possible, in accordance with supported transform sizes. In the high efficiency video coding (HEVC) standard under development, transforms are performed on 2D arrays of data samples having sizes such as 32×32, 16×16, 8×8 and 4×4. Thus, a predetermined set of transform sizes is available to the video encoder 114. Moreover, the set of transform sizes may differ between the luma channel and the chroma channels.
Two-dimensional transforms are generally configured to be ‘separable’, enabling implementation as a first set of 1D transforms operating on the 2D array of data samples in one direction (e.g. on rows). The first set of 1D transforms is followed by a second set of 1D transforms operating on the 2D array of data samples output from the first set of 1D transforms in the other direction (e.g. on columns). Transforms having the same width and height are generally referred to as ‘square transforms’. Additional transforms, having differing widths and heights, may also be used and are generally referred to as ‘non-square transforms’. The row and column one-dimensional transforms may be combined into specific hardware or software modules, such as a 4×4 transform module or an 8×8 transform module.
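By way of illustration only, the separable structure described above may be sketched as a row pass followed by a column pass. The sketch uses a floating-point orthonormal DCT-II as a stand-in for the 1D transform; it is not the integer shift-and-add transform of the standard under development, and the function names are hypothetical.

```python
import math


def dct_1d(values):
    """Orthonormal floating-point DCT-II of a 1D list (illustrative stand-in)."""
    n = len(values)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i, v in enumerate(values))
        out.append(scale * acc)
    return out


def transpose(block):
    return [list(col) for col in zip(*block)]


def dct_2d_separable(block):
    # First set of 1D transforms: one per row.
    rows_done = [dct_1d(row) for row in block]
    # Second set of 1D transforms: one per column of the first-pass output.
    cols_done = [dct_1d(col) for col in transpose(rows_done)]
    # Transpose back so that the result is indexed [row][column].
    return transpose(cols_done)
```

For a block in which every sample has the same value, all of the residual energy is concentrated in the single top-left coefficient, which illustrates the compact representation referred to above.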
Transforms having larger dimensions require larger amounts of circuitry to implement, even though such larger dimensioned transforms may be infrequently used. Accordingly, the high efficiency video coding (HEVC) standard under development defines a maximum transform size of 32×32 luma samples. The integrated nature of the transform implementation defined for the high efficiency video coding (HEVC) standard under development also introduces a preference to reduce the number of non-square transform sizes supported. The non-square transform sizes typically require either entirely new hardware to be implemented for each non-square transform size or require additional selection logic to enable reconfiguration of various 1D transform logic into a particular non-square transform size. Additionally, non-square transform sizes may also increase the complexity of software implementations by introducing additional methods to perform transform and inverse transform operations for each supported non-square transform size, and increasing complexity to implement the necessary buffer management functionality of the additional transform sizes.
Transforms may be applied to both the luma and chroma channels. Differences between the handling of luma and chroma channels with regard to transform units (TUs) exist and will be discussed below with reference to
The transform coefficients 362 are input to the scale and quantise module 322 where data sample values thereof are scaled and quantised, according to a determined quantisation parameter 384, to produce a residual data array 364. The scale and quantisation results in a loss of precision, dependent on the value of the determined quantisation parameter 384. A higher value of the determined quantisation parameter 384 results in more information being lost from the residual data. The lost information increases the compression achieved by the video encoder 114 at the expense of reducing the visual quality of output from the video decoder 134. The determined quantisation parameter 384 may be adapted during encoding of each frame of the frame data 310. Alternatively, the determined quantisation parameter 384 may be fixed for a portion of the frame data 310. In a further alternative, the determined quantisation parameter 384 may be fixed for an entire frame of frame data 310. Other adaptations of the determined quantisation parameter 384 are also possible, such as quantising different residual coefficients with separate values.
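By way of illustration only, the loss of precision introduced by quantisation may be sketched as follows. The sketch assumes a step size that doubles for every increase of six in the quantisation parameter, which is the convention generally associated with H.264/MPEG-4 AVC and HEVC but is not stated in this description. The function names and coefficient values are hypothetical, and the sketch operates on floating-point values rather than the integer arithmetic of a practical implementation.

```python
def q_step(qp: int) -> float:
    # Assumed relation: the step size doubles for every increase of 6 in QP.
    return 2.0 ** ((qp - 4) / 6.0)


def quantise(coeffs, qp):
    step = q_step(qp)
    return [int(round(c / step)) for c in coeffs]


def rescale(levels, qp):
    # Inverse scaling: restores the magnitude but not the discarded precision.
    step = q_step(qp)
    return [lvl * step for lvl in levels]


def max_error(coeffs, qp):
    restored = rescale(quantise(coeffs, qp), qp)
    return max(abs(a - b) for a, b in zip(coeffs, restored))


# Hypothetical transform coefficients.
coeffs = [100.0, -37.0, 12.0, 3.0]
```

Comparing `max_error(coeffs, 10)` with `max_error(coeffs, 34)` shows the larger reconstruction error that accompanies the higher quantisation parameter.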
The residual data array 364 and determined quantisation parameter 384 are taken as input to an inverse scaling module 326. The inverse scaling module 326 reverses the scaling performed by the scale and quantise module 322 to produce rescaled data arrays 366, which are rescaled versions of the residual data array 364. The residual data array 364, the determined quantisation parameter 384 and the split transform flags 386 are also taken as input to an entropy encoder module 324. The entropy encoder module 324 encodes the values of the residual data array 364 in an encoded bitstream 312 (or ‘video bitstream’). Due to the loss of precision resulting from the scale and quantise module 322, the rescaled data arrays 366 are not identical to the original values in the array 363. The rescaled data arrays 366 from the inverse scaling module 326 are then output to an inverse transform module 328. The inverse transform module 328 performs an inverse transform from the frequency domain to the spatial domain to produce a spatial-domain representation 368 of the rescaled transform coefficient arrays 366. The spatial-domain representation 368 is substantially identical to a spatial domain representation that is produced at the video decoder 134. The spatial-domain representation 368 is then input to a summation module 342.
A motion estimation module 338 produces motion vectors 374 by comparing the frame data 310 with previous frame data from one or more sets of frames stored in a frame buffer module 332, generally configured within the memory 206. The sets of frames are known as ‘reference picture lists’. The motion vectors 374 are then input to a motion compensation module 334 which produces an inter-predicted prediction unit (PU) 376 by filtering data samples stored in the frame buffer module 332, taking into account a spatial offset derived from the motion vectors 374. Not illustrated in
Prediction units (PUs) may be generated using either an intra-prediction or an inter-prediction method. Intra-prediction methods make use of data samples adjacent to the prediction unit (PU) that have previously been decoded (typically above and to the left of the prediction unit) in order to generate reference data samples within the prediction unit (PU). Various directions of intra-prediction are possible, referred to as the ‘intra-prediction mode’. Inter-prediction methods make use of a motion vector to refer to a block from a selected reference frame. The motion estimation module 338 and motion compensation module 334 operate on motion vectors 374, having a precision of one eighth (⅛) of a luma sample, enabling precise modelling of motion between frames in the frame data 310. The decision on which of the intra-prediction or the inter-prediction method to use is made according to a rate-distortion trade-off between desired bit-rate of the resulting encoded bitstream 312 and the amount of image quality distortion introduced by either the intra-prediction or inter-prediction method. If intra-prediction is used, one intra-prediction mode is selected from the set of possible intra-prediction modes, also according to a rate-distortion trade-off. The multiplexer module 340 selects either the intra-predicted reference samples 378 from the intra-frame prediction module 336, or the inter-predicted prediction unit (PU) 376 from the motion compensation block 334, depending on the decision made by a rate distortion algorithm.
The summation module 342 produces a sum 370 that is input to a de-blocking filter module 330. The de-blocking filter module 330 performs filtering along block boundaries, producing de-blocked samples 372 that are written to the frame buffer module 332 configured within the memory 206. The frame buffer module 332 is a buffer with sufficient capacity to hold data from one or more past frames for future reference as part of a reference picture list.
For the high efficiency video coding (HEVC) standard under development, the encoded bitstream 312 produced by the entropy encoder 324 is delineated into network abstraction layer (NAL) units. Generally, each slice of a frame is contained in one NAL unit. The entropy encoder 324 encodes the residual array 364, the intra-prediction mode 380, the motion vectors and other parameters, collectively referred to as ‘syntax elements’, into the encoded bitstream 312 by performing a context adaptive binary arithmetic coding (CABAC) algorithm. Syntax elements are grouped together into ‘syntax structures’. The groupings may contain recursion to describe hierarchical structures. In addition to ordinal values, such as an intra-prediction mode, or integer values, such as a motion vector, syntax elements also include flags, for example to indicate a quad-tree split.
Although the video decoder 134 of
As seen in
The encoded bitstream 312 is input to an entropy decoder module 420 which extracts the syntax elements from the encoded bitstream 312 and passes the values of the syntax elements to other blocks in the video decoder 134. The entropy decoder module 420 applies the context adaptive binary arithmetic coding (CABAC) algorithm to decode syntax elements from the encoded bitstream 312. The decoded syntax elements are used to reconstruct parameters within the video decoder 134. Parameters include zero or more residual data arrays 450, motion vectors 452, a prediction mode 454 and split transform flags 468. The residual data array 450 is passed to an inverse scale module 421, the motion vectors 452 are passed to a motion compensation module 434, and the prediction mode 454 is passed to an intra-frame prediction module 426 and to a multiplexer 428. The inverse scale module 421 performs inverse scaling on the residual data to create reconstructed data 455 in the form of transform coefficients. The inverse scale module 421 outputs the reconstructed data 455 to an inverse transform module 422. The inverse transform module 422 applies an ‘inverse transform’ to convert (or ‘decode’) the reconstructed data 455 (i.e., the transform coefficients) from a frequency domain representation to a spatial domain representation, outputting a residual sample array 456 via a multiplexer module 423. The inverse transform module 422 performs the same operation as the inverse transform module 328. The inverse transform module 422 is configured to perform transforms in accordance with the residual quad-tree specified by the split transform flags 468. The transforms performed by the inverse transform module 422 are selected from a predetermined set of transform sizes required to decode an encoded bitstream 312 that is compliant with the high efficiency video coding (HEVC) standard under development.
The motion compensation module 434 uses the motion vectors 452 from the entropy decoder module 420, combined with reference frame data 460 from a frame buffer block 432, configured within the memory 206, to produce an inter-predicted prediction unit (PU) 462 for a prediction unit (PU), being a prediction of output decoded frame data. When the prediction mode 454 indicates that the current prediction unit was coded using intra-prediction, the intra-frame prediction module 426 produces an intra-predicted prediction unit (PU) 464 for the prediction unit (PU) using data samples spatially neighbouring the prediction unit (PU) and a prediction direction also supplied by the prediction mode 454. The spatially neighbouring data samples are obtained from a sum 458, output from a summation module 424. The multiplexer module 428 selects the intra-predicted prediction unit (PU) 464 or the inter-predicted prediction unit (PU) 462 for a prediction unit (PU) 466, depending on the current prediction mode 454. The prediction unit (PU) 466, which is output from the multiplexer module 428, is added to the residual sample array 456 from the inverse scale and transform module 422 by the summation module 424 to produce sum 458. The sum 458 is then input to each of a de-blocking filter module 430 and the intra-frame prediction module 426. The de-blocking filter module 430 performs filtering along data block boundaries, such as transform unit (TU) boundaries, to smooth visible artefacts. The output of the de-blocking filter module 430 is written to the frame buffer module 432 configured within the memory 206. The frame buffer module 432 provides sufficient storage to hold one or more decoded frames for future reference. Decoded frames 412 are also output from the frame buffer module 432 to a display device, such as the display device 136 (e.g., in the form of the display device 214).
By sampling the luma samples at the luma sample locations and chroma samples at the chroma sample locations indicated in the frame portion 510, a sample grid is obtained for each colour channel when a 4:2:2 chroma format is applied. The same allocation of data samples to colour channels is made for the frame portion 510 as for the frame portion 500. In contrast to the frame portion 500, twice as many chroma sample locations exist in frame portion 510. In frame portion 510 the chroma sample locations are collocated with every second luma sample location. Accordingly, in
Various allowable dimensions of transform units were described above in units of luma samples. The region covered by a transform applied for the luma channel will thus have the same dimensions as the transform unit dimensions. As the transform units also encode chroma channels, the applied transform for each chroma channel will have dimensions adapted according to the particular chroma format in use. For example, when a 4:2:0 chroma format is in use, a 16×16 transform unit (TU) will use a 16×16 transform for the luma channel, and an 8×8 transform for each chroma channel.
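By way of illustration only, the relationship between the transform unit (TU) dimensions on the luma sample grid and the corresponding region on the chroma sample grid may be sketched as follows. The table `CHROMA_SUBSAMPLING` and the function `chroma_region_size` are hypothetical names. The 4:4:4 entry is included for completeness and is not discussed above.

```python
# Horizontal and vertical subsampling factors for each chroma format.
CHROMA_SUBSAMPLING = {
    "4:2:0": (2, 2),  # half resolution horizontally and vertically
    "4:2:2": (2, 1),  # half resolution horizontally only
    "4:4:4": (1, 1),  # same resolution as luma
}


def chroma_region_size(tu_size: int, chroma_format: str):
    """Return (width, height), in chroma samples, of the region covered by a
    square transform unit whose side is tu_size luma samples."""
    sub_w, sub_h = CHROMA_SUBSAMPLING[chroma_format]
    return tu_size // sub_w, tu_size // sub_h
```

For a 16×16 transform unit (TU), the sketch gives an 8×8 chroma region for the 4:2:0 chroma format and a region 8 samples wide by 16 samples high for the 4:2:2 chroma format.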
The ‘residual quad-tree’ (RQT) defines a hierarchy that begins at a ‘root node’, covering a region containing one or more transform units (TUs) at each ‘leaf node’ of the hierarchy. At non-leaf nodes the region is divided into four equally-sized ‘sub-regions’, in a split known as a ‘quad-tree split’. Each transform unit (TU) has an associated size (or ‘transform size’), generally described as the dimensions of the region containing the transform unit (TU) on the luma sample grid, although the region may also be described as dimensions on the chroma sample grid. The size is dependent on the coding unit (CU) size and the transform depth. Transform units (TUs) with a transform depth of zero have a size equal to the size of the corresponding coding unit (CU). Each increment of the transform depth results in a halving of the dimensions (i.e., the side width and height) of transform units (TUs) present in the residual quad-tree at the given transform depth. As the frame includes a luma channel and chroma channels, the coding unit (CU) occupies a region on both the luma sample grid and the chroma sample grid and thus each transform unit (TU) includes information describing both the luma samples on the luma sample grid and the chroma samples on the chroma sample grid. The nature of the information for each transform unit (TU) is dependent on the processing stage of the video encoder 114 or the video decoder 134. At the input to the transform module 320 and the output of the inverse scale and transform module 422, the residual sample arrays 360 and 456, respectively, contain information for each transform unit (TU) in the spatial domain. The residual sample arrays 360 and 456 may be further divided into a ‘chroma residual sample array’ and a ‘luma residual sample array’, due to differences in processing between the luma channel and the chroma channels.
At the output of the scale and quantise module 322 and the input of the inverse scale and transform module 422, the residual data arrays 364 and 450, respectively, contain information for each transform unit (TU) in the frequency domain. The residual data arrays 364 and 450 may be further divided into a ‘chroma residual data array’ and a ‘luma residual data array’, due to differences in processing between the luma channel and the chroma channels.
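By way of illustration only, the dependence of the transform size on the coding unit (CU) size and the transform depth may be sketched as follows. The function names are hypothetical, and the second function assumes that every node above the given depth has been split.

```python
def transform_unit_size(cu_size: int, transform_depth: int) -> int:
    """Side length, in luma samples, of transform units at the given depth.

    A depth of zero gives the coding unit size; each increment halves it.
    """
    return cu_size >> transform_depth


def transform_units_at_depth(transform_depth: int) -> int:
    # Each quad-tree split divides a region into four sub-regions, so a fully
    # split tree holds 4**depth transform units at the given depth.
    return 4 ** transform_depth
```

For example, a 32×32 coding unit (CU) yields transform units (TUs) of 32×32, 16×16 and 8×8 luma samples at transform depths of zero, one and two respectively.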
In
The decomposition of a coding unit (CU) into one or more prediction units (PUs) is referred to as a ‘partitioning’ and is generally specified by a ‘partition mode’ (or ‘part_mode’ syntax element) present in the encoded bitstream 312. The partition mode may specify that a single prediction unit (PU) occupy the entire coding unit (CU), or that multiple non-overlapping prediction units (PUs) occupy the entire coding unit (CU). For example, as seen in
Each inter-predicted prediction unit (PU) has a motion vector and each intra-predicted prediction unit (PU) has a direction. Consequently, visual discontinuities are possible at the boundary between adjacent prediction units (PUs) due to different motion vector(s), direction(s) or combination of different motion vector(s) and direction(s). For a given partitioning, one or more resulting prediction units (PUs) are either all intra-predicted or all inter-predicted, but not a combination of intra-prediction and inter-prediction.
The decomposition of a coding unit (CU) into one or more transform units (TUs) is a quad-tree decomposition that is referred to as a ‘residual quad-tree’ (RQT). A residual quad-tree (RQT) is generally specified by one or more ‘split transform flags’ (or ‘split_transform_flag’ syntax elements) present in the encoded bitstream 312. For example, the coding unit (CU) 604 includes a residual quad-tree (RQT) 610 that divides the area of the coding unit (CU) 604 into four equal-sized regions. Each of the four equal-sized regions is not further sub-divided, resulting in four transform units (TUs), such as transform unit (TU) 612. Each transform unit (TU) includes transforms for the luma channel and for each chroma channel. When the video encoder 114 and the video decoder 134 are configured for the 4:2:0 chroma format, the transform boundary (or ‘edge’) for the luma channel and for each chroma channel are aligned to the transform unit (TU) boundary. In contrast, when the video encoder 114 and the video decoder 134 are configured for the 4:2:2 chroma format and square transforms are used for each chroma channel, additional transform boundaries are present for each chroma channel. At the boundary of a transform, discontinuities may be visible. The discontinuities reduce the perceived quality of decoded frames 412 compared to the frame data 310. The quantisation parameter applied by the scale and quantise block 322 and the inverse scale module 421 may vary between transform units (TUs). Accordingly, spatially neighbouring transforms may have different quantisation parameters applied. Generally, larger quantisation parameters and differences in the quantisation parameter applied to adjacent transforms result in poorer visual quality, due to increased transform block edge artefacts.
The high efficiency video coding (HEVC) standard under development defines thirty five (35) intra-prediction modes. Of the thirty five (35) intra-prediction modes, one intra-prediction mode is known as a ‘DC’ mode, one is a ‘planar’ mode and thirty three (33) are known as directional modes.
Angle parameters are integers from negative thirty-two (−32) to positive thirty-two (+32). Angle parameters may be interpreted as offsets along either a horizontal axis for the vertical intra-prediction modes or a vertical axis for the horizontal intra-prediction modes. For example, an angle parameter 706 has a value of 30 and exists along a vertical axis that includes all the horizontal intra-prediction modes 712. As shown in
A method 1100 of generating intra-predicted samples in the video encoder 114 or the video decoder 134, will be described in detail below with reference to
The directional intra-prediction modes have the property that the predicted samples will be a texture, having a specific direction and determined from reference samples, such as the reference samples 804. When a directional intra-prediction mode is selected, samples from a reference sample buffer are copied across into the prediction unit (PU) in the direction of intra-prediction. For example, an intra-prediction direction 810 results in a texture being produced for the prediction unit (PU) 802 that has an up-right direction. The texture is produced by copying each reference sample within the reference samples, such as reference sample 812, across the prediction unit (PU) 802 in the direction of the intra-prediction, which results in predicted samples (e.g., predicted sample 814) being assigned a value equal to the corresponding reference sample. For clarity,
For intra-prediction modes twenty-six (26) to thirty-four (34) only the above reference samples (e.g., 804) are required. For intra-prediction modes two (2) to ten (10) only the left reference samples (e.g., 806) are required. For intra-prediction modes eleven (11) to twenty-five (25) both the above reference samples (e.g., 804) and the left reference samples (e.g., 806) are required. For intra-prediction modes eleven (11) to twenty-five (25), an additional parameter named the ‘inverse angle’, or ‘invAngle’, is defined. For predominantly vertical intra-prediction modes, the inverse angle controls population of the left reference samples (e.g., 806) and for predominantly horizontal intra-prediction modes, the inverse angle controls population of the above reference samples (e.g., 804). When the inverse angle is defined, the value of the inverse angle is one of [−4096, −1638, −910, −630, −482, −390, −315, −256, −315, −390, −482, −630, −910, −1638, −4096] for intra-prediction modes eleven to twenty-five respectively. Symmetry at intra-prediction mode eighteen (18) is visible from the list of inverse angle values. Implementations may use the symmetry to reduce complexity. It is desirable to reduce complexity where such reduction in complexity introduces either negligible (or no) loss in coding efficiency. Moreover, although the range of inverse angle values varies from −256 to −4096, only eight (8) discrete values are defined. The inverse angle is used to control population of a reference sample buffer with neighbouring samples, by varying the spacing of accessing individual neighbouring samples. A right-shift of eight (8) bits is applied to the inverse angle when determining an offset for accessing each neighbouring sample. Due to this, an inverse angle value of −256 results in the reference samples being populated with adjacent neighbouring samples. An inverse angle value of −4096 results in reference samples being populated with every sixteenth neighbouring sample.
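By way of illustration only, the inverse angle values listed above and their effect on the spacing of neighbouring sample accesses may be sketched as follows. The names `INV_ANGLE`, `is_symmetric_about_mode_18` and `projected_offsets` are hypothetical. The rounding term of 128 added before the right-shift is an assumption and is not stated in this description.

```python
# Inverse angle for intra-prediction modes 11..25, in the order listed above.
INV_ANGLE = dict(zip(range(11, 26), [
    -4096, -1638, -910, -630, -482, -390, -315, -256,
    -315, -390, -482, -630, -910, -1638, -4096,
]))


def is_symmetric_about_mode_18() -> bool:
    # Modes equally distant from mode 18 share the same inverse angle.
    return all(INV_ANGLE[18 - d] == INV_ANGLE[18 + d] for d in range(1, 8))


def projected_offsets(inv_angle: int, count: int):
    """Offsets into the neighbouring samples used to populate the first
    `count` projected reference samples.

    The magnitude of the inverse angle is scaled by the sample index and
    right-shifted by eight bits; the +128 rounding term is an assumption.
    """
    return [((k * -inv_angle) + 128) >> 8 for k in range(1, count + 1)]
```

An inverse angle of −256 gives offsets that advance by one neighbouring sample at a time, whereas an inverse angle of −4096 gives offsets that advance by sixteen.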
For predominantly vertical intra-prediction modes, the left reference samples are populated as if the left reference samples were an extension of the above reference samples, with left neighbouring samples projected onto the extension of the above reference samples in the opposite direction of intra-prediction (i.e., the direction illustrated in
The discrepancy between the effective angle for a given angle parameter across the luma sample grid 900 and the chroma sample grid 910 reduces coding efficiency. The coding efficiency is reduced because video data often includes features that are correlated across the luma channel and the chroma channels and thus this is a common scenario for intra-prediction.
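By way of illustration only, the discrepancy in effective angle may be sketched as follows. The sketch assumes that the angle parameter expresses a displacement in units of one thirty-second of a sample per row (or per column), and that on the 4:2:2 chroma sample grid adjacent chroma columns are two luma samples apart while adjacent chroma rows are one luma sample apart. The function name and parameters are hypothetical.

```python
import math


def effective_angle_deg(angle_param: int, vertical: bool, chroma_422: bool) -> float:
    """Angle away from the predominant axis, measured in luma-sample geometry."""
    # Assumed unit: displacement of angle_param / 32 samples per row or column.
    tangent = angle_param / 32.0
    if chroma_422:
        # A horizontal displacement of one chroma sample spans two luma
        # samples, so vertical modes appear steeper and horizontal modes
        # appear shallower than on the luma sample grid.
        tangent = tangent * 2.0 if vertical else tangent / 2.0
    return math.degrees(math.atan(tangent))
```

Under these assumptions, an angle parameter of 32 for a predominantly vertical mode corresponds to 45 degrees on the luma sample grid but approximately 63.4 degrees on the 4:2:2 chroma sample grid, and an angle parameter of 16 on the chroma sample grid restores 45 degrees.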
The sum 458 is received from the summation module 424 and is the sum of a residual and a prediction from neighbouring block(s). A neighbouring sample buffer 1002 provides sufficient storage to hold left neighbouring samples in a left neighbouring buffer 1004 and above neighbouring samples in an above neighbouring buffer 1006. The sum 458 is produced prior to in-loop filtering performed by the deblocking filter module 430. The left neighbouring buffer 1004 outputs left neighbouring samples 1020 and the above neighbouring buffer 1006 outputs above neighbouring samples 1022. A reference sample generator module 1008 produces left reference samples (e.g., 806) and above reference samples (e.g., 804) which are collectively the reference samples 1028 according to angle parameter 1024 and inverse angle 1026. The reference sample generator module 1008 may generate reference samples by copying neighbouring samples from the neighbouring sample buffer 1002. The copying operation may operate in accordance with the description of
For any given value of the angle parameter and inverse angle (if defined), accessing of left reference samples and above reference samples by the reference sample generator module 1008 for predominantly vertical intra-prediction modes is the transpose of the access for predominantly horizontal intra-prediction modes. This symmetry may be exploited to reduce complexity of the reference sample generation by using the threshold between these two cases to control the operation of a transpose operation.
The sample block generator 1010 operates by copying samples from the reference samples 1028 to produce the intra-predicted prediction unit (PU) 464. For example, an intra-prediction mode of eighteen (18) results in an angle parameter of negative thirty-two (−32). The angle parameter of negative thirty-two results in each reference sample from the reference samples 1028 being copied into the prediction unit (PU) in a diagonal down-right pattern. Each location in the prediction unit (PU) is thus populated with a copy of a reference sample in accordance with the angle parameter to produce a block of samples containing a texture.
For any given value of the angle parameter, output of the sample block generator 1010 for predominantly vertical intra-prediction modes is the transpose (along an axis defined by intra-prediction mode eighteen (18)) of the output of the sample block generator 1010 for predominantly horizontal intra-prediction modes. This symmetry may be exploited to reduce complexity by using the threshold between these two cases to control the operation of a transpose operation.
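By way of illustration only, the diagonal down-right copying for intra-prediction mode eighteen (18) may be sketched as follows. The sketch covers only this pure-copy case, and the function name and the separation of the reference samples into `corner`, `above` and `left` are hypothetical.

```python
def predict_mode_18(corner, above, left, size):
    """Fill a size x size block by copying reference samples down-right.

    corner: the sample above and to the left of the block.
    above:  samples directly above the block, ordered left to right.
    left:   samples directly left of the block, ordered top to bottom.
    """
    block = [[0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            if x > y:
                # Tracing up-left along the diagonal reaches the above row.
                block[y][x] = above[x - y - 1]
            elif x == y:
                # The main diagonal traces back to the corner sample.
                block[y][x] = corner
            else:
                # Tracing up-left along the diagonal reaches the left column.
                block[y][x] = left[y - x - 1]
    return block
```

Every sample on a given down-right diagonal of the resulting block holds the same value, which produces the directional texture described above.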
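By way of illustration only, the transpose symmetry may be sketched for the simplest pair of modes, namely purely vertical and purely horizontal prediction. A single generator is written for the vertical case and reused, with the left samples as input and a transposed output, for the horizontal case. The function names are hypothetical.

```python
def transpose(block):
    return [list(col) for col in zip(*block)]


def predict_pure_vertical(above, size):
    # Every row of the block is a copy of the above reference samples.
    return [list(above[:size]) for _ in range(size)]


def predict_pure_horizontal(left, size):
    # The same generator is fed with the left reference samples and its
    # output is transposed, so that every column is a copy of `left`.
    return transpose(predict_pure_vertical(left, size))
```

Only one copy of the generating logic is needed in such an arrangement, with a mode-dependent decision selecting the input samples and controlling the transpose.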
The complexity of the sample block generator 1010 may be reduced by exploiting symmetries present in the values for the angle parameter. As each sample in the intra-predicted prediction unit (PU) 464 depends only on the reference samples 1028 and the angle parameter 1024, parallelised implementations of the sample block generator 1010 are possible.
Parallelised implementations of the sample block generator 1010 are advantageous for higher resolutions as the parallelised implementations enable ‘throughput’ (e.g. as measured by the number of samples produced per clock cycle) to be sufficient to support real-time operation at the maximum supported frame rate.
Generally, parallelised implementations have the disadvantage of duplicating logic, resulting in increased circuit size. It is beneficial to simplify the operations being parallelised to reduce the incremental cost of adding parallelism to an implementation of the sample block generator 1010.
Arrangements will be described below with reference to
The method 1100 begins at selecting step 1102, where the processor 205 is used for selecting an intra-prediction mode. The operation of the selecting step 1102 differs between the video encoder 114 and the video decoder 134 as will be described below.
Steps 1104-1126 have the same operation between the video encoder 114 and the video decoder 134. As shown in
In the video encoder 114, at step 1102, the processor 205 is used for selecting which intra-prediction mode is to be used for a prediction unit (PU). The intra-prediction mode selected at step 1102 may be one of a plurality of horizontal intra-prediction modes. Generally, at the step 1102, the processor 205 selects the intra-prediction mode giving the lowest distortion compared to a co-located block on the input frame data 310. The selected intra-prediction mode is encoded in the encoded bitstream 312 by the entropy encoder 324 and may be stored in the memory 206 and/or the HDD 210. In the video decoder 134, the intra-prediction mode is determined at step 1102 by using the entropy decoder 420 to decode syntax elements from the encoded bitstream 312.
At determining step 1104, the processor 205 is used for determining an intra-prediction angle represented by an intra-prediction angle parameter and an inverse angle for the intra-prediction mode for the chroma channel. The angle parameter is determined in accordance with
Then at chroma 4:2:2 test step 1105, the processor 205 is used for determining if the block of predicted samples to be generated is for a chroma channel and the chroma channel is using a 4:2:2 chroma format. If so, control passes to testing step 1106. Otherwise, control passes to generate reference samples step 1124.
Then at the testing step 1106, the processor 205 is used for determining if the predominant direction of the intra-prediction mode selected at step 1102 is vertical or horizontal. The testing step 1106 compares the intra-prediction mode predModeIntra with a threshold value of eighteen (18) and if predModeIntra is greater than or equal to the threshold, the predominant direction is vertical; otherwise the predominant direction is horizontal.
If the predominant direction of the intra-prediction mode is vertical, control passes to a halve angle step 1108. If the predominant direction of the intra-prediction mode is horizontal, control passes to a double angle step 1112. In the following steps 1108, 1110, 1112 and 1114, the method 1100 is used for adjusting the intra-prediction angle due to the 4:2:2 chroma format. The intra-prediction angle is adjusted depending on the predominant direction of the selected intra-prediction mode. For predominantly vertical intra-prediction modes, the angle parameter determined at step 1104 is reduced to compensate for the chroma sample grid. At the halve angle step 1108, the processor 205 is used for halving the intra-prediction angle by halving the angle parameter (e.g. by performing a right shift by one bit) determined at step 1104. Additionally, for intra-prediction modes eighteen (18) to twenty five (25) (where an inverse angle is defined), the step 1108 doubles the inverse angle. Step 1108 results in a new angle that most closely accords with the angle realised on the luma sample grid. For example, the step 1108 may map angle parameters from [0, 2, 5, 9, 13, 17, 21, 26, 32] to [0, 1, 2, 4, 6, 8, 10, 13, 16] and may map inverse angles from [−256, −315, −390, −482, −630, −910, −1638, −4096] to [−512, −630, −780, −964, −1260, −1820, −3276, −8192]. Alternatively, angle parameters may always be quantised downwards and inverse angle parameters upwards, or vice versa.
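By way of illustration only, the test at step 1106 and the adjustments at steps 1108 and 1112 may be sketched as follows, before any quantisation of the result. The function names are hypothetical. Halving is applied to the magnitude of the angle parameter so that the mapping listed above is reproduced for negative values also, which is an assumption. The treatment of the inverse angle for predominantly horizontal modes is not described above, so the sketch passes the inverse angle through unchanged in that branch.

```python
def predominant_direction(pred_mode_intra: int) -> str:
    # Step 1106: modes at or above the threshold of 18 are treated as vertical.
    return "vertical" if pred_mode_intra >= 18 else "horizontal"


def halve_toward_zero(value: int) -> int:
    # Right shift applied to the magnitude, so that -5 becomes -2 (assumed).
    sign = -1 if value < 0 else 1
    return sign * (abs(value) >> 1)


def adjust_for_422(pred_mode_intra, angle, inv_angle, is_chroma_422):
    """Return (angle, inv_angle) after steps 1105, 1106, 1108 and 1112.

    inv_angle is None when no inverse angle is defined for the mode.
    """
    if not is_chroma_422:
        # Step 1105: no adjustment outside the 4:2:2 chroma case.
        return angle, inv_angle
    if predominant_direction(pred_mode_intra) == "vertical":
        # Step 1108: halve the angle and double the inverse angle.
        new_inv = None if inv_angle is None else inv_angle * 2
        return halve_toward_zero(angle), new_inv
    # Step 1112: double the angle. The inverse angle is left unchanged here
    # because its treatment in this branch is not described (assumption).
    return angle << 1, inv_angle
```

For intra-prediction mode twenty-five (25), the sketch maps an angle parameter of −2 and an inverse angle of −4096 to −1 and −8192 respectively.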
One disadvantage of applying the angle parameter resulting from the halve angle step 1108 to the generate reference samples step 1124 and generate intra-predicted samples step 1126 as described below is that additional possible values for the angle parameter and inverse angle now exist. The additional possible values result in increased complexity in the steps 1124 and 1126. Arrangements providing parallelised implementations for the step 1126, such as described with reference to the sample block generator 1010 of
The method 1100 continues at quantise angle step 1110, where the processor 205 is used for quantising the halved angle parameter determined at step 1108 to the closest pre-existing values for the angle parameter. Also at step 1110, the processor 205 is used for quantising the inverse angle to the closest pre-existing values for the inverse angle.
For intra-prediction mode twenty-five (25), the angle parameter was adjusted from negative two (−2) to negative one (−1) and the inverse angle was doubled from −4096 to −8192 in the step 1108. The step 1110 may then quantise the angle parameter from negative one (−1) to zero (0). In such a case, the inverse angle is no longer defined for intra-prediction mode twenty five (25) because intra-prediction mode twenty five (25) becomes identical to intra-prediction mode twenty six (26) (purely vertical intra-prediction). Intra-prediction mode twenty six (26) does not require the mechanism for generating reference samples using an inverse angle as was described with reference to
In one arrangement, the halve angle step 1108 and the quantise angle step 1110 may be combined into a single table look-up operation for reduced complexity. For example, the combined steps 1108 and 1110 may map angle parameters from [0, 2, 5, 9, 13, 17, 21, 26, 32] to [0, 0, 2, 5, 5, 9, 9, 13, 17] and may map inverse angles from [−256, −315, −390, −482, −630, −910, −1638, −4096] to [−482, −630, −910, −910, −1638, −1638, −4096, N/A].
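One way such a combined look-up table might be built is sketched below in Python. All names are hypothetical. The sketch assumes that each pre-existing angle magnitude is paired with one pre-existing inverse angle (32 with −256 through to 2 with −4096), that the halved value is quantised to the nearest pre-existing magnitude with ties resolved downwards, and that the inverse angle is then taken from the pair of the quantised magnitude. Under these assumptions the sketch reproduces the example mapping given above.

```python
# Illustrative sketch of combining steps 1108 and 1110 into one table.

LEVELS = [0, 2, 5, 9, 13, 17, 21, 26, 32]  # pre-existing angle magnitudes
# Assumed pairing of angle magnitude to inverse angle.
INVERSE_FOR = {2: -4096, 5: -1638, 9: -910, 13: -630,
               17: -482, 21: -390, 26: -315, 32: -256}

def quantise(magnitude):
    """Nearest pre-existing magnitude; ties go to the smaller value."""
    return min(LEVELS, key=lambda level: (abs(level - magnitude), level))

def build_halve_table():
    """Map each original magnitude to (quantised magnitude, inverse angle)."""
    table = {}
    for magnitude in LEVELS:
        quantised = quantise(magnitude >> 1)
        # The inverse angle is undefined (None) when the magnitude is zero.
        table[magnitude] = (quantised, INVERSE_FOR.get(quantised))
    return table

HALVE_TABLE = build_halve_table()

def halve_and_quantise(angle_param):
    """Single look-up replacing the halve and quantise steps."""
    sign = -1 if angle_param < 0 else 1
    quantised, inverse = HALVE_TABLE[abs(angle_param)]
    # An inverse angle is only returned for negative angle parameters.
    return sign * quantised, (inverse if angle_param < 0 else None)
```

For example, `halve_and_quantise(-2)` yields an angle parameter of zero with no inverse angle, matching the behaviour described for intra-prediction mode twenty five (25).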
Other arrangements may produce different mappings due to rounding differences during the halving of the angle parameter. Such arrangements produce different output from the quantisation but retain the property of using only pre-existing values for the angle parameter and inverse angle.
For predominantly horizontal intra-prediction modes, the angle parameter is increased to compensate for the chroma sample grid. At double angle step 1112, the processor 205 is used for doubling the intra-prediction angle by doubling the angle parameter (e.g. by performing a left shift by one bit) determined at step 1104. Step 1112 results in a new angle that closely accords with the angle realised on the luma sample grid. Again, one disadvantage of applying the angle parameter from the double angle step 1112 to the generate reference samples step 1124 and the generate intra-predicted samples step 1126 is that additional possible values for the angle parameter and inverse angle now exist. The additional possible values result in increased complexity in the steps 1124 and 1126. Again, arrangements providing parallelised implementations for step 1126, such as described with reference to the sample block generator 1010 of
A further disadvantage of doubling the angle parameter is that the allowable range for the angle parameter is plus/minus thirty-two (+/−32) and doubling the angle parameter results in values falling outside of the allowable range. The allowable range determines the extent of the left reference samples.
Increasing the size of the left reference samples (e.g., 806) results in using samples that are spatially quite distant from the prediction unit (PU) for intra-prediction. The samples used would not be expected to be correlated with the contents of the prediction unit (PU) and thus do not contribute to coding efficiency. Instead, adjustment of the angle parameter and inverse angle is possible.
The method 1100 continues at an angle parameter exceeds maximum test step 1116, where the processor 205 is used to test if the angle parameter (after doubling) is greater than thirty two (32). Alternatively, at step 1116, the processor 205 may test if the angle parameter is greater than sixteen (16) before doubling, which produces the same result as testing if the angle parameter (after doubling) is greater than thirty two (32).
Cases where the maximum angle parameter is exceeded at step 1116 correspond to predominantly horizontal intra-prediction modes two (2) to five (5) and control passes to an adjust angle step 1118. At adjust angle step 1118, the processor 205 is used to set the angle parameter to thirty two (32) and the inverse angle to negative two hundred and fifty six (−256). In cases where the condition of step 1116 is not met, control passes to an angle parameter below minimum test step 1120.
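The test of step 1116 and the adjustment of step 1118 may be sketched as follows in Python. The names are hypothetical, and the mode-to-angle table for the predominantly horizontal modes is an assumption not listed in this section; it is chosen to be consistent with the modes identified above.

```python
# Illustrative sketch of steps 1116 and 1118 (names are hypothetical).

MAX_ANGLE = 32

def exceeds_maximum(angle_param):
    """Step 1116: test the angle parameter after doubling."""
    return (angle_param << 1) > MAX_ANGLE

def exceeds_maximum_before_doubling(angle_param):
    """Equivalent form of step 1116 applied before doubling."""
    return angle_param > (MAX_ANGLE >> 1)

def adjust_angle():
    """Step 1118: clamp to the largest pre-existing angle parameter."""
    return MAX_ANGLE, -256  # (angle parameter, inverse angle)

# Assumed angle parameters for predominantly horizontal modes 2 to 17.
HORIZONTAL_MODE_ANGLES = {
    2: 32, 3: 26, 4: 21, 5: 17, 6: 13, 7: 9, 8: 5, 9: 2, 10: 0,
    11: -2, 12: -5, 13: -9, 14: -13, 15: -17, 16: -21, 17: -26,
}

modes_exceeding = [mode for mode, angle in HORIZONTAL_MODE_ANGLES.items()
                   if exceeds_maximum(angle)]
```

Under the assumed table, `modes_exceeding` contains intra-prediction modes two (2) to five (5) only, and the two forms of the test agree for every integer angle parameter in the allowable range.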
The double angle step 1112, the quantise angle step 1114 and the steps 1116 and 1118 may be combined into a single table look-up operation for reduced complexity. For example, the combined steps may map angle parameters from [0, 2, 5, 9, 13, 17, 21, 26, 32] to [0, 5, 9, 17, 26, 32, 32, 32, 32] and may map inverse angles from [−256, −315, −390, −482, −630, −910, −1638, −4096] to [−256, −256, −256, −256, −315, −482, −910, −1638].
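A combined table of this kind might be constructed as in the following Python sketch, which handles angle magnitudes only and does not cover the below-minimum path of steps 1120 and 1122. All names are hypothetical. The sketch assumes that the doubled magnitude is clamped to thirty two (32), quantised to the nearest pre-existing magnitude with ties resolved downwards, and that the inverse angle is taken from an assumed pairing of each magnitude with one pre-existing inverse angle.

```python
# Illustrative sketch of combining steps 1112, 1114, 1116 and 1118.

LEVELS = [0, 2, 5, 9, 13, 17, 21, 26, 32]  # pre-existing angle magnitudes
# Assumed pairing of angle magnitude to inverse angle.
INVERSE_FOR = {2: -4096, 5: -1638, 9: -910, 13: -630,
               17: -482, 21: -390, 26: -315, 32: -256}
MAX_ANGLE = 32

def quantise(magnitude):
    """Nearest pre-existing magnitude; ties go to the smaller value."""
    return min(LEVELS, key=lambda level: (abs(level - magnitude), level))

def build_double_table():
    """Map each original magnitude to (adjusted magnitude, inverse angle)."""
    table = {}
    for magnitude in LEVELS:
        doubled = min(magnitude << 1, MAX_ANGLE)  # double, then clamp
        quantised = quantise(doubled)
        # The inverse angle is undefined (None) when the magnitude is zero.
        table[magnitude] = (quantised, INVERSE_FOR.get(quantised))
    return table

DOUBLE_TABLE = build_double_table()
```

Because the table has only nine entries, it may be precomputed once and indexed by the original angle magnitude, so that no shift, comparison or quantisation is needed when a prediction unit (PU) is processed.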
Other arrangements may produce different mappings due to rounding differences during the halving of the inverse angle. Such arrangements produce different output from the quantisation but retain the property of using only pre-existing values for the angle parameter and inverse angle. For example, some arrangements may always quantise angle parameters downwards and inverse angle parameters upwards, or vice versa.
At the angle parameter below minimum test step 1120, the processor 205 is used to test if the angle parameter (after doubling) is lower than negative thirty two (−32). Alternatively, the step 1120 may also test if the angle parameter is less than negative sixteen (−16) prior to doubling, which produces the same result as testing if the angle parameter (after doubling) is lower than negative thirty two (−32). Cases where the low threshold exceeded condition is met correspond to predominantly horizontal intra-prediction modes fifteen (15) to seventeen (17). If the low threshold is exceeded, then the method 1100 proceeds to adjust angle and direction step 1122. Otherwise, the method 1100 proceeds to step 1124.
At step 1122, the processor 205 is used for adjusting the angle parameter, the inverse angle and the threshold. The step 1122 sets the angle parameter and inverse angle to correspond to the angle parameter and inverse angle of intra-prediction modes eighteen (18) to twenty (20). The step 1122 also adjusts the threshold from eighteen (18) to fifteen (15). The intra-prediction modes eighteen (18) to twenty (20) are predominantly vertical and thus the adjusted threshold results in the direction of the intra-prediction modes fifteen (15), sixteen (16) and seventeen (17) changing from predominantly horizontal to predominantly vertical.
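The test of step 1120 and the adjustment of step 1122 may be sketched as follows in Python. The names are hypothetical. The mode-to-angle values are an assumption not listed in this section, as is the one-to-one correspondence of modes fifteen (15), sixteen (16) and seventeen (17) to modes eighteen (18), nineteen (19) and twenty (20) respectively.

```python
# Illustrative sketch of steps 1120 and 1122 (names are hypothetical).

MIN_ANGLE = -32
DEFAULT_THRESHOLD = 18   # change threshold before adjustment
ADJUSTED_THRESHOLD = 15  # change threshold after step 1122

# Assumed angle parameters for modes 11 to 20.
ANGLE_FOR_MODE = {11: -2, 12: -5, 13: -9, 14: -13, 15: -17,
                  16: -21, 17: -26, 18: -32, 19: -26, 20: -21}
# Assumed inverse angles for the target modes 18 to 20.
INVERSE_FOR_MODE = {18: -256, 19: -315, 20: -390}

def below_minimum(angle_param):
    """Step 1120: test after doubling (same as angle_param < -16)."""
    return (angle_param << 1) < MIN_ANGLE

def adjust_angle_and_direction(mode):
    """Step 1122: adopt the parameters of modes 18 to 20 and move the threshold."""
    target = mode + 3  # assumed mapping 15 -> 18, 16 -> 19, 17 -> 20
    return ANGLE_FOR_MODE[target], INVERSE_FOR_MODE[target], ADJUSTED_THRESHOLD

def is_predominantly_vertical(mode, threshold):
    """Modes at or above the change threshold are predominantly vertical."""
    return mode >= threshold

modes_below = [mode for mode in range(11, 18)
               if below_minimum(ANGLE_FOR_MODE[mode])]
```

With the adjusted threshold, `is_predominantly_vertical(15, ADJUSTED_THRESHOLD)` is true, whereas the same mode is predominantly horizontal under the default threshold of eighteen (18).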
As described above, the change threshold indicates the boundary between the predominantly horizontal intra-prediction modes and the predominantly vertical intra-prediction modes. The change threshold has a value of eighteen (18) for the luma sample grid (and for a chroma channel when a chroma format other than 4:2:2 is in use). For a chroma channel when a 4:2:2 chroma format is in use, the change threshold has a value of fifteen (15). As also described above, the change threshold may be modified if the magnitude of the adjusted intra-prediction angle parameter exceeds a predetermined value (e.g., 32).
At generate reference samples step 1124, the reference sample generator 1008, under execution of the processor 205, generates reference samples using the angle parameter and inverse angle (if defined). The adjusted threshold may also be used for generating reference samples.
Then at generate intra-predicted samples step 1126, the sample block generator 1010, under execution of the processor 205, is used for generating the prediction unit (PU) 464, in the form of an array of intra-predicted samples, using the reference samples determined at step 1124 and the angle parameter 1024. The intra-predicted samples are generated at step 1126 using the vertical intra-prediction mode according to the intra-prediction angle adjusted at either of steps 1118 and 1122, and according to the change threshold described above.
Arrangements that accord with either of the tables of
Arrangements that accord with any of the tables of
In one arrangement, the intra-prediction mode may be remapped to an adjusted intra-prediction mode, prior to determining the angle parameter and the inverse angle. The adjusted intra-prediction mode is used in the steps 1104-1126 of the method 1100. The remapping introduces an intermediate (i.e. ‘adjusted’) intra-prediction mode and requires an additional table, with an additional table look-up to perform the remapping (e.g., as part of the step 1102). In arrangements where the intra-prediction mode is remapped to an adjusted intra-prediction mode, further adjustment of the angle parameter and the inverse angle is not required and thus the steps 1105-1122 are not required.
Arrangements described herein permit implementations of the video encoder 114 and the video decoder 134 to support the 4:2:2 chroma format with reduced complexity, while maintaining high coding efficiency for intra-predicted prediction units (PUs).
The arrangements described are applicable to the computer and data processing industries and particularly for the digital signal processing for the encoding and decoding of signals such as video signals.
The foregoing describes only some embodiments of the present invention, and modifications and/or changes can be made thereto without departing from the scope and spirit of the invention, the embodiments being illustrative and not restrictive.
In the context of this specification, the word “comprising” means “including principally but not necessarily solely” or “having” or “including”, and not “consisting only of”. Variations of the word “comprising”, such as “comprise” and “comprises” have correspondingly varied meanings.
Number | Date | Country | Kind |
---|---|---|---|
2013202653 | Apr 2013 | AU | national |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/AU2014/000367 | 4/4/2014 | WO | 00 |