1. Technical Field of the Invention
The invention relates generally to digital video processing; and, more particularly, it relates to performing decoding side intra-prediction derivation in accordance with such digital video processing.
2. Description of Related Art
Communication systems that operate to communicate digital media (e.g., images, video, data, etc.) have been under continual development for many years. With respect to such communication systems employing some form of video data, a number of digital images are output or displayed at some frame rate (e.g., frames per second) to effectuate a video signal suitable for output and consumption. Within many such communication systems operating using video data, there can be a trade-off between throughput (e.g., number of image frames that may be transmitted from a first location to a second location) and video or image quality of the signal eventually to be output or displayed. The present art does not adequately or acceptably provide a means by which video data may be transmitted from a first location to a second location in accordance with providing an adequate or acceptable video or image quality, ensuring a relatively low amount of overhead associated with the communications, etc.
The present invention is directed to apparatus and methods of operation that are further described in the following Brief Description of the Several Views of the Drawings, the Detailed Description of the Invention, and the claims. Other features and advantages of the present invention will become apparent from the following detailed description of the invention made with reference to the accompanying drawings.
Within many devices that use digital media such as digital video, respective images thereof, being digital in nature, are represented using pixels. Within certain communication systems, digital media can be transmitted from a first location to a second location at which such media can be output or displayed. The goal of digital communications systems, including those that operate to communicate digital video, is to transmit digital data from one location, or subsystem, to another either error free or with an acceptably low error rate. As shown in
Referring to
To reduce transmission errors that may undesirably be incurred within a communication system, error correction and channel coding schemes are often employed. Generally, these error correction and channel coding schemes involve the use of an encoder at the transmitter end of the communication channel 199 and a decoder at the receiver end of the communication channel 199.
Any of various types of ECC codes described can be employed within any such desired communication system (e.g., including those variations described with respect to
Generally speaking, when considering a communication system in which video data is communicated from one location, or subsystem, to another, video data encoding may generally be viewed as being performed at a transmitting end of the communication channel 199, and video data decoding may generally be viewed as being performed at a receiving end of the communication channel 199.
Also, while the embodiment of this diagram shows bi-directional communication being capable between the communication devices 110 and 120, it is of course noted that, in some embodiments, the communication device 110 may include only video data encoding capability, and the communication device 120 may include only video data decoding capability, or vice versa (e.g., in a uni-directional communication embodiment such as in accordance with a video broadcast embodiment).
Referring to the communication system 200 of
Within each of the transmitter 297 and the receiver 298, any desired integration of various components, blocks, functional blocks, circuitries, etc. therein may be implemented. For example, this diagram shows a processing module 280a as including the encoder and symbol mapper 220 and all associated, corresponding components therein, and a processing module 280b is shown as including the metric generator 270 and the decoder 280 and all associated, corresponding components therein. Such processing modules 280a and 280b may be respective integrated circuits. Of course, other boundaries and groupings may alternatively be performed without departing from the scope and spirit of the invention. For example, all components within the transmitter 297 may be included within a first processing module or integrated circuit, and all components within the receiver 298 may be included within a second processing module or integrated circuit. Alternatively, any other combination of components within each of the transmitter 297 and the receiver 298 may be made in other embodiments.
As with the previous embodiment, such a communication system 200 may be employed for the communication of video data from one location, or subsystem, to another (e.g., from transmitter 297 to the receiver 298 via the communication channel 299).
Digital image processing of digital images (including the respective images within a digital video signal) may be performed by any of the various devices depicted below in
In the current video coding standard AVC/H.264 (as described in reference [1]), each respective pixel composing a picture (image) is predicted through either inter or intra-prediction.
The intra-prediction processing approach described herein and in accordance with the principles described herein involves creating a prediction of pixels within a current block of pixels using neighboring pixels from that very same picture (image). For comparison, inter prediction (as opposed to intra-prediction) involves predicting pixels of the current block using pixels from previously decoded pictures (images) and not those within the very same picture (image) (e.g., motion compensated prediction).
For luma intra-prediction, pixels within respective blocks of a picture (image) can be predicted using blocks of pixels including 4×4 and 8×8 sized prediction blocks as shown in Table 1, or 16×16 sized prediction blocks as shown in Table 2, or generally N×N (where N is an integer) sized prediction blocks. Analogously, chroma blocks may be predicted on 8×8 sized prediction blocks as shown in Table 3 or generally M×M (where M is an integer) prediction blocks.
As shown below within Table 1, both 4×4 and 8×8 luma prediction use one of nine different prediction modes (shown as varying from 0 to 8). Other than DC prediction, each of these respective modes can be considered a directional propagation of the top and/or left pixels to fill the current block and is associated with a corresponding prediction vector.
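By way of a non-limiting illustration, the directional propagation described above may be sketched as follows for a 4×4 block. The sketch assumes the AVC/H.264 4×4 luma numbering in which mode 0 is vertical, mode 1 is horizontal, and mode 2 is DC; the function name `predict_4x4` and its arguments are hypothetical, and the remaining six directional modes are omitted.

```python
def predict_4x4(mode, top, left):
    """Return a 4x4 predicted block (list of rows) from neighboring pixels.

    top  -- the 4 decoded pixels directly above the block
    left -- the 4 decoded pixels directly to the left of the block
    Mode numbering is an assumption: 0 = vertical, 1 = horizontal, 2 = DC.
    """
    if mode == 0:
        # Vertical: each column repeats the pixel above it.
        return [list(top) for _ in range(4)]
    if mode == 1:
        # Horizontal: each row repeats the pixel to its left.
        return [[left[r]] * 4 for r in range(4)]
    if mode == 2:
        # DC: every pixel is the rounded mean of the 8 neighbors.
        dc = (sum(top) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("only modes 0-2 are sketched here")
```

For example, with top pixels 10, 20, 30, 40 and left pixels 1, 2, 3, 4, the vertical mode copies the top row into all four rows, while the DC mode fills the block with the single value 14.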
In one embodiment, for each respective block, the encoder (e.g., included within the source or transmitting device from which the video data is being sent) must select an operational mode and transmit these selected operational modes to the decoder (e.g., included within the recipient or receiving device to which the video data is sent for use in being output or displayed). The number of bits needed to transmit this associated information (e.g., overhead corresponding to the selected operational modes) can be significant and can compete with the throughput of the actual video data from the encoder to the decoder. This can be especially deleterious when coding efficiency is a primary concern. That is to say, every non-video data bit that must be transmitted from the encoder to the decoder competes with the video data itself, and can reduce the overall coding efficiency of the communication system operating to communicate the video data from a first location to a second location. Presently, video coding experts are working together on a more advanced standard called HEVC (or MPEG-H) aiming for 50% efficiency improvement relative to AVC/H.264. There are several new intra-prediction methods that have been proposed during this development. Two of those methods have been documented (as described in reference [2]) and implemented in the HEVC software model (as described in reference [3]). While these attempts do have some promising advantages, one being the extension of the number of intra-prediction directions of AVC/H.264 from 9 modes to 33 or more different modes, they nonetheless incur a significant increase in overhead. 
That is to say, by adding this large number of new operational modes (e.g., increasing from 9 to 33), the number of bits necessary to describe the intra-prediction direction for a given block (or prediction unit (PU)) will also be significantly increased thereby also reducing the overall coding efficiency of the communication system operating to communicate the video data from a first location to a second location.
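The growth in overhead can be made concrete with a rough, illustrative calculation. The sketch below assumes a simple fixed-length code for the mode index and one mode per block; practical codecs generally predict and entropy-code the mode, so actual costs would typically be lower. The function names and the 1920×1080 frame with 4×4 blocks are assumptions chosen only for illustration.

```python
def fixed_length_mode_bits(num_modes):
    """Bits needed to index num_modes modes with a fixed-length code."""
    if num_modes < 1:
        raise ValueError("num_modes must be at least 1")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, using integers only.
    return (num_modes - 1).bit_length()


def mode_overhead_bits(width, height, block_size, num_modes):
    """Mode-signaling bits per picture, assuming one mode per block."""
    blocks = (width // block_size) * (height // block_size)
    return blocks * fixed_length_mode_bits(num_modes)


bits_9 = mode_overhead_bits(1920, 1080, 4, 9)    # 9 modes, 4 bits each
bits_33 = mode_overhead_bits(1920, 1080, 4, 33)  # 33 modes, 6 bits each
```

Under these assumptions, 9 modes require 4 bits per block and 33 modes require 6 bits per block, so the per-picture mode overhead grows from 518,400 bits to 777,600 bits.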
While including more directional modes could in fact help improve prediction, the additional overhead bits required to code these extra intra-prediction modes can directly compete with the overall coding efficiency of the communication system. As such, it can be seen that introducing additional control or overhead bits competes with actual video data to be transmitted from a first location to a second location for output or display.
As described above, one approach for performing communication of video data from a first location or device including an encoder to a second location or device is for the encoder to select the intra-prediction mode and communicate this information to the decoder using extra bits that cut into coding efficiency (e.g., compete with the available bandwidth that may alternatively be used to communicate actual video data). At the receiving end of the communication channel, the decoder does relatively little in comparison and simply uses the selected mode (as received from the encoder) to generate its prediction.
However, as a digital picture (image) is decoded on the receiving end, the decoder has a great deal of information within that very same digital picture (image) from previously decoded pixels located to the left and/or above the current block. As may be seen herein, a decoder can use this just decoded information from within a given digital picture (image) to search on its own for the most likely intra-prediction direction without necessitating such direction and control from the encoder. For example,
A processing module, such as included within a device operative to perform video and/or image processing, can operate by calculating a first plurality of pixels (e.g., current region) using a plurality of prediction vectors extending into the first plurality of pixels (e.g., current region) from a second plurality of pixels located to the left and/or above the first plurality of pixels (e.g., current region) within an image of the video data. As may be seen, just decoded pixels within a same picture (image) may be used to decode other pixels within that very same picture (image). The processing module may then operate by constructing the image of the video data using both the first plurality of pixels (e.g., pixels within the current region that have been most recently decoded) and the second plurality of pixels (e.g., just decoded pixels within that very same picture (image) to the left and/or above).
In many cases, the spatial redundancy that may exist within a coded picture (image) is significant enough to allow a relatively accurate guess at which prediction direction or which prediction vector is most likely going to work well in the next ‘current’ block to be decoded. A decoder can perform this task and reduce or eliminate the need for the encoder to send so many intra-prediction mode bits (e.g., as control or overhead bits).
In all such cases, the encoder can perform the exact same or similar search as the decoder in order to know exactly which direction the decoder will ultimately select. Within such systems, it is noted that both the encoder and decoder should be in synchronization and be on the same page in this respect so that they both generate identical predicted pixels.
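One possible way to perform such a search, offered only as a minimal sketch and not as the specific search of any embodiment, is to test how well each candidate direction predicts the already decoded pixels bordering the current block and to keep the direction with the lowest cost. The candidate set `CANDIDATES`, the function name `derive_direction`, and the sum-of-absolute-differences cost are all assumptions made for illustration. Because the function reads only previously decoded pixels and breaks ties deterministically, an encoder and a decoder running it on identical reconstructed pixels would arrive at the same direction.

```python
# Hypothetical candidate set: each (dy, dx) step points back toward
# previously decoded pixels (up, left, or up-left).
CANDIDATES = {
    "vertical": (-1, 0),
    "horizontal": (0, -1),
    "diagonal": (-1, -1),
}


def derive_direction(img, r0, c0, n):
    """Pick the candidate that best predicts the decoded border of a block.

    img    -- 2D list of reconstructed pixels
    r0, c0 -- top-left corner of the n x n current block (both >= 2)
    Only pixels above or to the left of the current block are read.
    """
    if r0 < 2 or c0 < 2:
        raise ValueError("this sketch needs two decoded rows and columns")
    # Template: the decoded row just above and column just left of the block.
    template = [(r0 - 1, c) for c in range(c0, c0 + n)]
    template += [(r, c0 - 1) for r in range(r0, r0 + n)]
    best_name, best_cost = None, None
    for name, (dy, dx) in CANDIDATES.items():
        # Cost: how poorly this direction predicts the template pixels
        # from decoded pixels one step further back.
        cost = sum(abs(img[r][c] - img[r + dy][c + dx]) for r, c in template)
        if best_cost is None or cost < best_cost:  # first minimum wins ties
            best_name, best_cost = name, cost
    return best_name
```

For a picture whose columns are constant (vertical stripes), the vertical candidate has zero cost on the template and is selected; for constant rows, the horizontal candidate is selected.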
With respect to performing communication of video data from a first location or device to a second location or device, there are at least three approaches to be considered in accordance with the various aspects and principles of the invention.
Also, because the decoder will be making some or all of the intra-prediction mode decisions, there is no real reason to restrict the decoder to select only a single direction per block. As mentioned above, within a given block or group of pixels, each respective prediction vector may be employed for calculating a respective one pixel. Alternatively, each respective prediction vector may be employed for calculating more than one respective pixel (e.g., a single prediction vector may be employed for predicting a plurality of pixels). In accordance with the AVC/H.264 and the proposals for HEVC/MPEG-H, only one mode per block is used because it is too expensive to explicitly transmit modes (e.g., overhead) for each individual pixel.
However, if these modes are instead determined on the decoder side using DIPD in accordance with the various aspects and principles of the invention, each pixel or region could be predicted along independent directions (e.g., such as in accordance with the embodiment depicted in
It is also noted that the shapes of such ‘current regions’ and the pixels associated therewith can have any desired shape including having pixels that are non-contiguous with respect to each other. For example, a ‘current region’ may be composed of any desired shape such as a checker board shape, non-contiguous or non-adjacent respective groups of pixels, etc.
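As a hedged illustration of per-pixel prediction over an arbitrarily shaped region, the sketch below represents the current region as a set of pixel coordinates (which need not be contiguous) and gives each pixel its own prediction vector. Each vector is traced back until it leaves the region and lands on an already decoded pixel. The function name `predict_region`, the data layout, and the trace-back rule are assumptions for illustration only.

```python
def predict_region(img, region, vectors):
    """Fill each pixel of `region` by tracing its own vector to a decoded pixel.

    img     -- 2D list of pixels; entries outside `region` are already decoded
    region  -- set of (row, col) pixels to predict; need not be contiguous
    vectors -- dict mapping each region pixel to a (dy, dx) step that points
               up and/or left, toward previously decoded pixels
    """
    out = [row[:] for row in img]  # leave the input picture untouched
    for (r, c) in region:
        dy, dx = vectors[(r, c)]
        if (dy, dx) == (0, 0):
            raise ValueError("a prediction vector must be non-zero")
        sr, sc = r + dy, c + dx
        while (sr, sc) in region:  # keep stepping until we leave the region
            sr, sc = sr + dy, sc + dx
        if sr < 0 or sc < 0 or sr >= len(img) or sc >= len(img[0]):
            raise IndexError("vector left the picture before reaching a decoded pixel")
        out[r][c] = img[sr][sc]
    return out
```

Here three diagonally placed pixels, none adjacent along a row or column, could each use a different vector: one copying from above, one copying from the left, and one tracing diagonally through the other region pixels until it reaches a decoded pixel.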
Referring to method 900 of
Referring to method 901 of
The method 901 continues by refining the coarse intra-prediction mode information in accordance with determining each respective prediction vector of a subset of a plurality of prediction vectors corresponding to the subset of the plurality of pixels, as shown in a block 931.
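The refinement of coarse intra-prediction mode information might be sketched as follows, under the assumption (not stated in the text) that fine modes are numbered so that each coarse mode covers a contiguous group of fine modes. The function name `refine_mode`, the parameter `fine_per_coarse`, and the caller-supplied `cost` function (for instance, a cost computed from previously decoded pixels) are hypothetical.

```python
def refine_mode(coarse_mode, fine_per_coarse, cost):
    """Return the lowest-cost fine mode within the signaled coarse mode.

    coarse_mode     -- coarse mode index received from the encoder
    fine_per_coarse -- assumed number of fine modes grouped under each coarse mode
    cost            -- function mapping a fine mode index to a cost that the
                       decoder can compute from already decoded pixels
    """
    start = coarse_mode * fine_per_coarse
    candidates = range(start, start + fine_per_coarse)
    # min() returns the first minimum, so ties resolve deterministically and
    # an encoder running the same search selects the same fine mode.
    return min(candidates, key=cost)
```

With four fine modes per coarse mode, a signaled coarse mode of 3 limits the search to fine modes 12 through 15, so the encoder pays only for the coarse index while the decoder resolves the rest.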
Referring to method 1000 of
The method 1000 then operates by employing the refined intra-prediction mode information in accordance with determining each respective prediction vector of the subset of the plurality of prediction vectors corresponding to the subset of the plurality of pixels, as shown in a block 1030.
Referring to method 1001 of
Referring to method 1100 of
Referring to method 1101 of
It is noted that the various modules and/or circuitries (e.g., for video processing, encoding and/or decoding, etc.) described herein may be a single processing device or a plurality of processing devices. Such a processing device may be a microprocessor, micro-controller, digital signal processor, microcomputer, central processing unit, field programmable gate array, programmable logic device, state machine, logic circuitry, analog circuitry, digital circuitry, and/or any device that manipulates signals (analog and/or digital) based on operational instructions. The operational instructions may be stored in a memory. The memory may be a single memory device or a plurality of memory devices. Such a memory device may be a read-only memory (ROM), random access memory (RAM), volatile memory, non-volatile memory, static memory, dynamic memory, flash memory, and/or any device that stores digital information. It is also noted that when the processing module implements one or more of its functions via a state machine, analog circuitry, digital circuitry, and/or logic circuitry, the memory storing the corresponding operational instructions is embedded with the circuitry comprising the state machine, analog circuitry, digital circuitry, and/or logic circuitry. In such an embodiment, a memory stores, and a processing module coupled thereto executes, operational instructions corresponding to at least some of the steps and/or functions illustrated and/or described herein.
It is also noted that any of the connections or couplings between the various modules, circuits, functional blocks, components, devices, etc. within any of the various diagrams or as described herein may be differently implemented in different embodiments. For example, in one embodiment, such connections or couplings may be direct connections or direct couplings there between. In another embodiment, such connections or couplings may be indirect connections or indirect couplings there between (e.g., with one or more intervening components there between). Of course, certain other embodiments may have some combinations of such connections or couplings therein such that some of the connections or couplings are direct, while others are indirect. Different implementations may be employed for effectuating communicative coupling between modules, circuits, functional blocks, components, devices, etc. without departing from the scope and spirit of the invention.
As one of average skill in the art will appreciate, the term “substantially” or “approximately”, as may be used herein, provides an industry-accepted tolerance to its corresponding term. Such an industry-accepted tolerance ranges from less than one percent to twenty percent and corresponds to, but is not limited to, component values, integrated circuit process variations, temperature variations, rise and fall times, and/or thermal noise. As one of average skill in the art will further appreciate, the term “operably coupled”, as may be used herein, includes direct coupling and indirect coupling via another component, element, circuit, or module where, for indirect coupling, the intervening component, element, circuit, or module does not modify the information of a signal but may adjust its current level, voltage level, and/or power level. As one of average skill in the art will also appreciate, inferred coupling (i.e., where one element is coupled to another element by inference) includes direct and indirect coupling between two elements in the same manner as “operably coupled”. As one of average skill in the art will further appreciate, the term “compares favorably”, as may be used herein, indicates that a comparison between two or more elements, items, signals, etc., provides a desired relationship. For example, when the desired relationship is that signal 1 has a greater magnitude than signal 2, a favorable comparison may be achieved when the magnitude of signal 1 is greater than that of signal 2 or when the magnitude of signal 2 is less than that of signal 1.
Various aspects of the present invention have also been described above with the aid of method steps illustrating the performance of specified functions and relationships thereof. The boundaries and sequence of these functional building blocks and method steps have been arbitrarily defined herein for convenience of description. Alternate boundaries and sequences can be defined so long as the specified functions and relationships are appropriately performed. Any such alternate boundaries or sequences are thus within the scope and spirit of the claimed invention.
Various aspects of the present invention have been described above with the aid of functional building blocks illustrating the performance of certain significant functions. The boundaries of these functional building blocks have been arbitrarily defined for convenience of description. Alternate boundaries could be defined as long as the certain significant functions are appropriately performed. Similarly, flow diagram blocks may also have been arbitrarily defined herein to illustrate certain significant functionality. To the extent used, the flow diagram block boundaries and sequence could have been defined otherwise and still perform the certain significant functionality. Such alternate definitions of both functional building blocks and flow diagram blocks and sequences are thus within the scope and spirit of the claimed invention.
One of average skill in the art will also recognize that the functional building blocks, and other illustrative blocks, modules and components herein, can be implemented as illustrated or by discrete components, application specific integrated circuits, processors executing appropriate software and the like or any combination thereof.
Moreover, although described in detail for purposes of clarity and understanding by way of the aforementioned embodiments, various aspects of the present invention are not limited to such embodiments. It will be obvious to one of average skill in the art that various changes and modifications may be practiced within the spirit and scope of the invention, as limited only by the scope of the appended claims.
The present U.S. Utility Patent Application claims priority pursuant to 35 U.S.C. §119(e) to the following U.S. Provisional Patent Application which is hereby incorporated herein by reference in its entirety and made part of the present U.S. Utility Patent Application for all purposes: 1. U.S. Provisional Application Ser. No. 61/408,647, entitled “Decoding side intra-prediction derivation for video coding,” (Attorney Docket No. BP22257), filed Oct. 31, 2010, pending.
Number | Date | Country
---|---|---
61408647 | Oct 2010 | US