Techniques and tools for video acceleration of in-loop filtering for interlaced video frames are described herein. Efficient, concise protocols for in-loop filter control information, for example, simplify implementation in encoders, decoders and video accelerators, and reduce the amount of control information that is signaled. In some cases, the protocols reuse the syntax and/or data structures from filter control information for progressive video frames, which further simplifies implementation.
Various alternatives to the implementations described herein are possible. For example, certain techniques described with reference to flowchart diagrams can be altered by changing the ordering of stages shown in the flowcharts, by repeating or omitting certain stages, etc., while achieving the same result. As another example, although some implementations are described with reference to specific macroblock formats, other formats also can be used. Different embodiments implement one or more of the described techniques and tools. Some of the techniques and tools described herein address one or more of the problems noted in the Background. Typically, a given technique/tool does not solve all such problems, however.
With reference to
A computing environment may have additional features. For example, the computing environment (600) includes storage (640), one or more input devices (650), one or more output devices (660), and one or more communication connections (670). An interconnection mechanism (not shown) such as a bus, controller, or network interconnects the components of the computing environment (600). Typically, operating system software (not shown) provides an operating environment for other software executing in the computing environment (600), and coordinates activities of the components of the computing environment (600).
The storage (640) may be removable or non-removable, and includes magnetic disks, magnetic tapes or cassettes, CD-ROMs, DVDs, or any other medium which can be used to store information and which can be accessed within the computing environment (600). The storage (640) stores instructions for the software (680).
The input device(s) (650) may be a touch input device such as a keyboard, mouse, pen, or trackball, a voice input device, a scanning device, or another device that provides input to the computing environment (600). For audio or video encoding, the input device(s) (650) may be a sound card, video card, TV tuner card, or similar device that accepts audio or video input in analog or digital form, or a CD-ROM or CD-RW that reads audio or video samples into the computing environment (600). The output device(s) (660) may be a display, printer, speaker, CD-writer, or another device that provides output from the computing environment (600).
The communication connection(s) (670) enable communication over a communication medium to another computing entity. The communication medium conveys information such as computer-executable instructions, audio or video input or output, or other data in a modulated data signal. A modulated data signal is a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media include wired or wireless techniques implemented with an electrical, optical, RF, infrared, acoustic, or other carrier.
The techniques and tools can be described in the general context of computer-readable media. Computer-readable media are any available media that can be accessed within a computing environment. By way of example, and not limitation, with the computing environment (600), computer-readable media include memory (620), storage (640), communication media, and combinations of any of the above.
The techniques and tools can be described in the general context of computer-executable instructions, such as those included in program modules, being executed in a computing environment on a target real or virtual processor. Generally, program modules include routines, programs, libraries, objects, classes, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The functionality of the program modules may be combined or split between program modules as desired in various embodiments. Computer-executable instructions for program modules may be executed within a local or distributed computing environment.
For the sake of presentation, the detailed description uses terms like “decide,” “make” and “get” to describe computer operations in a computing environment. These terms are high-level abstractions for operations performed by a computer, and should not be confused with acts performed by a human being. The actual computer operations corresponding to these terms vary depending on implementation.
The relationships shown between modules within the decoder (700) indicate general flows of information in the decoder; other relationships are not shown for the sake of simplicity. In particular, while a decoder host performs some operations of modules of the decoder (700), a video accelerator performs other operations (such as inverse frequency transforms, fractional sample interpolation, motion compensation, in-loop deblocking filtering, color conversion, post-processing filtering and/or picture re-sizing). For example, the decoder (700) passes instructions and information to the video accelerator as described in “DirectX VA: Video Acceleration API/DDI,” version 1.01. Alternatively, the decoder (700) passes instructions and information to the video accelerator using another mechanism, such as one described in a later version of DXVA or another acceleration interface. In general, once the video accelerator reconstructs video information, it maintains some representation of the video information rather than passing information back. For example, after a video accelerator reconstructs an output picture, the accelerator stores it in a picture store, such as one in memory associated with a GPU, for use as a reference picture. The accelerator then performs in-loop deblock filtering and fractional sample interpolation on the picture in the picture store.
In some implementations, different video acceleration profiles result in different operations being offloaded to a video accelerator. For example, one profile may only offload out-of-loop, post-decoding operations, while another profile offloads in-loop filtering, fractional sample interpolation and motion compensation as well as the post-decoding operations. Still another profile can further offload frequency transform operations. In still other cases, different profiles each include operations not in any other profile.
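As a rough illustration of how profiles could partition work between the decoder host and the video accelerator, the following Python sketch maps profile names to sets of offloaded operations. The profile names and the grouping of operations into sets are assumptions made for the example, not identifiers from DXVA or any other acceleration interface.

```python
# Hypothetical operation groups; which operations count as "post-decoding"
# is an assumption made for this sketch.
POST_DECODING = frozenset(
    {"color conversion", "post-processing filtering", "picture re-sizing"}
)
IN_LOOP = frozenset(
    {"in-loop filtering", "fractional sample interpolation", "motion compensation"}
)
TRANSFORM = frozenset({"inverse frequency transform"})

# Hypothetical profile names, each offloading a superset of the previous one.
PROFILES = {
    "post_only": POST_DECODING,
    "in_loop": POST_DECODING | IN_LOOP,
    "in_loop_and_transform": POST_DECODING | IN_LOOP | TRANSFORM,
}


def runs_on_accelerator(profile: str, operation: str) -> bool:
    """Return True if the profile offloads the operation to the accelerator."""
    return operation in PROFILES[profile]


def host_operations(profile: str, operations: list[str]) -> list[str]:
    """Return the operations that stay on the decoder host for this profile."""
    return [op for op in operations if not runs_on_accelerator(profile, op)]
```

With these hypothetical profiles, in-loop filtering stays on the host under "post_only" and moves to the accelerator under "in_loop".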
Returning to
The decoder (700) receives information (795) for a compressed sequence of video pictures and produces output including a reconstructed picture (705) (e.g., progressive video frame, interlaced video frame, or field of an interlaced video frame). The decoder (700) decompresses predicted pictures and key pictures. For the sake of presentation,
A demultiplexer (790) receives the information (795) for the compressed video sequence and makes the received information available to the entropy decoder (780). The entropy decoder (780) entropy decodes entropy-coded quantized data as well as entropy-coded side information, typically applying the inverse of entropy encoding performed in the encoder. A motion compensator (730) applies motion information (715) to one or more reference pictures (725) to form motion-compensated predictions (735) of subblocks, blocks and/or macroblocks of the picture (705) being reconstructed. One or more picture stores store previously reconstructed pictures for use as reference pictures.
The decoder (700) also reconstructs prediction residuals. An inverse quantizer (770) inverse quantizes entropy-decoded data. An inverse frequency transformer (760) converts the inverse-quantized, frequency-domain data into spatial-domain video information. For example, the inverse frequency transformer (760) applies an inverse block transform to subblocks and/or blocks of the frequency transform coefficients, producing sample data or prediction residual data for key pictures or predicted pictures, respectively. The inverse frequency transformer (760) may apply an 8×8, 8×4, 4×8, 4×4, or other size inverse frequency transform.
For a predicted picture, the decoder (700) combines reconstructed prediction residuals (745) with motion-compensated predictions (735) to form the reconstructed picture (705). A motion compensation loop in the video decoder (700) includes an adaptive deblocking filter (723). The decoder (700) applies in-loop filtering (723) to the reconstructed picture to adaptively smooth discontinuities across block/subblock boundary rows and/or columns in the picture. The decoder stores the reconstructed picture in a picture buffer (720) for use as a possible reference picture. For example, the decoder (700) performs in-loop deblock filtering operations as described in U.S. Patent Application Publication No. US-2005-0084012-A1, entitled “IN-LOOP DEBLOCKING FOR INTERLACED VIDEO.” Alternatively, the decoder (700) performs in-loop deblock filtering operations using another mechanism.
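The combination of prediction and residual can be sketched in a few lines of Python. This is a minimal illustration only: the function name is hypothetical, and clipping the sum to the valid sample range for an assumed 8-bit depth is an assumption of the sketch rather than something stated above.

```python
def reconstruct_block(prediction, residual, bit_depth=8):
    """Add a residual block to a motion-compensated prediction block.

    Both arguments are lists of rows of integer samples with the same shape.
    Clipping to [0, 2**bit_depth - 1] is an assumption of this sketch.
    """
    max_value = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_value) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]
```

In-loop filtering would then run on the reconstructed samples before the picture is stored as a possible reference picture.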
Depending on implementation and the type of compression desired, modules of the decoder can be added, omitted, split into multiple modules, combined with other modules, and/or replaced with like modules. In alternative embodiments, encoders or decoders with different modules and/or other configurations of modules perform one or more of the described techniques. Specific embodiments of video decoders typically use a variation or supplemented version of the generalized decoder (700).
In-loop filtering operations for interlaced video content are typically different from, and more complex than, in-loop filtering operations for progressive video content. In some implementations, aside from the use of variable transform sizes (such as 8×8, 8×4, 4×8 and 4×4), the macroblocks of an interlaced video frame can be organized as frames or fields for encoding (see
In some embodiments, an encoder/decoder and video accelerator redefine an existing progressive mode protocol for interlaced frame modes. For example, the encoder/decoder and video accelerator use the LOOPF_FLAG structure and syntax described above for signaling purposes but redefine the semantics to suit in-loop filtering for interlaced video frames. The LOOPF_FLAG structure and syntax are thus universal for all frame modes of such a codec. Alternatively, the encoder/decoder and video accelerator use different data structures and syntax to signal in-loop filtering control information in progressive mode, interlaced field mode and/or interlaced frame mode.
A. Example In-Loop Filtering Control Information for Interlaced Frames.
For a block of a progressive video frame, the significance of the bits of a LOOPF_FLAG byte is explained above with reference to
With reference to
Bit 1 controls in-loop deblock filtering across the vertical edge at the left side of the 8×8 block (810) for samples of odd-numbered rows (namely, rows 1, 3, 5 and 7). In
Bit 2 controls in-loop deblock filtering across the horizontal edge at the top side of the 8×8 block (810) for samples of even-numbered rows. In
Bit 3 controls in-loop deblock filtering across the horizontal edge at the top side of the 8×8 block (810) for samples of odd-numbered rows. In
Bit 4 controls in-loop deblock filtering across the vertical edge in the middle of the 8×8 block (810) for samples of even-numbered rows (namely, rows 0, 2, 4, and 6). In
Bit 5 controls in-loop deblock filtering across the vertical edge in the middle of the 8×8 block (810) for samples of odd-numbered rows (namely, rows 1, 3, 5 and 7). In
Bit 6 controls in-loop deblock filtering across the horizontal edge in the middle of the 8×8 block (810) for samples of even-numbered rows. In
Bit 7 controls in-loop deblock filtering across the horizontal edge in the middle of the 8×8 block (810) for samples of odd-numbered rows. In
With the protocol described with reference to
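The bit assignments listed above can be sketched as a small decoding routine in Python. Bits 1 through 7 follow the descriptions above. Two points are assumptions of the sketch: bit 0 is not described in the text above and is taken, by symmetry with bit 1, to cover the left vertical edge for even-numbered rows, and bit 0 is taken to be the least significant bit of the LOOPF_FLAG byte. The edge labels and the function name are hypothetical.

```python
# (edge, row parity) controlled by each bit of a LOOPF_FLAG byte for a block
# of an interlaced video frame. Index in the tuple = bit number.
EDGE_BY_BIT = (
    ("left", "even"),               # bit 0: ASSUMED by symmetry with bit 1
    ("left", "odd"),                # bit 1: rows 1, 3, 5 and 7
    ("top", "even"),                # bit 2
    ("top", "odd"),                 # bit 3
    ("middle-vertical", "even"),    # bit 4: rows 0, 2, 4 and 6
    ("middle-vertical", "odd"),     # bit 5: rows 1, 3, 5 and 7
    ("middle-horizontal", "even"),  # bit 6
    ("middle-horizontal", "odd"),   # bit 7
)


def decode_loopf_flag(flag_byte: int) -> list[tuple[str, str]]:
    """Return the (edge, row parity) pairs whose filtering is switched on."""
    if not 0 <= flag_byte <= 0xFF:
        raise ValueError("LOOPF_FLAG must fit in one byte")
    # Bit 0 is treated as the least significant bit (an assumption).
    return [EDGE_BY_BIT[bit] for bit in range(8) if (flag_byte >> bit) & 1]
```

For example, under these assumptions a byte with only bits 1 and 2 set switches on filtering of the left edge for odd-numbered rows and of the top edge for even-numbered rows.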
B. Signaling In-Loop Filtering Control Information for Interlaced Frames.
The decoder parameterizes (910) one or more in-loop filtering decisions for a macroblock of an interlaced video frame, resulting in in-loop filtering control information for video acceleration. For example, from one or more decisions about which edges of blocks of the macroblock should be filtered, the decoder produces on/off control information for the edges. The control information can follow the protocol explained with reference to
The decoder then makes the control information available (920) to the video accelerator. For example, the decoder writes the control information to a buffer and, if appropriate, calls a method of the video acceleration interface to alert the video accelerator that control information is ready for processing. The video acceleration interface can follow DXVA guidelines or guidelines for another acceleration interface with buffers. Alternatively, the decoder uses a messaging mechanism or some other communications mechanism to make the control information available to the video accelerator.
The decoder makes (1010) one or more in-loop filtering decisions for a macroblock of an interlaced video frame. For example, the decoder applies the filtering decision criteria described in U.S. Patent Application Publication No. US-2005-0084012-A1, entitled “IN-LOOP DEBLOCKING FOR INTERLACED VIDEO.” Alternatively, the decoder makes the filtering decision(s) using other and/or additional criteria.
The decoder parameterizes (1020) the one or more in-loop filtering decisions as in-loop filtering control information for video acceleration. For example, from the one or more decisions, the decoder produces on/off control information indicating which edges of blocks of the macroblock should be filtered, following the protocol explained with reference to
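Parameterization can be pictured as packing per-edge on/off decisions into one byte per block. The Python sketch below is illustrative only: the edge labels and function names are hypothetical, the entry for bit 0 is assumed by symmetry with bit 1, bit 0 is assumed to be the least significant bit, and the number and order of blocks per macroblock are left to the caller.

```python
# Bit number for each (edge, row parity) pair of an 8x8 block.
BIT_FOR_EDGE = {
    ("left", "even"): 0,  # ASSUMED by symmetry with bit 1
    ("left", "odd"): 1,
    ("top", "even"): 2,
    ("top", "odd"): 3,
    ("middle-vertical", "even"): 4,
    ("middle-vertical", "odd"): 5,
    ("middle-horizontal", "even"): 6,
    ("middle-horizontal", "odd"): 7,
}


def pack_block_flags(edges_to_filter) -> int:
    """Turn a collection of (edge, parity) decisions into one flag byte."""
    flag = 0
    for edge in edges_to_filter:
        flag |= 1 << BIT_FOR_EDGE[edge]
    return flag


def pack_macroblock_flags(per_block_edges) -> bytes:
    """Pack one flag byte per block, in the order the blocks are given."""
    return bytes(pack_block_flags(edges) for edges in per_block_edges)
```

A block with no edges to filter packs to a zero byte, so a macroblock that needs no in-loop filtering produces an all-zero run of control information.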
The decoder buffers (1030) the control information. For example, the decoder writes the control information to a buffer that the decoder has reserved. The buffer may include other in-loop filtering control information and/or control information for other operations offloaded to the video accelerator. In some implementations, the decoder writes control information for a macroblock to the buffer (e.g., macroblock parameter information indicating intra/inter status, frame/field status, macroblock type, etc.; motion vector information such as the number of motion vectors; and information indicating which residuals have associated coefficient information in the bit stream), then writes the in-loop filtering control information to the buffer, then writes any residual or other transform coefficient data to a residual data buffer. Alternatively, the decoder uses more buffers (e.g., separate buffer for motion vector information) or fewer buffers for the control information.
The decoder then decides (1040) whether it should call a method of the video acceleration interface. If so, the decoder calls (1050) the method of the acceleration interface. Otherwise, the decoder continues with the next macroblock. In some implementations, for example, the decoder calls the method only after all of the control information and other information for a picture has been buffered. The decoder buffers picture parameters and buffers macroblock control information for the respective macroblocks, then calls the method when the information for the last macroblock (and its blocks) has been buffered. Alternatively, the decoder calls the method of the acceleration interface at some other interval, for example, on a slice-by-slice basis.
The decoder determines (1060) whether it is done and, if so, finishes. Otherwise, the decoder continues with the next macroblock. For example, the decoder determines whether there is another picture in a sequence to process, another slice in a picture to process, and so on.
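The decoder-side flow described above (decide, parameterize, buffer, then call the acceleration interface once per picture) can be sketched as follows in Python. All four callables are hypothetical stand-ins supplied by the caller. In particular, execute represents whatever method the acceleration interface exposes and is not a real DXVA call. The per-picture call interval is one of the options mentioned above; a slice-by-slice interval would change only the condition in the inner loop.

```python
def decode_pictures(pictures, decide, parameterize, execute):
    """Buffer in-loop filtering control information and submit it per picture.

    pictures:     list of pictures, each a list of macroblocks
    decide:       macroblock -> filtering decisions            (1010)
    parameterize: decisions -> bytes of control information    (1020)
    execute:      bytes -> None, hypothetical interface method (1050)
    """
    for macroblocks in pictures:  # loop ends when no pictures remain (1060)
        control_buffer = bytearray()
        for index, macroblock in enumerate(macroblocks):
            decisions = decide(macroblock)
            control_buffer += parameterize(decisions)  # buffer it (1030)
            # Call the interface only after the last macroblock (1040).
            if index == len(macroblocks) - 1:
                execute(bytes(control_buffer))
```

Passing a function that appends to a list as execute makes it easy to check that exactly one submission happens per picture.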
C. Transferring In-Loop Filtering Control Information for Interlaced Frames.
At some point prior to decoding, the operating system assists (1110) in the installation of a video decoder. For example, the operating system incorporates information for the video decoder in a system registry, exposes access to the video decoder through a menu and/or icons on a user interface, registers the decoder as an available decoder on the system, associates content types with the decoder, and/or helps the decoder negotiate capabilities with a video accelerator.
After decoding starts (1120), the operating system receives (1130) control information and other information in one or more buffers, including in-loop filtering control information, and invokes (1140) a method of an interface of a video accelerator. For example, a decoder writes the control information for a picture in buffer(s) as described above with reference to
The operating system determines (1150) whether it is done and, if so, finishes. Otherwise, the operating system waits, receiving (1130) information in the buffer (or a different buffer) and invoking (1140) the method of the video accelerator at appropriate times.
For the sake of simplicity,
D. Processing In-Loop Filtering Control Information for Interlaced Frames.
The video accelerator gets (1210) in-loop filtering control information that parameterizes one or more in-loop filtering decisions for a macroblock of an interlaced video frame. For example, the video accelerator reads the control information from a buffer when the video accelerator is alerted that control information is ready for processing. The video accelerator can receive the notification as a call to a method exposed through a DDI, according to a video acceleration interface that follows DXVA guidelines or guidelines for another acceleration interface with buffers. Alternatively, the video accelerator uses a messaging mechanism or some other communications mechanism to get the control information. The control information can follow the protocol explained with reference to
The video accelerator next performs (1220) in-loop filtering for the macroblock according to the control information. For example, for edges of the macroblock that are to be filtered, the video accelerator performs the filtering as described in U.S. Patent Application Publication No. US-2005-0084012-A1, entitled “IN-LOOP DEBLOCKING FOR INTERLACED VIDEO.” Alternatively, the video accelerator performs the filtering using other filtering rules.
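On the accelerator side, processing amounts to walking the control information and filtering only the edges that are switched on. The Python sketch below is illustrative and rests on stated assumptions: one flag byte per block, bit 0 as the least significant bit, and bit 0 covering the left edge for even-numbered rows (by symmetry with bit 1). The filter_edge callback is a hypothetical stand-in for the actual filtering rules, which are not reproduced here.

```python
# Bits come in (even, odd) pairs per edge: bits 0-1 left, 2-3 top,
# 4-5 middle vertical, 6-7 middle horizontal. The bit 0 entry is ASSUMED.
EDGES = ("left", "top", "middle-vertical", "middle-horizontal")
PARITIES = ("even", "odd")


def process_control_buffer(control_buffer: bytes, filter_edge) -> int:
    """Invoke filter_edge(block_index, edge, parity) for each enabled bit.

    Returns the number of edge filtering operations that were dispatched.
    """
    dispatched = 0
    for block_index, flag_byte in enumerate(control_buffer):
        for bit in range(8):
            if (flag_byte >> bit) & 1:
                # bit >> 1 selects the edge, bit & 1 selects the row parity.
                filter_edge(block_index, EDGES[bit >> 1], PARITIES[bit & 1])
                dispatched += 1
    return dispatched
```

A zero byte dispatches nothing, so blocks that need no filtering cost only the test of their flag byte.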
Having described and illustrated the principles of our invention with reference to various embodiments, it will be recognized that the various embodiments can be modified in arrangement and detail without departing from such principles. It should be understood that the programs, processes, or methods described herein are not related or limited to any particular type of computing environment, unless indicated otherwise. Various types of general purpose or specialized computing environments may be used with or perform operations in accordance with the teachings described herein. Elements of embodiments shown in software may be implemented in hardware and vice versa.
In view of the many possible embodiments to which the principles of the disclosed invention may be applied, it should be recognized that the illustrated embodiments are only preferred examples of the invention and should not be taken as limiting the scope of the invention. Rather, the scope of the invention is defined by the following claims. We therefore claim as our invention all that comes within the scope and spirit of these claims.