This invention relates generally to processing video.
Because of the need to transmit large amounts of data containing detailed information, it is desirable to conserve the available bandwidth of transport media. To this end, video information may be compressed using a variety of well-known compression techniques. Received video in compressed format may be decompressed. As a result, the video may be transmitted more compactly, enabling lower bandwidth transport media to be utilized while conserving the bandwidth of higher bandwidth transport media.
Several compression standards require a two-dimensional transformation of the data. This transformation is generally performed in one dimension at a time, with intermediate results stored in a transpose buffer or transpose random access memory (RAM). 8×8 blocks of picture elements (pels) may be processed as atomic units, or may be divided into 4×8, 8×4, or 4×4 sub-blocks for processing.
Thus, blocks of video data may be stored in transpose buffers in the course of coding and decoding. In some compression standards (e.g., Moving Picture Experts Group MPEG-2 (ISO/IEC 13818)) only 8×8 blocks are processed. In others (e.g., Microsoft Windows Media® 9) some 8×8 blocks may be replaced by two 4×8 sub-blocks, two 8×4 sub-blocks, or four 4×4 sub-blocks.
In some embodiments of the present invention, a transpose buffer may be used in connection with video compression and decompression. The transpose buffer may be written to and read from in connection with one-dimensional compression transforms performed in sequence. The transpose buffer may be managed to most effectively and efficiently buffer the compression information in some embodiments. Although in general the transpose buffer is an ordinary 64-word RAM with linear addressing, it is convenient to think of the RAM locations as occupying positions in a two-dimensional array as shown in
Consider the case in which a series of 8×8 blocks is to be processed. The first block may be written column-wise and read row-wise. If the second block is also written column-wise, its first column cannot be written until 57 words of the first block have been read (the first seven rows and the first word of the last row). This imposes a serious limitation on processing throughput. However, it makes no difference whether a block is written column-wise or row-wise, so long as it is then read row-wise or column-wise, respectively. The second block may therefore be written row-wise and read column-wise, in which case its first row may be written after only eight words of the first block have been read. This may result in a very substantial throughput improvement in some embodiments.
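The figures of 57 words and eight words can be checked with a small simulation. The following is only an illustrative sketch, not part of any described embodiment; the function names are hypothetical, and the buffer is modeled simply as an 8×8 grid of (row, column) positions.

```python
N = 8  # the buffer holds one 8x8 block (64 words)

def vector_cells(order, index):
    """Return the (row, col) positions of one buffer column or one buffer row."""
    if order == "column":
        return [(r, index) for r in range(N)]
    return [(index, c) for c in range(N)]

def reads_before_first_write(read_order, write_order):
    """Count how many words of the previous block must be read before the
    first vector of the next block can be written without overwriting
    unread data."""
    # Order in which the words of the previous block leave the buffer.
    read_sequence = [cell for i in range(N) for cell in vector_cells(read_order, i)]
    # Positions that the first vector of the next block will occupy.
    needed = set(vector_cells(write_order, 0))
    for count, cell in enumerate(read_sequence, start=1):
        needed.discard(cell)
        if not needed:
            return count
    raise RuntimeError("unreachable: every position is read eventually")

# Previous block read row-wise, next block written column-wise (no toggle).
print(reads_before_first_write("row", "column"))  # 57
# Previous block read row-wise, next block written row-wise (toggled).
print(reads_before_first_write("row", "row"))     # 8
```

In this model, toggling the write order lets the next block begin as soon as the first vector of the previous block has been read.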
A complication arises when a block is divided into a set of sub-blocks. There is no unique optimal order for writing and reading in this case, but following some general principles may maximize throughput and simplify addressing in some cases:
1) Write and read order may be toggled from column-wise to row-wise or vice versa after a complete block (not a sub-block) has been written or read.
2) When writing column-wise, each sub-block may completely fill n rows, where n=2 for 4×4 sub-blocks and 4 for 4×8 or 8×4 sub-blocks. Similarly, when writing row-wise, each sub-block may completely fill n columns, where n=2 or 4.
3) When writing column-wise, addressing may be such that the first vector(s) (one or two) that will be read occupy the first buffer row of the sub-block. For example, a 4×4 sub-block can be written to the following addresses:
Note that the first two vectors to be read occupy addresses 0, 8, 10, 18 and 20, 28, 30, 38 (hexadecimal), which together make up the first row of the buffer. This row is thus cleared as quickly as possible for the next block. Similarly, when writing row-wise, addressing may be such that the first vector(s) (one or two) that are read occupy the first buffer column of the sub-block.
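The relationship between these addresses and the first buffer row can be illustrated as follows. This sketch rests on two assumptions that are not stated explicitly in the text: that the addresses quoted above are hexadecimal, and that the linear address of a buffer position is 8 × column + row. Both are consistent with the quoted addresses forming one buffer row. The function names are hypothetical.

```python
N = 8  # 8x8 buffer, 64 words

def address(row, col):
    """Assumed linear address of buffer position (row, col): column-major."""
    return N * col + row

def row_addresses(row):
    """Addresses of one buffer row, from the first column to the last."""
    return [address(row, c) for c in range(N)]

first_row = row_addresses(0)
# A 4x4 sub-block has 4-word vectors, so one buffer row holds two of them.
first_vector, second_vector = first_row[:4], first_row[4:]

print([format(a, "x") for a in first_vector])   # ['0', '8', '10', '18']
print([format(a, "x") for a in second_vector])  # ['20', '28', '30', '38']
```

Under these assumptions, reading the first two 4-word vectors frees the entire first buffer row for the next block.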
Referring to
The Windows Media® 9 transform is a two-dimensional transform similar in principle to a discrete cosine transform (DCT). Like the DCT, the Windows Media® 9 inverse transform is separable, meaning that the Windows Media® 9 inverse transform can be decomposed into two one-dimensional (1D) transforms performed in sequence.
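Separability can be illustrated with a short sketch. Because the Windows Media® 9 transform coefficients are not given here, the example uses an orthonormal DCT-II matrix purely as a stand-in 1D transform; all function names are hypothetical. It compares two 1D passes separated by a transpose against a direct two-dimensional computation.

```python
import math

N = 8

def dct_matrix(n=N):
    """Orthonormal DCT-II matrix, used here only as a stand-in 1D transform."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def transform_1d(matrix, vector):
    """Apply the 1D transform to a single 8-word vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def transpose(block):
    return [list(col) for col in zip(*block)]

def transform_2d_separable(matrix, block):
    """Two 1D passes; the transposes stand in for the transpose buffer."""
    first_pass = [transform_1d(matrix, col) for col in transpose(block)]
    return [transform_1d(matrix, row) for row in transpose(first_pass)]

def transform_2d_direct(matrix, block):
    """Direct evaluation of Y[k][l] = sum over r, c of C[k][r] * X[r][c] * C[l][c]."""
    n = len(block)
    return [[sum(matrix[k][r] * block[r][c] * matrix[l][c]
                 for r in range(n) for c in range(n))
             for l in range(n)] for k in range(n)]

C = dct_matrix()
X = [[(3 * r + 5 * c) % 17 for c in range(N)] for r in range(N)]  # arbitrary test data
separable = transform_2d_separable(C, X)
direct = transform_2d_direct(C, X)
max_difference = max(abs(a - b)
                     for row_a, row_b in zip(separable, direct)
                     for a, b in zip(row_a, row_b))
print(max_difference < 1e-9)  # True
```

The direct form needs 64 multiply-accumulate steps per output word, whereas each 1D pass needs only eight, which is one reason the separable form is attractive.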
Referring to
The video codec 28 may handle video processing in general, including compression and decompression. The video codec 28 may include a Moving Picture Experts Group (MPEG) and Windows Media® 9 (WM9) coder and decoder 30 (see
In some embodiments, the system 10 may be a set top box. The present invention is in no way limited to the particular architecture described above and shown in
Referring to
More particularly, the current 8×8 pel microblock 60 and a prediction 62 are received and their difference determined at 65 for motion compensation. The transform engine 64 then works in two passes. In the first pass, the transform engine 64 operates column-wise and writes the results of the first one-dimensional operation into the transpose buffer 68 via the demultiplexer 66. Then, the transform engine 64 fetches the columns from the transpose buffer 68 to do the second pass. Control logic or software 38 within the transpose buffer 68 may enable matrix transpose operations between the first and second passes. Then, the results from the second pass are passed on to the quantization and coding and decoding stages 76. A compressed block may result. Also, a compressed block may be received and decompressed by inverse quantization 70, demultiplexing 72, and the inverse transform engine 74.
Referring to
While an embodiment using a Windows Media® 9 transform is described, other transforms may also be used, including discrete cosine transforms and the like, such as Moving Picture Experts Group (ISO/IEC 13818) and Society of Motion Picture and Television Engineers (SMPTE) VC-1 transforms.
Referring to
A check at diamond 90 determines whether the last word of the block has been written. If so, a check at diamond 92 determines whether that block is the last block to be written. If it is not the last block, the write order is toggled from column-wise to row-wise or vice versa, as indicated in block 94. If it is the last block, the process ends.
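The checks at diamonds 90 and 92 and the toggle at block 94 may be sketched in software as follows. This is a simplified illustration under stated assumptions: it handles whole 8×8 blocks only (no sub-blocks), it assumes the first block is written column-wise, and the generator name and its parameters are hypothetical.

```python
N = 8  # words per vector and vectors per block

def write_positions(num_blocks, first_order="column"):
    """Yield (block, row, col) for every word written to the buffer,
    toggling the write order after each complete block."""
    order = first_order
    for block in range(num_blocks):
        for vector in range(N):
            for word in range(N):
                if order == "column":
                    yield block, word, vector   # fill one buffer column at a time
                else:
                    yield block, vector, word   # fill one buffer row at a time
        # The last word of the block has now been written (diamond 90).
        if block == num_blocks - 1:             # last block? (diamond 92)
            return
        # Toggle the write order for the next block (block 94).
        order = "row" if order == "column" else "column"

positions = list(write_positions(2))
print(positions[:3])     # [(0, 0, 0), (0, 1, 0), (0, 2, 0)]
print(positions[64:67])  # [(1, 0, 0), (1, 0, 1), (1, 0, 2)]
```

The first block walks down buffer column 0, while the second block walks along buffer row 0, reflecting the toggled write order.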
Referring to
References throughout this specification to “one embodiment” or “an embodiment” mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one implementation encompassed within the present invention. Thus, appearances of the phrase “one embodiment” or “in an embodiment” are not necessarily referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be instituted in suitable forms other than the particular embodiment illustrated, and all such forms may be encompassed within the claims of the present application.
While the present invention has been described with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of the present invention.