The present invention relates in general to video encoding and decoding.
Digital video streams can represent video using a sequence of frames (i.e., still images) that are encoded using planes. An increasing number of applications today make use of digital video compression. Reducing the size of the encoded bitstream reduces the bandwidth required to transmit, and the space required to store, a digital video stream prior to decoding.
Disclosed herein are implementations of systems, methods, and apparatuses for decoding a video signal. One aspect of the disclosed implementations is a method for decoding a video stream having a plurality of frames that include a plurality of blocks. The method includes decoding a first block of the plurality of blocks, wherein the first block is indicative of data associated with a first plane of the video stream, and wherein the first block is at least partially spatially coextensive with a second block of the plurality of blocks that is indicative of data associated with a second plane of the video stream, determining, using a computing device, a first lookup table based on values of spatially coextensive pixels of the first and second planes that are peripheral to the first block and the second block, generating a predicted second block using the first block and the first lookup table, and decoding the second block using the predicted second block.
Another aspect of the disclosed implementations is a method for encoding a video stream having a plurality of frames that include a plurality of blocks. The method includes determining, using a computing device, a first lookup table based on values of spatially coextensive pixels of first and second planes of the video stream that are peripheral to a first block of the plurality of blocks and a second block of the plurality of blocks, wherein the first block is indicative of data associated with the first plane, the second block is indicative of data associated with the second plane, and wherein the first block is at least partially spatially coextensive with the second block, generating a predicted second block using the first block and the first lookup table, and encoding the second block using the predicted second block.
Another aspect of the disclosed implementations is an apparatus for decoding a frame in a video stream having a plurality of frames that include a plurality of blocks. The apparatus includes a memory and a processor configured to execute instructions stored in the memory to decode a first block of the plurality of blocks, wherein the first block is indicative of data associated with a first plane of the video stream, and wherein the first block is at least partially spatially coextensive with a second block of the plurality of blocks that is indicative of data associated with a second plane of the video stream, determine a first lookup table based on values of spatially coextensive pixels that are peripheral to the first block and the second block, generate a predicted second block using the first block and the first lookup table, and decode the second block using the predicted second block.
These and other implementations will be described in additional detail hereafter.
The description herein makes reference to the accompanying drawings wherein like reference numerals refer to like parts throughout the several views, and wherein:
Digital video is used for various purposes including, for example, remote business meetings via video conferencing, high definition video entertainment, video advertisements, and sharing of user-generated videos. As technology is evolving, users have higher expectations for video quality and expect high resolution video even when transmitted over communications channels having limited bandwidth.
Digital video streams can be encoded in various formats, such as VP8 and H.264, including present and future versions thereof. H.264 is also known as MPEG-4 Part 10 or MPEG-4 AVC (formally, ISO/IEC 14496-10).
A frame of a digital video stream can be represented using a plurality of planes, where each plane can represent a component of the frame's video data. Each plane can be composed of blocks containing pixels arranged as rectangular arrays, for example blocks of 4×4, 8×8, 16×16, or 32×32 pixels. For example, a video stream can be represented by frames having three planes containing the red, blue and green (RGB) color components. Each plane contains blocks representing one of the three color components. Blocks that represent different components of the same spatial location in the frame are referred to as being “spatially coextensive.” In another example, a color video stream can be represented by frames having three planes containing a luminance plane (Y), a first chrominance plane (U) and a second chrominance plane (V). Other color encodings can also be utilized. For example, some video encodings use an alpha or transparency plane along with color planes. In some examples, multi-plane color video representations can be created by transforming one representation into another. For example, an RGB frame can be transformed into a YUV frame by a linear transformation applied to the pixels of the planes.
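For instance (an illustrative sketch using the classic BT.601 coefficients; other encodings use different constants), such a linear transformation can compute the luminance plane as a weighted sum of the color planes and the chrominance planes as scaled color differences:

$$Y = 0.299\,R + 0.587\,G + 0.114\,B, \qquad U = 0.492\,(B - Y), \qquad V = 0.877\,(R - Y)$$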
One step in video encoding can include prediction. Prediction can include inter-plane prediction, where the contents of a second block of pixels from a second plane can be predicted using pixels peripheral to the second block and spatially coextensive pixels peripheral to a first block from a first plane. The peripheral pixels are used to form a lookup table that can transform the pixels of the first block into a predicted block approximating the pixel values of the second block. The predicted block can be subtracted from the contents of the second block to form a residual block. The residual block can then be transformed and further encoded to be included in an encoded video bitstream. Blocks to be used for prediction can be encoded and decoded by the encoder prior to being used for prediction in order to more closely match the pixel values that will be available during the decoding process. A block encoded using prediction can be represented by fewer bits in the encoded video bitstream than one encoded without prediction, and can thereby save transmission bandwidth and/or storage space while maintaining similar video quality.
Predicting a second block in a second plane using pixels from a spatially coextensive first block in a first plane of the same frame can be improved by using ways to transform the pixels of the first block into values that more closely approximate the pixels in the second block. Aspects of disclosed implementations can perform this transformation by constructing a lookup table using spatially coextensive pixels peripheral to the first and second blocks. Other ways to effect this transformation could include using linear regression, for example, to calculate a linear relationship between the pixels of the two planes and applying the linear relationship to the pixels of the first block to predict the pixels of the second block. Other ways to implement this transformation can include approximating the transform on a per-pixel basis based on minimizing the residual values following prediction, for example.
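For instance, the linear-regression variant could be sketched as follows (an illustrative sketch only, not the lookup-table method of the disclosed implementations; the function name, arguments and least-squares formulation are assumptions for illustration):

```c
#include <stddef.h>
#include <stdint.h>

/* Fit u ~ a*y + b by ordinary least squares over n > 0 spatially
   coextensive peripheral pixel pairs, then apply the fit to the
   first-plane block to form the prediction block. */
static void predict_by_regression(const uint8_t *y_peri, const uint8_t *u_peri,
                                  size_t n, const uint8_t *y_blk,
                                  uint8_t *u_pred, size_t blk_px) {
    double sy = 0, su = 0, syy = 0, syu = 0;
    for (size_t i = 0; i < n; i++) {
        sy  += y_peri[i];
        su  += u_peri[i];
        syy += (double)y_peri[i] * y_peri[i];
        syu += (double)y_peri[i] * u_peri[i];
    }
    double denom = n * syy - sy * sy;
    double a = denom != 0 ? (n * syu - sy * su) / denom : 0;
    double b = (su - a * sy) / n;      /* falls back to the mean U if a == 0 */
    for (size_t i = 0; i < blk_px; i++) {
        double p = a * y_blk[i] + b;   /* clamp to the 8-bit sample range */
        u_pred[i] = p < 0 ? 0 : p > 255 ? 255 : (uint8_t)(p + 0.5);
    }
}
```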
The decoding of an encoded video bitstream by a decoder can include inverse transforming data for a second block from a second plane to form a residual block. The second block can then be reconstructed using the residual values and a predicted block formed from a spatially coextensive first block from a first plane and a lookup table formed from pixels peripheral to the first and second blocks. The predicted block formed from the first block from the first plane is added to the residual block from the second plane to reconstruct the contents of the second block. Decoding the second block in this manner uses the same data used to predict the block during encoding. Blocks in video streams can be a 4×4 array of pixels, an 8×8 array of pixels, a 16×16 array of pixels, a 32×32 array of pixels, or any other grouping of pixels.
In a YUV representation, the three planes can be sampled at different sampling rates, wherein, for example, the U and/or V planes can contain less data than the Y plane. In an example of this type of representation, the U and V planes can each contain one quarter of the data in the Y plane (e.g., one quarter of the number of pixels).
The disclosed implementations can utilize blocks associated with one or more planes to predict a block associated with another plane. For example, the disclosed implementations can utilize blocks associated with Y to predict blocks associated with U and V, blocks associated with U to predict blocks associated with Y and V, blocks associated with U and V together to predict blocks associated with Y, or can predict blocks using other combinations of planes.
Aspects of disclosed implementations can perform prediction using data associated with a first block in a first plane, along with a lookup table formed from spatially coextensive pixels from the first and second planes, to predict a second block in a second plane of the same frame. A lookup table can be constructed using pixels peripheral to the block to be predicted in the second plane and spatially coextensive pixels peripheral to the corresponding block associated with the first plane of the frame. Spatially coextensive can refer to pixels having spatially corresponding locations in separate planes. The lookup table can be used to improve the prediction of a block by mapping pixel values from the predicting plane to the plane to be predicted. Pixels from the block associated with the first plane at the location corresponding to the block to be predicted in the second plane can be transformed using the lookup table to form a prediction block. The prediction block formed from the block in the first plane can be subtracted from the block in the second plane to form a residual block. The residual block can be further encoded to be included in the encoded video bitstream. Encoding the predicted residual block can result in fewer bits in the encoded video bitstream than encoding the block without prediction.
The data used to form lookup tables and prediction blocks can be encoded and decoded by the encoder prior to being used in order to more closely match the data to be used by the decoder to perform prediction and thereby improve the accuracy of the result. Aspects of disclosed implementations can compensate for differences in sampling rate between planes by replicating or averaging pixels to up- or down-sample data to match sampling rates. A filter can be applied to up-sampled pixels to improve results.
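Such sampling-rate compensation could be sketched as follows (a minimal sketch assuming a chroma plane with half the luma resolution in each dimension; the function and parameter names are illustrative):

```c
#include <stdint.h>

/* Down-sample a w×h plane into a (w/2)×(h/2) plane by averaging
   each 2x2 group of pixels. */
static void downsample_2x2_avg(const uint8_t *src, int w, int h, int stride,
                               uint8_t *dst, int dst_stride) {
    for (int y = 0; y < h / 2; y++)
        for (int x = 0; x < w / 2; x++) {
            int s = src[2*y*stride + 2*x]     + src[2*y*stride + 2*x + 1]
                  + src[(2*y+1)*stride + 2*x] + src[(2*y+1)*stride + 2*x + 1];
            dst[y*dst_stride + x] = (uint8_t)((s + 2) >> 2); /* rounded mean */
        }
}

/* Up-sample a w×h plane into a (2w)×(2h) plane by replicating each
   pixel into a 2x2 group; a smoothing filter can then be applied. */
static void upsample_2x2_rep(const uint8_t *src, int w, int h, int stride,
                             uint8_t *dst, int dst_stride) {
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            uint8_t p = src[y*stride + x];
            dst[2*y*dst_stride + 2*x]         = p;
            dst[2*y*dst_stride + 2*x + 1]     = p;
            dst[(2*y+1)*dst_stride + 2*x]     = p;
            dst[(2*y+1)*dst_stride + 2*x + 1] = p;
        }
}
```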
Aspects of the disclosed implementations include decoding blocks using planes in a similar fashion to the encoding implementations described above. Decoded pixels are used to form a lookup table which uses a decoded block from one or more planes to form a prediction block for another plane. A block is partially decoded to form a residual block which is added to the prediction block to form a decoded block.
A network 128 can connect the transmitting station 112 and a receiving station 130 for encoding and decoding of the video stream. Specifically, the video stream can be encoded in the transmitting station 112 and the encoded video stream can be decoded in the receiving station 130. The network 128 can, for example, be the Internet. The network 128 can also be a local area network (LAN), wide area network (WAN), virtual private network (VPN), or any other means of transferring the video stream from the transmitting station 112.
The receiving station 130, in one example, can be a computer having an internal configuration of hardware such as that of the computing device 200 described below.
Other implementations of the encoding and decoding system 100 are possible. For example, an implementation can omit the network 128. In another implementation, a video stream can be encoded and then stored for transmission at a later time to the receiving station 130 or any other device having memory. In one implementation, the receiving station 130 receives (e.g., via network 128, a computer bus, and/or some communication pathway) the encoded video stream and stores the video stream for later decoding.
The CPU 224 in the computing device 200 can be a conventional central processing unit. Alternatively, the CPU 224 can be any other type of device, or multiple devices, capable of manipulating or processing information now-existing or hereafter developed. Although the disclosed implementations can be practiced with a single processor as shown, e.g. CPU 224, advantages in speed and efficiency can be achieved using more than one processor.
The memory 226 in the computing device 200 can be a random access memory device (RAM). Any other suitable type of storage device can be used as the memory 226. The memory 226 can include code and data 227 that is accessed by the CPU 224 using a bus 230. The memory 226 can further include an operating system 232 and application programs 234, the application programs 234 including programs that permit the CPU 224 to perform the methods described here. For example, the application programs 234 can include applications 1 through N which further include a video communication application that performs the methods described here. The computing device 200 can also include a secondary storage 236, which can, for example, be a memory card used with a mobile computing device 200. Because the video communication sessions may contain a significant amount of information, they can be stored in whole or in part in the secondary storage 236 and loaded into the memory 226 as needed for processing.
The computing device 200 can also include one or more output devices, such as the display 228, which can be a touch sensitive display that combines a display with a touch sensitive element that is operable to sense touch inputs. The display 228 can be coupled to the CPU 224 via the bus 230. Other output devices that permit a user to program or otherwise use the computing device 200 can be provided in addition to or as an alternative to the display 228. When the output device is or includes a display, the display can be implemented in various ways, including by a liquid crystal display (LCD) or a cathode-ray tube (CRT) or light emitting diode (LED) display, such as an OLED display.
The computing device 200 can also include or be in communication with an image-sensing device 238, for example a camera, or any other image-sensing device 238 now existing or hereafter developed that can sense the image of a device user operating the computing device 200. The image-sensing device 238 can be positioned such that it is directed toward a device user that is operating the computing device 200. For example, the position and optical axis of the image-sensing device 238 can be configured such that the field of vision includes an area that is directly adjacent to the display 228, from which the display 228 is visible. The image-sensing device 238 can be configured to receive images, for example, of the face of a device user while the device user is operating the computing device 200.
The computing device 200 can also include or be in communication with a sound-sensing device 240, for example a microphone, or any other sound-sensing device now existing or hereafter developed that can sense the sounds made by the device user operating the computing device 200. The sound-sensing device 240 can be positioned such that it is directed toward the device user operating the computing device 200. The sound-sensing device 240 can be configured to receive sounds, for example, speech or other utterances made by the device user while the device user operates the computing device 200.
When the video stream 350 is presented for encoding, each frame 356, including its planes 357, within the video stream 350 can be processed in units of blocks. At the intra/inter prediction stage 472, each block can be encoded using intra-frame prediction (within a single frame), inter-frame prediction (from frame to frame) or inter-plane prediction (from plane to plane within a single frame). In each case, a prediction block can be formed. In the case of intra-prediction, a prediction block can be formed from samples in the current frame that have been previously encoded and reconstructed. In the case of inter-prediction, a prediction block can be formed from samples in one or more previously constructed reference frames. In the case of correlation-based inter-plane prediction, a prediction block can be formed from samples from a plane or planes other than the plane including the block to be predicted.
Next, the prediction block can be subtracted from the current block to produce a residual block. The transform stage 474 transforms the residual block into transform coefficients in, for example, the frequency domain.
The quantization stage 476 converts the transform coefficients into discrete quantum values, which are referred to as quantized transform coefficients or quantization levels. The quantized transform coefficients are then entropy encoded by the entropy encoding stage 478. The entropy-encoded coefficients, together with the information used to decode the block, such as the type of prediction used, motion vectors and quantizer value, are then output to the compressed bitstream 488. The compressed bitstream 488 can be formatted using various techniques, such as variable length coding (VLC) or entropy coding.
The reconstruction path can be used to ensure that both the encoder 470 and a decoder use the same reference data to create prediction blocks. The reconstruction path can perform functions similar to those of the decoding process, including dequantizing the quantized transform coefficients at a dequantization stage 480 and inverse transforming the dequantized transform coefficients at an inverse transform stage 482 to produce a derivative residual block. At a reconstruction stage 484, the prediction block can be added to the derivative residual block to create a reconstructed block, and a loop filtering stage 486 can be applied to the reconstructed block to reduce distortion such as blocking artifacts.
Other variations of the encoder 470 can be used to encode the compressed bitstream 488. For example, a non-transform based encoder 470 can quantize the residual signal directly without the transform stage 474. In another implementation, an encoder 470 can have the quantization stage 476 and the dequantization stage 480 combined into a single stage.
The decoder 500, similar to the reconstruction path of the encoder 470 discussed above, includes in one example the following stages to perform various functions to produce an output video stream 516 from the compressed bitstream 488: an entropy decoding stage 502, a dequantization stage 504, an inverse transform stage 506, an intra/inter prediction stage 508, a reconstruction stage 510, a loop filtering stage 512 and a deblocking filtering stage 514. Other structural variations of the decoder 500 can be used to decode the compressed bitstream 488.
When the compressed bitstream 488 is presented for decoding, the data elements within the compressed bitstream 488 can be decoded by the entropy decoding stage 502 (using, for example, Context Adaptive Binary Arithmetic Decoding) to produce a set of quantized transform coefficients. The dequantization stage 504 dequantizes the quantized transform coefficients, and the inverse transform stage 506 inverse transforms the dequantized transform coefficients to produce a derivative residual that can be identical to that created by the inverse transform stage 482 in the encoder 470. Using header information decoded from the compressed bitstream 488, the decoder 500 can use the intra/inter prediction stage 508 to create the same prediction block as was created in the encoder 470. At the reconstruction stage 510, the prediction block can be added to the derivative residual to create a reconstructed block. The loop filtering stage 512 can be applied to the reconstructed block to reduce blocking artifacts. The deblocking filtering stage 514 can be applied to the reconstructed block to reduce blocking distortion, and the result is output as the output video stream 516.
Other variations of the decoder 500 can be used to decode the compressed bitstream 488. For example, the decoder 500 can produce the output video stream 516 without the deblocking filtering stage 514.
At step 602, a video stream is received by a computing device, such as a transmitting station 112 that implements operation 600. Video data can be received in any number of ways, such as by receiving the video data over a network, over a cable, or by reading the video data from a primary memory or other storage device, including a disk drive or removable media such as a CompactFlash (CF) card, Secure Digital (SD) card, or the like.
At step 604, blocks are encoded. The encoding uses pixels in a second plane peripheral to a block to be predicted, in combination with spatially coextensive pixels in a first plane peripheral to the corresponding block, to determine a lookup table with which to form the prediction. The relationship between these blocks and the pixels used from them to form the lookup tables is described below.
At step 606, the first blocks are decoded, for example by processing with stages 480, 482, 484 and 486 of the encoder 470, and returned to the intra/inter prediction stage 472. At step 608, a lookup table is determined. The term “determine” as used herein means to select, construct, identify, specify or otherwise determine in any manner whatsoever. The lookup table can be determined by selecting first pixel values from pixels peripheral to a block in the first plane. The first pixel values can be used as indices into an empty array. Second pixel values are selected from the second plane from pixels peripheral to the block to be predicted. The second pixel values can be inserted into the lookup table at the indices formed by the spatially coextensive first pixel values. The lookup table so determined can be used to transform pixels from a block associated with the first plane to predict the pixels associated with the second plane.
The pixels used to form the lookup table at step 608 can be taken from spatially coextensive pixels peripheral to the block used to form the prediction and the block to be predicted. A row and column of pixels adjacent to the top and left edges of the blocks can be used to form the lookup table. In some cases, more than one row or column of pixels can be used to form more values in the lookup table. More detail regarding which peripheral pixels can be used to form a lookup table is given below.
At step 610, the lookup table is used to determine a prediction block for the block from the second plane using the corresponding block from the first plane. Pixels selected from the block in the first plane are used as indices into the lookup table. The values at those indices can then be inserted into a block at the spatially coextensive positions corresponding to the locations of the selected pixels to form the predicted block. More detail regarding determination of the prediction block is given below.
Since the pixels of the block used to determine the prediction block can assume values other than the values of pixels used to populate the lookup table, it is possible that a pixel value can correspond to an index of the table associated with an empty location. For example, in some implementations, 8-bit pixels can have pixel values between 0 and 255. In the case of 8×8 blocks, when a lookup table is determined using one row of pixels peripheral to the left and top edges of a block, for example, a maximum of 15 pixel values would be used as indices into the table. If the current index points to an empty entry, the lookup table can be searched to find the first non-empty entry at an index greater than the current index and the first non-empty entry at an index less than the current index and an entry can be created by linear interpolation between the two values. In this way a prediction block for a block in a second plane can be determined from a block in a first plane.
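This search-and-interpolate step can be sketched in C as follows (a minimal sketch assuming a 256-entry table val[ ] with occupancy counts cnt[ ], matching the pseudo-code given later in this description; the fallback values for an empty or one-sided table are illustrative assumptions):

```c
/* Return a table value for index idx; if the entry is empty, linearly
   interpolate between the nearest populated entries below and above. */
static int lookup_with_interpolation(const int val[256], const int cnt[256],
                                     int idx) {
    if (cnt[idx] > 0)
        return val[idx];
    int lo = idx, hi = idx;
    while (lo >= 0 && cnt[lo] == 0) lo--;    /* nearest populated below */
    while (hi < 256 && cnt[hi] == 0) hi++;   /* nearest populated above */
    if (lo < 0 && hi >= 256) return 128;     /* empty table: mid-range  */
    if (lo < 0)   return val[hi];            /* nothing below: clamp    */
    if (hi >= 256) return val[lo];           /* nothing above: clamp    */
    /* Linear interpolation weighted by distance to each neighbor. */
    return val[lo] + (val[hi] - val[lo]) * (idx - lo) / (hi - lo);
}
```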
At step 612, the determined prediction block is subtracted from the block to be predicted. Prediction blocks determined in this fashion can be good predictors of a block, since data in planes corresponding to components such as YUV representations can be highly correlated as long as differences in gain and offset are compensated for. Subtracting the prediction block from the block to be predicted results in a residual block. Good prediction results in a residual block containing small pixel values with low spatial frequency, which can be represented by a small number of bits following further encoding. At step 614, the residual block can be further encoded by transforming the residual as described above in relation to stage 474, quantizing the transformed residual block, for example by stage 476, and entropy encoding it (stage 478).
For simplicity of explanation, implementations of operation 600 are depicted and described as a series of steps. However, steps in accordance with this disclosure can occur in various orders and/or concurrently.
For example, a transformation other than a lookup table can be utilized at step 610 to determine a prediction block for the block from the second plane using the corresponding block from the first plane, and/or step 608 can be omitted. The transformation could, instead of or in addition to a lookup table, include using linear regression, for example, to calculate a linear relationship between the pixels of the two planes and applying the linear relationship to the pixels of the first block to predict the pixels of the second block. Other ways to implement this transformation can include approximating the transform on a per-pixel basis based on minimizing the residual values following prediction, for example.
At step 702, a bitstream encoded according to the disclosed implementations described above is received by a decoder implementing operation 700, for example the decoder 500.
At step 704, blocks are decoded by the decoder implementing operation 700. These include a block from a first plane to be used to form the prediction block, along with blocks containing pixels from the first and second planes peripheral to the block to be predicted and to the corresponding first-plane block; these peripheral pixels are used to determine a lookup table. At step 706, a lookup table is determined as described above in relation to step 608.
At step 708, the intra/inter prediction stage 508 determines the prediction block using a decoded block from the first plane and the lookup table determined at step 706. At step 710, the residual block corresponding to the block to be predicted is decoded, for example using the entropy decoding stage 502, the dequantization stage 504 and the inverse transform stage 506, and is passed to the reconstruction stage 510. At step 712, the prediction block is passed from the intra/inter prediction stage 508 to the reconstruction stage 510, where it is added to the decoded residual block to form a reconstructed block.
For simplicity of explanation, implementations of operation 700 are depicted and described as a series of steps. However, steps in accordance with this disclosure can occur in various orders and/or concurrently.
For example, a transformation other than a lookup table can be utilized at step 708 to determine the prediction block using a decoded block from a first plane, and/or step 706 can be omitted. The transformation could, instead of or in addition to a lookup table, include using linear regression, for example, to calculate a linear relationship between the pixels of the two planes and applying the linear relationship to the pixels of the first block to predict the pixels of the second block. Other ways to implement this transformation can include approximating the transform on a per-pixel basis based on minimizing the residual values following prediction, for example.
Pixels in the rows above a block and the columns to its left, plus the top-left corner pixel, are examples of pixels 804 peripheral to blocks in a plane that can be used to form the lookup table used to form the prediction block. As described above, values of the pixels of another plane that are spatially coextensive with pixels 804 peripheral to a block 802 of a plane 800 are used as indices into an array, where the values stored in the array can be the values of the pixels 804 at the corresponding positions peripheral to the block 802 in the plane 800 to be predicted. Pixels 804 can be selected from pixels peripheral to the top and left edges of a block 802, for example, in implementations where blocks are processed (e.g., encoded and/or decoded) in raster scan order starting at the top left corner of the plane. Pixels 804 will have already been encoded and decoded prior to being used to form a prediction block.
To create a lookup table relating two planes (e.g., to predict the U plane from the Y plane), a computing device can first iterate over the edge pixels of blocks that have previously been decoded, i.e., pixels for which both the U and the Y values are already known. These pixel values can be used to populate a count/value lookup table that keeps track of the U component values observed at each edge pixel for a given Y component value. The examples here assume 8-bit values, i.e., a range of [0, 255]. The following pseudo-code illustrates this process:
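(A C-style sketch; the block width bw and the y_ptr/u_ptr pointers with strides y_stride/u_stride, addressing the row immediately above the block, are illustrative conventions rather than names from a particular codebase.)

```c
int val[256] = {0};  /* accumulator: sum of U values seen for each Y value */
int cnt[256] = {0};  /* count: number of times each Y value occurred       */

/* Iterate over the previously decoded pixels just above the block,
   accumulating the spatially coextensive U value for each Y value. */
for (int x = 0; x < bw; x++) {
    int y_pixel = y_ptr[x - y_stride];  /* Y pixel above the block */
    int u_pixel = u_ptr[x - u_stride];  /* co-located U pixel      */
    val[y_pixel] += u_pixel;
    cnt[y_pixel] += 1;
}
```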
The above pseudo-code accesses pixels peripheral to the top edge of a block of the Y plane and uses the pixel values of the peripheral Y pixels as indices into the arrays val[ ] and cnt[ ], adding the pixel values of spatially coextensive pixels from the U plane into the val[ ] array and incrementing the count in the cnt[ ] array. Additional pseudo-code (not shown) can similarly process pixels peripheral to the left edge and the top-left corner pixel of the block. Some implementations can use more than one row or column of pixels peripheral to the block (e.g., pixels not immediately adjacent to the block). For example, a top/left edge border 2, 3 or 4 pixels thick can be used.
After the val[ ] and cnt[ ] arrays are populated, the value accumulator can be divided by the number of times each value occurred, such as shown by the following pseudo-code:
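(Continuing the same sketch; the rounding before division is an illustrative choice.)

```c
/* Convert each accumulated sum into an average U value per Y value. */
for (int i = 0; i < 256; i++) {
    if (cnt[i] > 0)
        val[i] = (val[i] + cnt[i] / 2) / cnt[i];  /* rounded average */
}
```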
This final step provides a lookup table (val[ ]) representing the average relationship between U pixel values and Y pixel values, derived from the pixels peripheral to the top edge, left edge and top-left corner of the block.
The following pseudo-code illustrates using a block from the Y plane to predict a block from the U plane. The routine iterates over the current block's Y pixels, which have been previously encoded and decoded, and uses them as predictors for the U values, with the lookup table generated above serving as the prediction method. In this example, the Y and U planes have the same spatial sampling rate.
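(Continuing the sketch: the loop below assumes a bw×bh block, a prediction buffer u_pred with stride u_stride, and the find_nearest_table_value( ) helper described next; the names are illustrative.)

```c
/* Form the prediction block: map each previously reconstructed Y pixel
   of the current block through the lookup table built above. */
for (int row = 0; row < bh; row++) {
    for (int col = 0; col < bw; col++) {
        int y_pixel = y_ptr[row * y_stride + col];
        u_pred[row * u_stride + col] =
            find_nearest_table_value(val, cnt, y_pixel);
    }
}
```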
find_nearest_table_value( ) is a function that finds the nearest populated entry in the lookup table, i.e., the nearest index for which at least one peripheral Y pixel had that value (cnt[index] is non-zero), and returns the corresponding predicted U pixel value for that position.
Aspects of the disclosed implementations can find the nearest two values with a non-zero counter and form a weighted interpolation between them. More complex techniques for extracting results from a sparsely populated table, such as curve fitting, can be used to calculate the value from the lookup table. An aspect of the disclosed implementations can fill in the entire table by interpolating, and possibly filtering, values for all positions in the table. The table can also be populated using more than one adjacent row and column from the planes to obtain more values.
The inter-plane prediction technique uses correlation between two different planes in previously decoded data to predict the values of one plane from the values of another plane in the current block. This more accurately predicts the signal, decreasing the number of residual coefficients that have to be coded in the bitstream and thus saving bandwidth and/or increasing video quality. Prediction is not limited from or to a single or specific plane. The Y plane can be used to predict both U and V, or U to predict Y and V, or U and V together to predict Y, for example.
The words “example” or “exemplary” are used herein to mean serving as an example, instance, or illustration. Any aspect or design described herein as “example” or “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects or designs. Rather, use of the words “example” or “exemplary” is intended to present concepts in a concrete fashion. As used in this application, the term “or” is intended to mean an inclusive “or” rather than an exclusive “or”. That is, unless specified otherwise, or clear from context, “X includes A or B” is intended to mean any of the natural inclusive permutations. That is, if X includes A; X includes B; or X includes both A and B, then “X includes A or B” is satisfied under any of the foregoing instances. In addition, the articles “a” and “an” as used in this application and the appended claims should generally be construed to mean “one or more” unless specified otherwise or clear from context to be directed to a singular form. Moreover, use of the term “an embodiment” or “one embodiment” or “an implementation” or “one implementation” throughout is not intended to mean the same embodiment or implementation unless described as such.
The implementations of encoding and decoding described above illustrate some exemplary encoding and decoding techniques. However, it is to be understood that encoding and decoding, as those terms are used in the claims, could mean compression, decompression, transformation, or any other processing or change of data.
The implementations of the transmitting station 112 and/or the receiving station 130 (and the algorithms, methods, instructions, etc. stored thereon and/or executed thereby) can be realized in hardware, software, or any combination thereof. The hardware can include, for example, computers, intellectual property (IP) cores, application-specific integrated circuits (ASICs), programmable logic arrays, optical processors, programmable logic controllers, microcode, microcontrollers, servers, microprocessors, digital signal processors or any other suitable circuit. In the claims, the term “processor” should be understood as encompassing any of the foregoing hardware, either singly or in combination. The terms “signal” and “data” are used interchangeably. Further, portions of the transmitting station 112 and the receiving station 130 do not necessarily have to be implemented in the same manner.
Further, in one implementation, for example, the transmitting station 112 or the receiving station 130 can be implemented using a general purpose computer or general purpose processor with a computer program that, when executed, carries out any of the respective methods, algorithms and/or instructions described herein. In addition or alternatively, for example, a special purpose computer/processor can be utilized which can contain other hardware for carrying out any of the methods, algorithms, or instructions described herein.
The transmitting station 112 and receiving station 130 can, for example, be implemented on computers in a video transmission system. Alternatively, the transmitting station 112 can be implemented on a server and the receiving station 130 can be implemented on a device separate from the server, such as a hand-held communications device. In this instance, the transmitting station 112 can encode content using an encoder 470 into an encoded video signal and transmit the encoded video signal to the communications device. In turn, the communications device can then decode the encoded video signal using a decoder 500. Alternatively, the communications device can decode content stored locally on the communications device, for example, content that was not transmitted by the transmitting station 112. Other suitable transmitting station 112 and receiving station 130 implementation schemes are available. For example, the receiving station 130 can be a generally stationary personal computer rather than a portable communications device and/or a device including an encoder 470 may also include a decoder 500.
Further, all or a portion of implementations of the present invention can take the form of a computer program product accessible from, for example, a tangible computer-usable or computer-readable medium. A computer-usable or computer-readable medium can be any device that can, for example, tangibly contain, store, communicate, or transport the program for use by or in connection with any processor. The medium can be, for example, an electronic, magnetic, optical, electromagnetic, or a semiconductor device. Other suitable mediums are also available.
The above-described implementations have been described in order to allow easy understanding of the present invention and do not limit the present invention. On the contrary, the invention is intended to cover various modifications and equivalent arrangements included within the scope of the appended claims, which scope is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structure as is permitted under the law.