This application claims priority from Korean Patent Application No. 10-2005-0031114 filed on Apr. 14, 2005 in the Korean Intellectual Property Office, the disclosure of which is incorporated herein by reference in its entirety.
1. Field of the Invention
The present invention relates to a video encoding/decoding apparatus and method capable of minimizing a delay in random access, and more particularly, to a video encoding/decoding apparatus and method that can reduce the amount of time taken to display a new frame after a channel switch when receiving a video streaming service or reproducing a compressed moving image.
2. Description of the Related Art
Three operations are used in current video compression standards such as MPEG-2, MPEG-4, H.263, and H.264 in order to enhance data compression efficiency.
First, the red, green, and blue (RGB) components of an input color image are converted into YCbCr data, that is, a luminance component Y along with two color difference components Cb and Cr.
Second, spatial redundancy is eliminated from a single picture through discrete cosine transformation (DCT), quantization (Q), and variable length coding (VLC).
Third, temporal redundancy of a plurality of consecutive frames is eliminated based on the assumption that parts of a plurality of temporally consecutive frames are likely to be redundant. The elimination of temporal redundancy of a plurality of consecutive frames may be carried out using a prediction method, such as differential pulse code modulation (DPCM), based on a motion vector obtained from motion estimation.
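The differential coding idea behind the third operation can be sketched as follows. This is only a minimal illustration, assuming a single pixel value tracked over consecutive frames with no motion estimation and no quantization; the function names and the initial predictor value of 0 are hypothetical and do not come from any standard.

```python
def dpcm_encode(samples):
    """Return residuals: each sample minus the previous sample (the prediction)."""
    residuals = []
    prediction = 0  # assumed initial predictor value
    for sample in samples:
        residuals.append(sample - prediction)
        prediction = sample  # lossless case: the decoder can rebuild this value exactly
    return residuals


def dpcm_decode(residuals):
    """Rebuild the samples by accumulating the residuals."""
    samples = []
    prediction = 0  # must match the encoder's initial predictor
    for residual in residuals:
        prediction += residual
        samples.append(prediction)
    return samples


# One pixel position observed over five consecutive frames (made-up values).
pixel_over_time = [100, 101, 101, 103, 102]
residuals = dpcm_encode(pixel_over_time)
restored = dpcm_decode(residuals)
```

After the first value, the residuals are small numbers, which is what makes temporally redundant content cheaper to code than the raw samples.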
Image data can be encoded as two separate bitstreams by using two encoding methods. One method is a base layer encoding method in which the image data is down-sampled to one fourth or one sixteenth of its original size and the result of the down-sampling operation is encoded, and the other method is an enhancement layer encoding method in which the image data is encoded by using differences between the image data and image data restored from a base layer bitstream without the need to down-sample the image data.
In order to generate an enhancement layer bitstream, inverse quantization (IQ) and inverse DCT (IDCT) are performed on image data that has been quantized at a base layer, thereby restoring image data to the same size as the original image data. Thereafter, differences between the restored image data and the original image data are calculated. Then, the differences are added to the original image data, and DCT, Q, and VLC are performed on the addition result in the same order as in a base layer encoding method, thereby obtaining an enhancement layer bitstream.
An enhancement layer bitstream is decoded basically in the same manner as a base layer bitstream. Image data restored from a base layer bitstream is up-sampled. Thereafter, image data obtained by performing variable length decoding (VLD), IQ, and IDCT at the enhancement layer level is added to the up-sampling result, thereby restoring the original image data. Because quantization is lossy, the restoration result may not be the same as the original image data. Image data decoded from an enhancement layer bitstream generally has a higher picture quality than image data decoded from a base layer bitstream.
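The relationship between the two layers can be illustrated with a simplified sketch. It assumes a one-dimensional row of pixel values, down-sampling by keeping every second sample, up-sampling by repetition, and an enhancement layer that carries the plain difference between the original and the up-sampled base layer. DCT, Q, and VLC are omitted, so the reconstruction here is exact, unlike in a real codec. All function names are hypothetical.

```python
def downsample(row, factor=2):
    """Keep every `factor`-th sample (assumed down-sampling filter)."""
    return [row[i] for i in range(0, len(row), factor)]


def upsample(row, factor=2):
    """Repeat each sample `factor` times (assumed up-sampling filter)."""
    return [value for value in row for _ in range(factor)]


def encode_layers(row, factor=2):
    """Split a row into a base layer and an enhancement-layer residual."""
    base = downsample(row, factor)
    restored = upsample(base, factor)[:len(row)]
    enhancement = [o - r for o, r in zip(row, restored)]
    return base, enhancement


def decode_layers(base, enhancement, factor=2):
    """Up-sample the base layer and add the enhancement residual."""
    restored = upsample(base, factor)[:len(enhancement)]
    return [r + e for r, e in zip(restored, enhancement)]


original = [10, 12, 20, 22, 30, 31]  # made-up pixel row
base, enhancement = encode_layers(original)
decoded = decode_layers(base, enhancement)
```

Decoding only `base` and up-sampling it gives a coarser picture; adding `enhancement` recovers the detail that down-sampling removed.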
In a single layer encoding method and a spatial layer encoding method, image data is encoded so that the encoded result begins with an I frame followed by a plurality of P and B frames, thereby reducing the bit rate. If the encoded result consists only of P and B frames, it might not be possible to fully restore the image data when an error occurs therein. In addition, if the encoded result consists only of P and B frames, decoding might not be possible during random access. Therefore, more than one I frame is inserted into the encoded result, and this process is referred to as intra refresh. An intra refresh operation is performed every fifteen frames of the encoded result. A random access delay of up to 0.5 seconds may be created when encoding a moving image with a frame rate of thirty frames per second using an intra refresh method. This random access delay may also be created when broadcasting the moving image or when storing the moving image in a storage device and reproducing the moving image from the storage device.
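The 0.5 second figure follows directly from the I-frame interval and the frame rate: in the worst case, a decoder that tunes in just after an I frame must wait one full interval for the next one. A short sketch of that arithmetic, with a hypothetical function name:

```python
def worst_case_random_access_delay(i_frame_interval, frames_per_second):
    """Longest wait, in seconds, until the next I frame arrives."""
    return i_frame_interval / frames_per_second


# Values taken from the text: an I frame every 15 frames at 30 frames per second.
delay_seconds = worst_case_random_access_delay(15, 30)
```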
Referring to
The present invention provides a video encoding/decoding apparatus and method by which random access delay of a moving image service can be minimized and the bit rate of a bitstream obtained from spatial layer encoding can become regular by setting the I-frame interval of a base layer shorter than the I-frame interval of an enhancement layer.
An aspect of the present invention provides a video encoding apparatus capable of minimizing a random access delay, the video encoding apparatus including an encoding control unit which may set an intra frame (I-frame) interval of a base layer shorter than an I-frame interval of an enhancement layer, a base layer encoding unit which may generate a base layer bitstream by reducing and encoding an original image according to the I-frame intervals set by the encoding control unit, and an enhancement layer encoding unit which may generate an enhancement layer bitstream by decoding an enhancement layer image which is not temporally aligned with the base layer bitstream and referring to a predetermined image obtained by decoding the base layer bitstream and enlarging the decoded result. The video encoding apparatus may further include a transmission unit which may multiplex the base layer bitstream and the enhancement layer bitstream according to the I-frame intervals set by the encoding control unit, or give different priority levels to the base layer bitstream and the enhancement layer bitstream and transmit the two bitstreams according to their priority levels.
Another aspect of the present invention provides a video decoding apparatus capable of minimizing a random access delay including a first base layer decoding unit which may decode a base layer bitstream and enlarge the decoded base layer bitstream to the size of a corresponding original image, an enhancement layer decoding unit which may decode an enhancement layer image which is temporally different from the base layer bitstream by referring to the enlarged result, and a decoding control unit which may control the enlarged result to be reproduced until an I frame of the decoded enhancement layer image is reproduced and control the decoded enhancement layer image to be displayed when the I frame of the decoded enhancement layer image is reproduced. The video decoding apparatus may further include a second base layer decoding unit which may decode a base layer image of a channel other than the channel of the base layer bitstream decoded by the first base layer decoding unit while the first base layer decoding unit decodes the base layer bitstream so that the base layer image decoded by the second base layer decoding unit is displayed within the base layer bitstream decoded by the first base layer decoding unit.
Another aspect of the present invention provides a video encoding method capable of minimizing a random access delay including setting an I-frame interval of a base layer shorter than an I-frame interval of an enhancement layer, generating a base layer bitstream by reducing and encoding an original image according to the I-frame intervals of the base layer and the enhancement layer, and generating an enhancement layer bitstream by decoding an enhancement layer image which is temporally different from the base layer bitstream and referring to a predetermined image obtained by decoding the base layer bitstream and enlarging the decoded result. Preferably, the video encoding method further includes transmitting the base layer bitstream and the enhancement layer bitstream to a decoder side by multiplexing the base layer bitstream and the enhancement layer bitstream according to the set I-frame intervals or giving different priority levels thereto.
According to yet another aspect of the present invention, there is provided a video decoding method capable of minimizing a random access delay including decoding a base layer bitstream and enlarging the decoded base layer bitstream to the size of a corresponding original image, decoding an enhancement layer image which is temporally different from the base layer bitstream by referring to the enlarged result, and controlling the enlarged result to be reproduced until an I frame of the decoded enhancement layer image is reproduced and controlling the decoded enhancement layer image to be displayed when the I frame of the decoded enhancement layer image is reproduced. Preferably, the video decoding method further includes decoding a base layer image of a channel other than the current channel of the base layer bitstream so that the base layer image is displayed within the base layer bitstream.
The above and other features and advantages of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings in which:
The present invention will now be described more fully with reference to the accompanying drawings, in which exemplary embodiments of the invention are shown.
A video encoding method according to an exemplary embodiment of the present invention is based on the principles of the conventional spatial layer encoding method described above with reference to
Referring to
The encoding control unit 540 sets the I-frame intervals of the base layer and the enhancement layer so that an I frame of the base layer and a corresponding I frame of the enhancement layer are temporally different. In general, a bit rate ratio among I, P, and B frames is about 8:3:2. That is, the bit rate for I frames may be much higher than the bit rate for P or B frames. Thus, if I frames of the base layer and the enhancement layer are located at the same position on the time axis, the bit rate at the position where the I frames coexist may become excessively high. However, in exemplary embodiments of the present invention, the I-frame intervals of the base layer and the enhancement layer are set so that an I frame of the base layer and a corresponding I frame of the enhancement layer are temporally different.
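The effect of placing the I frames of the two layers at different times can be illustrated with a rough sketch. It assumes relative frame sizes of 8 for I frames and 3 for P frames (from the 8:3:2 ratio, with B frames left out for brevity), the same relative sizes in both layers, a base layer I-frame interval of 5, an enhancement layer I-frame interval of 15, and an offset of 2 frames. These intervals, the offset, and the function names are illustrative assumptions, not values from the embodiments.

```python
COST = {"I": 8, "P": 3}  # relative frame sizes; B frames omitted for brevity


def frame_types(n_frames, i_interval, offset=0):
    """Frame type per time slot, with an I frame every `i_interval` slots starting at `offset`."""
    return ["I" if (t - offset) % i_interval == 0 else "P" for t in range(n_frames)]


def combined_cost(base_types, enh_types):
    """Relative bits needed in each time slot when both layers are sent together."""
    return [COST[b] + COST[e] for b, e in zip(base_types, enh_types)]


base = frame_types(30, i_interval=5)
aligned = combined_cost(base, frame_types(30, i_interval=15))
staggered = combined_cost(base, frame_types(30, i_interval=15, offset=2))

peak_aligned = max(aligned)      # slots where both layers carry an I frame
peak_staggered = max(staggered)  # no slot carries two I frames
```

Under these assumptions the total number of bits is the same in both cases, but the staggered arrangement lowers the peak per time slot, which is the sense in which the bit rate becomes more regular.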
In operation S920, a base layer encoding unit 510 may reduce and encode an original image according to the I-frame intervals set by the encoding control unit 540, thereby generating a base layer bitstream. The base layer encoding unit 510 may arbitrarily set the reduction ratio for the original image. For convenience of calculation or for simplification of structure, the base layer encoding unit 510 may set the reduction ratio for the original image to 2:1, 4:1 or 8:1.
In operation S930, an enhancement layer encoding unit 520 may generate an enhancement layer bitstream by referring to a predetermined enlarged image obtained by decoding the base layer bitstream, and to an enhancement layer image which is at a temporal position different from that of the enhancement layer image currently being coded. Here, the temporally different enhancement layer image is one obtained by encoding and then decoding an image that is temporally different from the enhancement layer image currently being encoded. In general, instead of using an open-loop scheme, a closed-loop scheme may be used. That is, a decoded frame may be used as a reference frame. Referring to a temporally different image means that motion compensated temporal prediction is performed. Referring to an enlarged image obtained after decoding the bitstream of a base layer (BL) implies that intra BL prediction is performed.
In operation S940, a transmission unit 530 may multiplex the base layer bitstream and the enhancement layer bitstream according to the I-frame intervals set by the encoding control unit 540, or allocate different priority levels to the base layer bitstream and the enhancement layer bitstream and then transmit the two bitstreams, according to their priority levels, to a video decoding apparatus according to an exemplary embodiment of the present invention.
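One way to picture priority-based transmission is the sketch below. It assumes that each packet is tagged with a frame index, that the base layer has the higher priority, and that a lower number means "send first". The priority values, packet labels, and function name are hypothetical, and the actual multiplexing format is not specified here.

```python
import heapq

BASE_PRIORITY = 0         # assumed: lower number is sent first
ENHANCEMENT_PRIORITY = 1


def transmission_order(base_packets, enhancement_packets):
    """Order packets by frame index, sending the base layer first within each frame."""
    queue = []
    for frame_index, payload in base_packets:
        heapq.heappush(queue, (frame_index, BASE_PRIORITY, payload))
    for frame_index, payload in enhancement_packets:
        heapq.heappush(queue, (frame_index, ENHANCEMENT_PRIORITY, payload))
    order = []
    while queue:
        _frame_index, _priority, payload = heapq.heappop(queue)
        order.append(payload)
    return order


order = transmission_order(
    [(0, "BL0"), (1, "BL1")],  # (frame index, base layer packet)
    [(0, "EL0"), (1, "EL1")],  # (frame index, enhancement layer packet)
)
```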
Referring to
In operation S1020, the enhancement layer decoding unit 630, which has received the enlarged result from the first base layer decoding unit 610, may decode a current enhancement layer image by referring to the enlarged result and an enhancement layer image which is temporally different from the base layer bitstream.
In operation S1030, a decoding control unit 640 may control the first base layer decoding unit 610 to enlarge the decoded base layer image, display the enlarged result, and abandon an enhancement layer bitstream until an I frame of the decoded enhancement layer image is reproduced. In addition, in operation S1030, the decoding control unit 640 may control a frame display unit 650 to display the decoded enhancement layer image as soon as the reproduction of the I frame of the decoded enhancement layer image begins. Moreover, if data loss occurs in the enhancement layer bitstream, the decoding control unit 640 may control the data loss to be concealed using information from an enhancement layer frame which is not temporally aligned with the enhancement layer bitstream or information regarding the enlarged result obtained by the first base layer decoding unit 610. In this case, since a base layer bitstream is given a higher priority level than an enhancement layer bitstream and is thus transmitted prior to the transmission of the enhancement layer bitstream, data loss is less likely to occur in the base layer bitstream than in the enhancement layer bitstream. Therefore, simple image data with large movement is encoded as a base layer bitstream, and complicated image data with small movement is encoded as an enhancement layer bitstream.
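The display decision made by the decoding control unit after a random access can be sketched as a small state machine. This is a simplified illustration that assumes the decoder sees only the frame types of the enhancement layer as they arrive after a channel switch; the function name and the source labels are hypothetical.

```python
def display_sources(enhancement_frame_types):
    """Choose, per frame, whether to show the enlarged base layer or the enhancement layer."""
    sources = []
    enhancement_ready = False
    for frame_type in enhancement_frame_types:
        if frame_type == "I":
            # The first enhancement I frame after the switch can be decoded on its own.
            enhancement_ready = True
        sources.append("enhancement" if enhancement_ready else "enlarged_base")
    return sources


# Enhancement layer frames arriving after a channel switch (made-up sequence).
shown = display_sources(["P", "B", "P", "I", "P"])
```

Until the first enhancement layer I frame arrives, the viewer sees the enlarged base layer image rather than a blank screen, which is how the perceived random access delay is shortened.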
In operation S1040, while the first base layer decoding unit 610 decodes the base layer bitstream, a second base layer decoding unit 620 may decode a base layer image of a channel other than the channel of the base layer bitstream decoded by the first base layer decoding unit 610 in order to realize Picture in Picture (PIP) in which an image is inserted into an image currently being displayed. Thereafter, the second base layer decoding unit 620 may transmit the decoded base layer image to the frame display unit 650. In PIP, there is no restriction regarding the number of images that can be simultaneously displayed, a main image displayed on an entire frame is obtained by decoding both a corresponding base layer bitstream and a corresponding enhancement layer bitstream, and a minor image displayed within the main image is obtained by decoding only a corresponding base layer bitstream.
Referring to
According to exemplary embodiments of the present invention, it is possible to minimize an increase in bit rate in random access and hence minimize an increase in random access delay time by setting the I-frame interval of a base layer shorter than the I-frame interval of an enhancement layer.
Accordingly, it is possible to prevent the bit rate from becoming excessively high for I frames and thus achieve a uniform bit rate by setting the I-frame intervals of a base layer and an enhancement layer so that an I frame of the enhancement layer and a corresponding I frame of the base layer are temporally different. In addition, it is possible to conveniently realize Picture in Picture (PIP) by reducing the complexity of a PIP frame by ¼ or more.
Moreover, when the bit rate considerably varies as in a wireless network or the Internet, only a base layer bitstream can be transmitted in consideration of the circumstances in a network.
Number | Date | Country | Kind |
---|---|---|---|
10-2005-0031114 | Apr 2005 | KR | national |