1. Field of the Invention
The present invention relates to scalable encoding and decoding of a video signal.
2. Description of the Related Art
It is difficult to allocate the high bandwidth required for TV signals to digital video signals that are wirelessly transmitted and received by mobile phones and notebook computers. Similar difficulties are expected with mobile TVs and handheld PCs, which will come into widespread use in the future. Thus, video compression standards for use with mobile devices should have high video signal compression efficiency.
Such mobile devices have a variety of processing and presentation capabilities, so a variety of compressed video data forms should be prepared. This means that video data of a variety of different qualities, with different combinations of variables such as the number of frames transmitted per second, the resolution, and the number of bits per pixel, should be provided from a single video source. This imposes a great burden on content providers.
Because of the above, content providers prepare high-bitrate compressed video data for each source video and perform, when receiving a request from a mobile device, a process of decoding compressed video and encoding it back into video data suited to the video processing capabilities of the mobile device. However, this method entails a transcoding procedure including decoding, scaling, and encoding processes, which causes some time delay in providing the requested data to the mobile device. The transcoding procedure also requires complex hardware and algorithms to cope with the wide variety of target encoding formats.
The Scalable Video Codec (SVC) has been developed in an attempt to overcome these problems. This scheme encodes video into a sequence of pictures with the highest image quality while ensuring that part of the encoded picture (frame) sequence (specifically, a partial sequence of frames intermittently selected from the total sequence of frames) can be decoded to produce a certain level of image quality.
Motion Compensated Temporal Filtering (MCTF) is an encoding scheme that has been suggested for use in the Scalable Video Codec. The MCTF scheme has a high compression efficiency (i.e., a high coding efficiency) for reducing the number of bits transmitted per second. The MCTF scheme is likely to be applied to transmission environments such as a mobile communication environment where bandwidth is limited.
Although it is ensured that part of a sequence of pictures encoded in the scalable MCTF coding scheme can be received and processed into video with a certain level of image quality as described above, there is still a problem in that the image quality is significantly reduced if the bitrate is lowered. One solution to this problem is to provide an auxiliary picture sequence for low bitrates, for example, a sequence of pictures that have a small screen size and/or a low frame rate. One example is to encode and transmit to decoders not only a main picture sequence of 4CIF (Common Intermediate Format) but also an auxiliary picture sequence of CIF and an auxiliary picture sequence of QCIF (Quarter CIF). Each sequence is referred to as a layer; of two given layers, the higher is referred to as an enhanced layer and the lower is referred to as a base layer.
More often, the auxiliary picture sequence is referred to as a base layer (BL), and the main picture sequence is referred to as an enhanced or enhancement layer. Video signals of the base and enhanced layers have redundancy since the same video content is encoded into two layers with different spatial resolution or different frame rates. To increase the coding efficiency of the enhanced layer, a video signal of the enhanced layer may be predicted using motion information and/or texture information of the base layer. This prediction method is referred to as inter-layer prediction.
The intra BL prediction method uses a texture (or image data) of the base layer. Specifically, the intra BL prediction method produces predictive data of a macroblock of the enhanced layer using a corresponding block of the base layer encoded in an intra mode. The term “corresponding block” refers to a block which is located in a base layer frame temporally coincident with a frame including the macroblock and which would have an area covering the macroblock if the base layer frame were enlarged by the ratio of the screen size of the enhanced layer to the screen size of the base layer. The intra BL prediction method uses the corresponding block of the base layer after enlarging the corresponding block by the ratio of the screen size of the enhanced layer to the screen size of the base layer through upsampling.
The inter-layer residual prediction method is similar to the intra BL prediction method, except that it uses a corresponding block of the base layer that has been encoded so as to contain residual data (data of an image difference) rather than image data. Using such a corresponding block, the method produces predictive data for a macroblock of the enhanced layer that is itself encoded so as to contain residual data. As in the intra BL prediction method, the corresponding block of the base layer is enlarged by the ratio of the screen size of the enhanced layer to the screen size of the base layer through upsampling before it is used.
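By way of illustration only, the following minimal Python sketch shows the corresponding-block mapping and upsampling described above; the function names, the nearest-neighbor interpolation, and the 2:1 screen-size ratio in the toy usage are assumptions made for brevity, not details taken from the codec.

```python
import numpy as np

def corresponding_block(mb_x, mb_y, mb_size, ratio):
    # Map an enhanced-layer macroblock origin to the base-layer region
    # that would cover it once the base layer is enlarged by `ratio`
    # (enhanced screen size / base screen size).
    bx, by = int(mb_x / ratio), int(mb_y / ratio)
    bsize = int(np.ceil(mb_size / ratio))
    return bx, by, bsize

def upsample_nearest(block, ratio):
    # Enlarge a base-layer block by the screen-size ratio; a real codec
    # uses longer interpolation filters, nearest-neighbor keeps it short.
    h, w = block.shape
    ys = (np.arange(int(h * ratio)) / ratio).astype(int)
    xs = (np.arange(int(w * ratio)) / ratio).astype(int)
    return block[np.ix_(ys, xs)]

# Toy usage: a 16x16 enhanced-layer macroblock at (32, 48), 2:1 ratio.
bx, by, bsize = corresponding_block(32, 48, 16, 2.0)      # -> (16, 24, 8)
corr = np.arange(bsize * bsize, dtype=np.int16).reshape(bsize, bsize)
pred = upsample_nearest(corr, 2.0)                        # 16x16 prediction
```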
A base layer with lower resolution for use in the inter-layer prediction method is produced by downsampling a video source. Corresponding pictures (frames or blocks) in enhanced and base layers produced from the same video source may be out of phase since a variety of different downsampling techniques and downsampling ratios (i.e., horizontal and/or vertical size reduction ratios) may be employed.
A video signal is managed as separate components, namely, a luma component and two chroma components. The luma component is associated with luminance information Y and the two chroma components are associated with chrominance information Cb and Cr. A ratio of 4:2:0 (Y:Cb:Cr) between luma and chroma signals is widely used. Samples of the chroma signal are typically located midway between samples of the luma signal. When an enhanced layer and/or a base layer are produced directly from a video source, luma and chroma signals of the enhanced layer and/or the base layer are sampled so as to satisfy the 4:2:0 ratio and a position condition according to the 4:2:0 ratio.
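The 4:2:0 geometry can be sketched in a few lines; the exact chroma siting assumed here (chroma on even luma columns, vertically midway between luma rows) is one common convention consistent with the "midway" placement mentioned above, not a definition taken from the text.

```python
# One Cb and one Cr sample per 2x2 block of luma samples (4:2:0); the
# chroma sample is placed on an even luma column, vertically midway
# between two luma rows.
def sample_positions(width, height):
    luma = [(float(x), float(y)) for y in range(height) for x in range(width)]
    chroma = [(float(x), y + 0.5)              # midway between luma rows
              for y in range(0, height, 2)
              for x in range(0, width, 2)]
    return luma, chroma

luma, chroma = sample_positions(4, 4)
print(len(luma), len(chroma))   # 16 luma samples, 4 chroma samples -> 4:2:0
```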
Two cases may be distinguished according to how the base layer is produced: (i) the enhanced and base layers are each produced directly from the video source, and (ii) the base layer is produced by downsampling the enhanced layer. In case (i), the enhanced and base layers may be out of phase, as shown in section (a), since different sampling techniques may be applied to each layer.
In case (ii), the base layer is produced by downsampling luma and chroma signals of the enhanced layer by a specific ratio. If the base layer is produced such that luma and chroma signals of the base layer are in phase with luma and chroma signals of the enhanced layer, the luma and chroma signals of the base layer do not satisfy the position condition according to the 4:2:0 ratio, as illustrated in section (b).
In addition, if the base layer is produced such that luma and chroma signals of the base layer satisfy the position condition according to the 4:2:0 ratio, the chroma signal of the base layer is out of phase with the chroma signal of the enhanced layer, as illustrated in section (c).
Also in case (ii), the enhanced and base layers may be out of phase as illustrated in section (a).
That is, the phase of the base layer may be changed in the downsampling procedure for producing the base layer and in the upsampling procedure of the inter-layer prediction method, so that the base layer is out of phase with the enhanced layer, thereby reducing coding efficiency.
Also, video frames in sequences of different layers may have different aspect ratios. For example, video frames of the higher sequence (i.e., the enhanced layer) may have a wide aspect ratio of 16:9, whereas video frames of the lower sequence (i.e., the base layer) may have a narrow aspect ratio of 4:3. In this case, when performing prediction of an enhanced layer picture, there may be a need to determine which part of a base layer picture is to be used for the enhanced layer picture, or for which part of the enhanced layer picture the base layer picture is to be used.
The present invention relates to a method for decoding a video signal. In one embodiment, the method includes predicting at least a portion of a current image in a current layer based on at least a residual coded portion of a base image in a base layer, a reference image, shift information for samples in the predicted current image, and offset information indicating a position offset between at least one boundary pixel of the reference image and at least one boundary pixel of the current image. The residual coded portion represents difference pixel data.
In one embodiment, the reference image is at least an up-sampled residual coded portion of the base image. For example, in one embodiment, the method may include upsampling at least the residual coded portion of the base image based on the shift information to obtain the reference image. The shift information may be phase shift information.
In one embodiment, the offset information may include left offset information indicating a position offset between at least one left side pixel of the reference image and at least one left side pixel of the current image, top offset information indicating a position offset between at least one top side pixel of the reference image and at least one top side pixel of the current image, right offset information indicating a right position offset between at least one right side pixel of the reference image and at least one right side pixel of the current image, and/or bottom offset information indicating a bottom position offset between at least one bottom side pixel of the reference image and at least one bottom side pixel of the current image.
In a further embodiment, the predicting step predicts the portion of the current image based on at least part of the reference image, the offset information and dimension information. The dimension information indicates at least one dimension of the current image. For example, the dimension information may include width information indicating a width of the current image, and/or height information indicating a height of the current image.
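As a minimal sketch of how the offset and dimension information could be combined, assuming the sign convention that positive offsets move each boundary inward:

```python
def prediction_window(width, height, left=0, top=0, right=0, bottom=0):
    # Rectangle of the current image that the reference image maps onto,
    # derived from the boundary-pixel offsets and the current image's
    # width/height dimension information.
    x0, y0 = left, top
    x1, y1 = width - right, height - bottom
    if x0 >= x1 or y0 >= y1:
        raise ValueError("offsets leave no overlapping area")
    return x0, y0, x1, y1

# e.g. a CIF-sized (352x288) current image whose reference is offset by
# 16 pixels on the left and 8 pixels at the bottom:
print(prediction_window(352, 288, left=16, bottom=8))   # (16, 0, 352, 280)
```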
The present invention further relates to methods of encoding a video signal, an apparatus for decoding a video signal, and/or an apparatus for encoding a video signal.
The above and other objects, features and other advantages of the present invention will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings, in which:
FIG. 2a is a block diagram of an embodiment of a video signal encoding apparatus to which a scalable video signal coding method according to the present invention is applied;
FIG. 2b is a block diagram of another embodiment of a video signal encoding apparatus to which a scalable video signal coding method according to the present invention is applied;
FIGS. 3a and 3b illustrate the relationship between enhanced layer frames and base layer frames which can be used as reference frames for converting an enhanced layer frame to an H frame having a predictive image;
FIGS. 5a and 5b illustrate embodiments of the structure of information regarding a positional relationship of a base layer picture to an enhanced layer picture, which is transmitted to the decoder, according to the present invention;
Example embodiments of the present invention will now be described in detail with reference to the accompanying drawings.
FIG. 2a is a block diagram of a video signal encoding apparatus to which a scalable video signal coding method according to the present invention is applied.
The video signal encoding apparatus shown in FIG. 2a includes an enhanced layer (EL) encoder 100, a downsampling unit 140, and a base layer encoder 150, among other elements described below.
The base layer encoder 150 can provide a low-bitrate data stream not only by encoding an input video signal into a sequence of pictures having a smaller screen size than pictures of the enhanced layer, but also by encoding an input video signal into a sequence of pictures having the same screen size as pictures of the enhanced layer at a lower frame rate than the enhanced layer. In the embodiments of the present invention described below, the base layer is encoded into a small-screen picture sequence, and the small-screen picture sequence is referred to as a base layer sequence and the frame sequence output from the EL encoder 100 is referred to as an enhanced layer sequence.
FIG. 2b illustrates a block diagram of another embodiment of a video signal encoding apparatus to which a scalable video signal coding method according to the present invention is applied; this embodiment is largely the same as the embodiment of FIG. 2a.
In each embodiment, the EL encoder 100 performs motion estimation and prediction operations on each target macroblock in a video frame. The EL encoder 100 also performs an update operation for each target macroblock by adding an image difference of the target macroblock from a corresponding macroblock in a neighbor frame to the corresponding macroblock in the neighbor frame.
The elements of the EL encoder 100 include an estimator/predictor 102 and an updater 103, whose operations are described below.
The estimator/predictor 102 divides each of the input video frames (or L frames obtained at the previous level) into macroblocks of a desired size. For each divided macroblock, the estimator/predictor 102 searches for a block whose image is most similar to that of the divided macroblock in previous/next neighbor frames of the enhanced layer and/or in base layer frames enlarged by the scaler 105a. That is, the estimator/predictor 102 searches for a macroblock temporally correlated with each divided macroblock. A block having the most similar image to a target image block is the block having the smallest image difference from it. The image difference of two image blocks is defined, for example, as the sum or average of the pixel-to-pixel differences of the two blocks. Of the blocks having a threshold image difference or less from a target macroblock in the current frame, the block having the smallest image difference from the target macroblock is referred to as a reference block, and a picture including the reference block is referred to as a reference picture. For each macroblock of the current frame, two reference blocks (or two reference pictures) may be present: in a frame (including a base layer frame) prior to the current frame, in a frame (including a base layer frame) subsequent to it, or one in a prior frame and one in a subsequent frame.
If the reference block is found, the estimator/predictor 102 calculates and outputs a motion vector from the current block to the reference block. The estimator/predictor 102 also calculates and outputs pixel error values (i.e., pixel difference values) of the current block from pixel values of the reference block, which is present in either the prior frame or the subsequent frame, or from average pixel values of the two reference blocks, which are present in the prior and subsequent frames. The image or pixel difference values are also referred to as residual data.
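A minimal Python sketch of this reference-block search follows, using the sum of absolute pixel-to-pixel differences as the image-difference measure; the block size, search radius, and threshold values are illustrative assumptions.

```python
import numpy as np

def sad(a, b):
    # Image difference of two blocks: the sum of pixel-to-pixel differences.
    return int(np.abs(a - b).sum())

def find_reference_block(cur, ref, x, y, size=16, radius=4, threshold=10_000):
    # Exhaustive search in a +/-radius window of one candidate reference
    # frame. Returns (motion vector, residual block), or None when no
    # candidate block has a threshold image difference or less.
    target = cur[y:y + size, x:x + size].astype(np.int32)
    h, w = ref.shape
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ry, rx = y + dy, x + dx
            if 0 <= ry <= h - size and 0 <= rx <= w - size:
                cand = ref[ry:ry + size, rx:rx + size].astype(np.int32)
                d = sad(target, cand)
                if d <= threshold and (best is None or d < best[0]):
                    best = (d, (dx, dy), target - cand)  # residual data
    return None if best is None else (best[1], best[2])

# Toy usage: searching the current frame against itself finds mv (0, 0).
rng = np.random.default_rng(0)
cur = rng.integers(0, 256, (64, 64), dtype=np.uint8)
mv, residual = find_reference_block(cur, cur, 16, 16, threshold=1)
```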
If no macroblock having a desired threshold image difference or less from the current macroblock is found in the two neighbor frames (including base layer frames) via the motion estimation operation, the estimator/predictor 102 determines whether or not a frame in the same time zone as the current frame (hereinafter also referred to as a “temporally coincident frame”) or a frame in a close time zone to the current frame (hereinafter also referred to as a “temporally close frame”) is present in the base layer sequence. If such a frame is present in the base layer sequence, the estimator/predictor 102 obtains the image difference (i.e., residual data) of the current macroblock from a corresponding macroblock in the temporally coincident or close frame based on pixel values of the two macroblocks, and does not obtain a motion vector of the current macroblock with respect to the corresponding macroblock. The close time zone to the current frame corresponds to a time interval including frames that can be regarded as having the same image as the current frame. Information of this time interval is carried within an encoded stream.
The above operation of the estimator/predictor 102 is referred to as a ‘P’ operation. When the estimator/predictor 102 performs the ‘P’ operation to produce an H frame by searching for a reference block of each macroblock in the current frame and coding each macroblock into residual data, the estimator/predictor 102 can selectively use, as reference pictures, enlarged pictures of the base layer received from the scaler 105a, in addition to neighbor L frames of the enhanced layer prior to and subsequent to the current frame, as shown in FIGS. 3a and 3b.
In an example embodiment of the present invention, five frames are used to produce each H frame.
When a picture of the base layer is selected as a reference picture for prediction of a picture of the enhanced layer in the reference picture selection method illustrated in FIGS. 3a and 3b, all or part of the base layer picture may be selected and used for the prediction, for example, when the two layers have different aspect ratios as described above.
The EL encoder 100 incorporates position information of the selected portion of the base layer picture into a header of the current picture coded into residual data. The EL encoder 100 also sets and inserts a flag “flag_base_layer_cropping”, which indicates that part of the base layer picture has been selected and used, in the picture header at an appropriate position so that the flag is delivered to the decoder. The position information is not transmitted when the flag “flag_base_layer_cropping” is reset.
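The signaling rule can be sketched as follows; only the flag name "flag_base_layer_cropping" appears in the text, while the offset field names and the dictionary stand-in for the picture header are illustrative assumptions.

```python
def write_cropping_info(header, offsets=None):
    # Position information is written only when part of the base layer
    # picture has been selected and used; when the flag is reset (0),
    # the position information is not transmitted.
    if offsets is not None:
        header["flag_base_layer_cropping"] = 1
        header.update(offsets)   # e.g. left/top/right/bottom offsets
    else:
        header["flag_base_layer_cropping"] = 0

hdr = {}
write_cropping_info(hdr, {"left_offset": 16, "top_offset": 0,
                          "right_offset": 16, "bottom_offset": 0})
print(hdr["flag_base_layer_cropping"])   # 1
```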
FIGS. 5a and 5b illustrate embodiments of the structure of information regarding a selected portion 512 of a base layer picture. In these embodiments, the position information includes left, top, right and/or bottom offset fields, which indicate position offsets between boundary pixels of the base layer picture (or of the selected portion 512 of it) and the corresponding boundary pixels of the enhanced layer picture. The information of FIGS. 5a and 5b thus specifies the positional relationship of the base layer picture to the enhanced layer picture and, as described above, is transmitted to the decoder when the flag “flag_base_layer_cropping” is set.
Information on the size and aspect ratio of the base layer picture, mode information of an actual image of the base layer picture, and the like can be determined by decoding the encoded base layer stream; for example, this information may be recorded in the sequence header of the encoded base layer stream. Accordingly, the position of an area overlapping with the enhanced layer picture, corresponding to the base layer picture or to the selected area in the base layer picture described above, is determined based on the position or offset information, and all or part of the base layer picture is used in accordance with this determination.
The data stream encoded in the method described above is transmitted by wire or wirelessly to a decoding apparatus or is delivered via recording media. The decoding apparatus reconstructs the original video signal in the enhanced and/or base layer according to the method described below.
The EL decoder 230 includes, as an internal element, an inverse filter that reconstructs an input stream into a video frame sequence; the inverse filter includes an inverse updater 231, an inverse predictor 232, a motion vector decoder 235, and an arranger 234.
The L frames output from the arranger 234 constitute an L frame sequence 601 of level N−1. A next-stage inverse updater and predictor of level N−1 reconstructs the L frame sequence 601 and an input H frame sequence 602 of level N−1 into an L frame sequence. This decoding process is performed the same number of times as the number of MCTF levels employed in the encoding procedure, thereby reconstructing the original video frame sequence. With reference to ‘reference selection code’ information carried in a header of each macroblock of an input H frame, the inverse predictor 232 specifies an L frame of the enhanced layer and/or an enlarged frame of the base layer which has been used as a reference frame to code the macroblock into residual data. The inverse predictor 232 determines a reference block in the specified frame based on a motion vector provided from the motion vector decoder 235, and then adds pixel values of the reference block (or average pixel values of two macroblocks used as reference blocks of the macroblock) to the pixel difference values of the macroblock of the H frame, thereby reconstructing the original image of the macroblock of the H frame.
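A minimal sketch of this per-macroblock inverse prediction step, assuming 8-bit samples and integer averaging of two reference blocks (details the text does not specify):

```python
import numpy as np

def reconstruct_macroblock(residual, ref_a, ref_b=None):
    # Inverse prediction: add the reference block's pixel values (or the
    # average pixel values of two reference blocks) to the H-frame
    # macroblock's pixel difference values; clipping to the 8-bit range
    # is assumed here.
    if ref_b is None:
        pred = ref_a.astype(np.int32)
    else:
        pred = (ref_a.astype(np.int32) + ref_b.astype(np.int32)) // 2
    return np.clip(pred + residual, 0, 255).astype(np.uint8)
```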
When a base layer picture has been used as a reference frame of a current H frame, the scaler 230a selects and enlarges an area in the base layer picture (for example, the selected portion 512 described above) for use in reconstructing the macroblocks of the H frame. In the case where the information of FIGS. 5a and 5b has been received, the area of the base layer picture to be selected and enlarged is determined from the offset fields (and, where present, the dimension fields) of that information.
For one H frame, the MCTF decoding is performed in specified units, for example, in units of slices in a parallel fashion, so that the macroblocks in the frame have their original images reconstructed and the reconstructed macroblocks are then combined to constitute a complete video frame.
The above decoding method reconstructs an MCTF-encoded data stream to a complete video frame sequence. The decoding apparatus decodes and outputs a base layer sequence or decodes and outputs an enhanced layer sequence using the base layer depending on its processing and presentation capabilities.
The decoding apparatus described above may be incorporated into a mobile communication terminal, a media player, or the like.
Returning to the encoding apparatus, the downsampling unit 140 produces the base layer from the input video signal and transmits, to the EL encoder 100, phase shift information describing any phase difference between the two layers introduced in the downsampling procedure.
The phase shift can be defined as the phase difference between luma signals of the two layers. Typically, luma and chroma signals of the two layers are sampled so as to satisfy a position condition according to the ratio between the luma and chroma signals, and the luma signals of the two layers are sampled so as to be in phase with each other.
The phase shift can also be defined as the phase difference between chroma signals of the two layers. The phase difference between chroma signals of the two layers can be determined based on the difference between positions of the chroma signals of the two layers after the positions of the luma signals of the two layers are matched to each other so that the luma signals of the two layers are in phase with each other.
The phase shift can also be individually defined for each layer, for example, with reference to a single virtual layer (e.g., an upsampled base layer) based on the input video signal for generating the enhanced or base layer. Here, the phase difference is between luma and/or chroma samples (i.e., pixels) of the enhanced layer or the base layer and those of the virtual layer (e.g., an upsampled base layer).
The EL encoder 100 records the phase shift information transmitted from the downsampling unit 140 in a header area of a sequence layer or a slice layer. If the phase shift information has a value other than 0, the EL encoder 100 sets a global shift flag “global_shift_flag”, which indicates whether or not there is a phase shift between the two layers, to, for example, “1”, and records the value of the phase shift in the information fields “global_shift_x” and “global_shift_y”. The “global_shift_x” value represents the horizontal phase shift and the “global_shift_y” value represents the vertical phase shift. Stated another way, the “global_shift_x” value represents the horizontal position offset between the samples (i.e., pixels) of the two layers, and the “global_shift_y” value represents the vertical position offset between the samples.
On the other hand, if the phase shift information has a value of 0, the EL encoder 100 sets the flag “global_shift_flag” to, for example, “0”, and does not record the values of the phase shift in the information fields “global_shift_x” and “global_shift_y”.
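The recording rule for these fields can be sketched as follows, with a Python dictionary standing in for the actual header syntax:

```python
def write_global_shift(header, shift_x, shift_y):
    # Record the shift fields only when the phase shift is non-zero;
    # otherwise only the flag (set to 0) is recorded.
    if shift_x != 0 or shift_y != 0:
        header["global_shift_flag"] = 1
        header["global_shift_x"] = shift_x   # horizontal position offset
        header["global_shift_y"] = shift_y   # vertical position offset
    else:
        header["global_shift_flag"] = 0

def read_global_shift(header):
    if header.get("global_shift_flag", 0) == 1:
        return header["global_shift_x"], header["global_shift_y"]
    return 0, 0   # absent fields imply no phase shift
```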
The EL encoder 100 also records the sampling-related information in the header area of the sequence layer or the slice layer if needed.
It will be recalled from the discussion of the intra BL prediction method above that the estimator/predictor 102 may search for a corresponding block of the base layer that has been encoded in an intra mode in order to predict a macroblock of the enhanced layer.
The estimator/predictor 102 reconstructs an original image of the found corresponding block by decoding the intra-coded pixel values of the corresponding block, and then upsamples the found corresponding block to enlarge it by the ratio of the screen size of the enhanced layer to the screen size of the base layer. The estimator/predictor 102 performs this upsampling taking into account the phase shift information “global_shift_x/y” transmitted from the downsampling unit 140 so that the enlarged corresponding block of the base layer is in phase with the macroblock of the enhanced layer.
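A minimal sketch of phase-compensated upsampling follows; nearest-neighbor resampling and the interpretation of the shift values as base-layer sample units are simplifying assumptions, whereas an actual codec would apply its defined interpolation filters.

```python
import numpy as np

def upsample_with_phase(block, ratio, shift_x=0.0, shift_y=0.0):
    # Enlarge a base-layer block while offsetting the sampling grid by a
    # phase shift so that the enlarged block is in phase with the
    # enhanced-layer macroblock.
    h, w = block.shape
    H, W = int(h * ratio), int(w * ratio)
    ys = np.clip((np.arange(H) / ratio + shift_y).astype(int), 0, h - 1)
    xs = np.clip((np.arange(W) / ratio + shift_x).astype(int), 0, w - 1)
    return block[np.ix_(ys, xs)]
```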
The estimator/predictor 102 encodes the macroblock with reference to a corresponding area in the corresponding block of the base layer, which has been enlarged so as to be in phase with the macroblock. Here, the term “corresponding area” refers to a partial area in the corresponding block which is at the same relative position in the frame as the macroblock.
If needed, the estimator/predictor 102 searches for a reference area more highly correlated with the macroblock in the enlarged corresponding block of the base layer by performing motion estimation on the macroblock while changing the phase of the corresponding block, and encodes the macroblock using the found reference area.
If the phase of the enlarged corresponding block is further changed while the reference area is searched for, the estimator/predictor 102 sets a local shift flag “local_shift_flag”, which indicates whether or not there is a phase shift, different from the global phase shift “global_shift_x/y”, between the macroblock and the corresponding upsampled block, to, for example, “1”. Also, the estimator/predictor 102 records the local shift flag in a header area of the macroblock and records the local phase shift between the macroblock and the corresponding block in information fields “local_shift_x” and “local_shift_y”. The local phase shift information may be replacement information, and provide the entire phase shift information as a replacement or substitute for the global phase shift information. Alternatively, the local phase shift information may be additive information, wherein the local phase shift information added to the corresponding global phase shift information provides the entire or total phase shift information.
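The replacement and additive interpretations of the local phase shift can be captured in a few lines; the function and parameter names are illustrative:

```python
def total_phase_shift(global_shift, local_shift=None, additive=False):
    # The local shift either replaces the global shift or is added to it
    # to give the entire (total) phase shift, per the two interpretations
    # described above.
    if local_shift is None:
        return global_shift                      # no local_shift_flag set
    gx, gy = global_shift
    lx, ly = local_shift
    return (gx + lx, gy + ly) if additive else (lx, ly)

print(total_phase_shift((1, 0), (0, 2), additive=True))    # (1, 2)
print(total_phase_shift((1, 0), (0, 2), additive=False))   # (0, 2)
```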
The estimator/predictor 102 further inserts information indicating that the macroblock of the enhanced layer has been encoded in an intra BL mode in the header area of the macroblock so as to inform the decoder of the same.
The estimator/predictor 102 can also apply the inter-layer residual prediction method to a macroblock that is to be encoded so as to contain residual data (data of an image difference) obtained using a reference block found in other frames prior to and subsequent to the macroblock. In this case as well, the estimator/predictor 102 upsamples a corresponding block of the base layer that has been encoded so as to contain residual data, taking into account the phase shift information "global_shift_x/y" transmitted from the downsampling unit 140, so that the base layer is in phase with the enhanced layer.
The estimator/predictor 102 inserts information indicating that the macroblock of the enhanced layer has been encoded according to the inter-layer residual prediction method in the header area of the macroblock so as to inform the decoder of the same.
The estimator/predictor 102 performs the above procedure for all macroblocks in the frame to complete an H frame which is a predictive image of the frame. The estimator/predictor 102 performs the above procedure for all input video frames or all odd ones of the L frames obtained at the previous level to complete H frames which are predictive images of the input frames.
As described above, the updater 103 adds an image difference of each macroblock in an H frame produced by the estimator/predictor 102 to an L frame having its reference block, which is an input video frame or an even one of the L frames obtained at the previous level.
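A sketch of this update step follows; the weighting of the residual (for example by 1/2, as in typical MCTF lifting) is an assumption, since the text does not specify a weight.

```python
import numpy as np

def update_l_frame_block(l_block, residual, weight=0.5):
    # Update step: add the H-frame macroblock's image difference back to
    # the block in the L frame that served as its reference. The weight
    # is an assumed lifting factor, not a value taken from the text.
    updated = l_block.astype(np.float64) + weight * residual
    return np.clip(np.rint(updated), 0, 255).astype(np.uint8)
```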
The data stream encoded in the method described above is transmitted by wire or wirelessly to a decoding apparatus or is delivered via recording media. The decoding apparatus reconstructs the original video signal according to the method described below.
In order to decode a macroblock of the enhanced layer encoded according to the inter-layer prediction method, a block of the base layer corresponding to the macroblock is enlarged by the ratio of the screen size of the enhanced layer to the screen size of the base layer through upsampling. This upsampling is performed taking into account phase shift information “global_shift_x/y” in the enhanced layer and/or the base layer, so as to compensate for a global phase shift between the macroblock of the enhanced layer and the enlarged corresponding block of the base layer.
If there is a local phase shift “local_shift_x/y”, different from the global phase shift “global_shift_x/y”, between the macroblock of the enhanced layer and the corresponding block of the base layer, the corresponding block is upsampled taking into account the local phase shift “local_shift_x/y”. For example, the local phase shift information may be used instead of the global phase shift information in one embodiment, or alternatively, in addition to the global phase shift information in another embodiment.
Then, an original image of the macroblock of the enhanced layer is reconstructed using the corresponding block which has been enlarged so as to be in phase with the macroblock.
Returning to the decoding apparatus, the inverse predictor 232 performs the corresponding operations. More specifically, if information indicating that a macroblock in an H frame has been encoded in an intra BL mode is included in a header area of the macroblock, the inverse predictor 232 upsamples a corresponding intra-coded block of the base layer, taking into account the global phase shift "global_shift_x/y", to enlarge the corresponding block so as to be in phase with the macroblock of the enhanced layer, and reconstructs an original image of the macroblock using the enlarged corresponding block.
If a local shift flag “local_shift_flag” indicates that there is a local phase shift “local_shift_x/y” different from the global phase shift “global_shift_x/y” between the macroblock and the corresponding block, the inverse predictor 232 upsamples the corresponding block taking into account the local phase shift “local_shift_x/y” (as substitute or additional phase shift information). The local phase shift information may be included in the header area of the macroblock.
If information indicating that a macroblock in an H frame has been encoded in an inter-layer residual mode is included in a header area of the macroblock, the inverse predictor 232 upsamples a corresponding block of the base layer encoded so as to contain residual data, taking into account the global phase shift “global_shift_x/y” as discussed above to enlarge the corresponding block so as to be in phase with the macroblock of the enhanced layer. The inverse predictor 232 then reconstructs residual data of the macroblock using the corresponding block enlarged so as to be in phase with the macroblock.
The inverse predictor 232 searches for a reference block of the reconstructed macroblock containing residual data in an L frame with reference to a motion vector provided from the motion vector decoder 233, and reconstructs an original image of the macroblock by adding pixel values of the reference block to difference values of pixels (i.e., residual data) of the macroblock.
All macroblocks in the current H frame are reconstructed to their original images in the same manner as the above operation, and the reconstructed macroblocks are combined to reconstruct the current H frame to an L frame. The arranger 234 alternately arranges L frames reconstructed by the inverse predictor 232 and L frames updated by the inverse updater 231, and outputs such arranged L frames to the next stage.
The above decoding method reconstructs an MCTF-encoded data stream to a complete video frame sequence. In the case where the prediction and update operations have been performed for a group of pictures (GOP) N times in the MCTF encoding procedure described above, a video frame sequence with the original image quality is obtained if the inverse update and prediction operations are performed N times in the MCTF decoding procedure. However, a video frame sequence with a lower image quality and at a lower bitrate may be obtained if the inverse update and prediction operations are performed less than N times. Accordingly, the decoding apparatus is designed to perform inverse update and prediction operations to the extent suitable for the performance thereof.
The decoding apparatus described above can be incorporated into a mobile communication terminal, a media player, or the like.
As is apparent from the above description, a method and apparatus for encoding/decoding a video signal according to the present invention uses pictures of a base layer provided for low-performance decoders, in addition to pictures of an enhanced layer, when encoding a video signal in a scalable fashion, so that the total amount of coded data is reduced, thereby increasing coding efficiency. In addition, part of a base layer picture, which can be used for a prediction operation of an enhanced layer picture, is specified so that the prediction operation can be performed normally without performance degradation even when a picture enlarged from the base layer picture cannot be directly used for the prediction operation of the enhanced layer picture.
As is apparent from the above description, a method for encoding and decoding a video signal according to the present invention increases coding efficiency by preventing a phase shift in a base layer and/or an enhanced layer caused in downsampling and upsampling procedures when encoding/decoding the video signal according to an inter-layer prediction method.
Furthermore, as will be apparent from the descriptions provided above, the encoding and decoding embodiments related to phase shift information may be used independently or in conjunction with the encoding and decoding embodiments related to offset information.
Although the example embodiments of the present invention have been disclosed for illustrative purposes, those skilled in the art will appreciate that various improvements, modifications, substitutions, and additions are possible, without departing from the scope and spirit of the invention.
This application is a divisional under 35 U.S.C. §120/121 of application Ser. No. 11/657,012 filed Jan. 24, 2007, which is a continuation-in-part application of application Ser. Nos. 11/392,634, 11/401,318, 11/401,317, 11/392,674 and 11/392,673, filed Mar. 30, 2006, Apr. 11, 2006, Apr. 11, 2006, Mar. 30, 2006 and Mar. 30, 2006, respectively, claims priority under 35 U.S.C. §119 on U.S. Provisional Application No. 60/667,115, filed on Apr. 1, 2005; U.S. Provisional Application No. 60/670,246, filed on Apr. 12, 2005; U.S. Provisional Application No. 60/670,241, filed on Apr. 12, 2005; and U.S. Provisional Application No. 60/670,676, filed Apr. 13, 2005; and also claims priority under 35 U.S.C. §119 on Korean Patent Application Nos. 10-2005-0084744, 10-2005-0066622, 10-2005-0084729 and 10-2005-0084742, filed on Sep. 12, 2005, Jul. 22, 2005, Sep. 12, 2005 and Sep. 12, 2005, respectively, the entire contents of each of which are hereby incorporated by reference.