The present principles relate generally to video encoding and decoding and, more particularly, to methods and apparatus for sampling-based super resolution video encoding and decoding.
A video compression approach using super resolution was proposed in a first prior art approach. In the first prior art approach, the spatial size of the input video is reduced to a certain predetermined low resolution (LR) size before encoding. After the low resolution video is received at the decoder side, the low resolution video is up-scaled to the original size using a super resolution method along with some side information (metadata) transmitted with the bitstream. The metadata includes a block-based segmentation of frames where each block is labeled as moving, non-moving flat, or non-moving textured. Non-moving flat blocks are up-scaled by spatial interpolation. For moving blocks, motion vectors are sent to the receiver where a super resolution technique is applied in order to recover sub-pixel information. For non-moving textured blocks, a jittered down-sampling strategy is used wherein four complementary down-sampling grids are applied in rotating order.
However, the aforementioned first prior art approach disadvantageously does not use a smart sampling strategy for moving regions. Rather, the first prior art approach relies on the presence of sub-pixel motion between the low resolution frames in order to obtain super resolution, and such sub-pixel motion is not always guaranteed.
In a second prior art approach, a camera is mechanically moved in sub-pixel shifts between frame captures. The goal is to capture low resolution video which is better suited for subsequent super resolution. For static backgrounds, the method of the second prior art approach is analogous to the jittered sampling idea in the aforementioned first prior art approach. However, a fixed jitter is not an effective strategy for the case of non-static backgrounds, which is likely in our targeted application, namely, down-sampling high resolution video for subsequent super resolution.
These and other drawbacks and disadvantages of the prior art are addressed by the present principles, which are directed to methods and apparatus for sampling-based super resolution video encoding and decoding.
According to an aspect of the present principles, there is provided an apparatus. The apparatus includes a down-sampler and metadata generator for receiving high resolution pictures and generating low resolution pictures and metadata therefrom. The metadata is for guiding post-decoding post-processing of the low resolution pictures and the metadata. The apparatus further includes at least one encoder for encoding the low resolution pictures and the metadata.
According to another aspect of the present principles, there is provided a method. The method includes receiving high resolution pictures and generating low resolution pictures and metadata therefrom. The metadata is for guiding post-decoding post-processing of the low resolution pictures and the metadata. The method further includes encoding the low resolution pictures and the metadata using at least one encoder.
According to yet another aspect of the present principles, there is provided an apparatus. The apparatus includes a decoder for receiving a bitstream and decoding low resolution pictures and metadata therefrom. The apparatus further includes a super resolution post-processor for reconstructing high resolution pictures respectively corresponding to the low resolution pictures using the low resolution pictures and the metadata.
According to still another aspect of the present principles, there is provided a method. The method includes receiving a bitstream and decoding low resolution pictures and metadata therefrom using a decoder. The method further includes reconstructing high resolution pictures respectively corresponding to the low resolution pictures using the low resolution pictures and the metadata.
These and other aspects, features and advantages of the present principles will become apparent from the following detailed description of exemplary embodiments, which is to be read in connection with the accompanying drawings.
The present principles may be better understood in accordance with the following exemplary figures, in which:
The present principles are directed to methods and apparatus for sampling-based super resolution video encoding and decoding.
The present description illustrates the present principles. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the present principles and are included within their spirit and scope.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the present principles and the concepts contributed by the inventor(s) to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions.
Moreover, all statements herein reciting principles, aspects, and embodiments of the present principles, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.
Thus, for example, it will be appreciated by those skilled in the art that the block diagrams presented herein represent conceptual views of illustrative circuitry embodying the present principles. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes which may be substantially represented in computer readable media and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
The functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term “processor” or “controller” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (“DSP”) hardware, read-only memory (“ROM”) for storing software, random access memory (“RAM”), and non-volatile storage.
Other hardware, conventional and/or custom, may also be included. Similarly, any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
In the claims hereof, any element expressed as a means for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function. The present principles as defined by such claims reside in the fact that the functionalities provided by the various recited means are combined and brought together in the manner which the claims call for. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.
Reference in the specification to “one embodiment” or “an embodiment” of the present principles, as well as other variations thereof, means that a particular feature, structure, characteristic, and so forth described in connection with the embodiment is included in at least one embodiment of the present principles. Thus, the appearances of the phrase “in one embodiment” or “in an embodiment”, as well as any other variations, appearing in various places throughout the specification are not necessarily all referring to the same embodiment.
It is to be appreciated that the use of any of the following “/”, “and/or”, and “at least one of”, for example, in the cases of “A/B”, “A and/or B” and “at least one of A and B”, is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of both options (A and B). As a further example, in the cases of “A, B, and/or C” and “at least one of A, B, and C”, such phrasing is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of the third listed option (C) only, or the selection of the first and the second listed options (A and B) only, or the selection of the first and third listed options (A and C) only, or the selection of the second and third listed options (B and C) only, or the selection of all three options (A and B and C). This may be extended, as readily apparent by one of ordinary skill in this and related arts, for as many items listed.
Also, as used herein, the words “picture” and “image” are used interchangeably and refer to a still image or a picture from a video sequence. As is known, a picture may be a frame or a field.
Additionally, as used herein, the words “surrounding co-located pixels”, when used, for example, with respect to creating the high resolution mosaic described herein by interpolating pixel values at pixel positions in the high resolution mosaic from pixel values of “surrounding co-located pixels” in the low resolution pictures, refer to pixels in the low resolution pictures that surround a particular pixel that is co-located (i.e., has the same position) as a target pixel currently being interpolated in the high resolution mosaic.
As noted above, the present principles are directed to methods and apparatus for sampling-based super resolution video encoding and decoding. It is to be appreciated that the present principles advantageously improve video compression efficiency. In particular, a smart down-sampling strategy which is capable of handling motion between frames is proposed. At the pre-processing stage, high-resolution (HR) frames are down-sampled to low-resolution (LR) and metadata is generated to guide post-processing. In the post-processing stage, the decoded low resolution frames and received metadata are used within a novel super resolution framework to reconstruct high resolution frames. Since only low resolution frames are encoded and the amount of metadata transmitted is low to moderate, this approach has the potential to provide increased compression ratios.
The smart down-sampling strategy takes into account motion between frames. The down-sampling strategy contributes to an improved super resolution result by creating LR frames such that they complement one another in the pixel information they carry (in other words, reducing pixel redundancy between frames). In some sense, the strategy attempts to enforce sub-pixel motion between frames.
We note that conventional video compression methods (mainly block-based prediction methods such as, for example, the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) Standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 Recommendation (hereinafter the “MPEG-4 AVC Standard”)) have started reaching saturation points in compression ratios. Data pruning methods aim at improving compression efficiency beyond that achieved by standard compression methods. The main principle of such methods is to remove data before (or during) encoding and to put the removed data back at the receiver after (or during) decoding. Data pruning methods have exploited a variety of pre- and post-processing techniques for achieving their goal, e.g., block/region removal and inpainting, line removal and interpolation, and so forth.
In accordance with the present principles, intelligent down-sampling (at the transmitter) and super resolution (at the receiver) are the techniques exploited for data pruning. Super resolution is the process of increasing the resolution of images or videos by temporally integrating information across several low resolution images or frames. The principle of this data pruning approach is illustrated in
While not limited to the specific configurations of the following described encoder and decoder, encoder 152 and decoder 153 can be respectively implemented as shown in
Turning to
A first output of an encoder controller 205 is connected in signal communication with a second input of the frame ordering buffer 210, a second input of the inverse transformer and inverse quantizer 250, an input of a picture-type decision module 215, a first input of a macroblock-type (MB-type) decision module 220, a second input of an intra prediction module 260, a second input of a deblocking filter 265, a first input of a motion compensator 270, a first input of a motion estimator 275, and a second input of a reference picture buffer 280.
A second output of the encoder controller 205 is connected in signal communication with a first input of a Supplemental Enhancement Information (SEI) inserter 230, a second input of the transformer and quantizer 225, a second input of the entropy coder 245, a second input of the output buffer 235, and an input of the Sequence Parameter Set (SPS) and Picture Parameter Set (PPS) inserter 240.
An output of the SEI inserter 230 is connected in signal communication with a second non-inverting input of the combiner 290.
A first output of the picture-type decision module 215 is connected in signal communication with a third input of the frame ordering buffer 210. A second output of the picture-type decision module 215 is connected in signal communication with a second input of a macroblock-type decision module 220.
An output of the Sequence Parameter Set (SPS) and Picture Parameter Set (PPS) inserter 240 is connected in signal communication with a third non-inverting input of the combiner 290.
An output of the inverse quantizer and inverse transformer 250 is connected in signal communication with a first non-inverting input of a combiner 219. An output of the combiner 219 is connected in signal communication with a first input of the intra prediction module 260 and a first input of the deblocking filter 265. An output of the deblocking filter 265 is connected in signal communication with a first input of a reference picture buffer 280. An output of the reference picture buffer 280 is connected in signal communication with a second input of the motion estimator 275 and a third input of the motion compensator 270. A first output of the motion estimator 275 is connected in signal communication with a second input of the motion compensator 270. A second output of the motion estimator 275 is connected in signal communication with a third input of the entropy coder 245.
An output of the motion compensator 270 is connected in signal communication with a first input of a switch 297. An output of the intra prediction module 260 is connected in signal communication with a second input of the switch 297. An output of the macroblock-type decision module 220 is connected in signal communication with a third input of the switch 297. The third input of the switch 297 determines whether or not the “data” input of the switch (as compared to the control input, i.e., the third input) is to be provided by the motion compensator 270 or the intra prediction module 260. The output of the switch 297 is connected in signal communication with a second non-inverting input of the combiner 219 and an inverting input of the combiner 285.
A first input of the frame ordering buffer 210 and an input of the encoder controller 205 are available as inputs of the encoder 200, for receiving an input picture. Moreover, a second input of the Supplemental Enhancement Information (SEI) inserter 230 is available as an input of the encoder 200, for receiving metadata. An output of the output buffer 235 is available as an output of the encoder 200, for outputting a bitstream.
Turning to
A second output of the entropy decoder 345 is connected in signal communication with a third input of the motion compensator 370, a first input of the deblocking filter 365, and a third input of the intra predictor 360. A third output of the entropy decoder 345 is connected in signal communication with an input of a decoder controller 305. A first output of the decoder controller 305 is connected in signal communication with a second input of the entropy decoder 345. A second output of the decoder controller 305 is connected in signal communication with a second input of the inverse transformer and inverse quantizer 350. A third output of the decoder controller 305 is connected in signal communication with a third input of the deblocking filter 365. A fourth output of the decoder controller 305 is connected in signal communication with a second input of the intra prediction module 360, a first input of the motion compensator 370, and a second input of the reference picture buffer 380.
An output of the motion compensator 370 is connected in signal communication with a first input of a switch 397. An output of the intra prediction module 360 is connected in signal communication with a second input of the switch 397. An output of the switch 397 is connected in signal communication with a first non-inverting input of the combiner 325.
An input of the input buffer 310 is available as an input of the decoder 300, for receiving an input bitstream. A first output of the deblocking filter 365 is available as an output of the decoder 300, for outputting an output picture.
Principle of Sampling-Based Super Resolution
The central idea of sampling-based SR is illustrated in
Turning to
Referring to
Referring to
In the following, we shall further describe the steps involved in the pre-processing and post-processing stages.
Pre-Processing Stage of Sampling-Based Super Resolution
In the pre-processing stage, the input high resolution video is first divided into sets of contiguous frames. Each set is then processed separately. Typically, we choose M² frames in each set, where M is the down-sampling factor, i.e., the ratio of high resolution to low resolution frame dimensions. The reasoning here is that a high resolution frame contains M² times as many pixels as a low resolution frame; therefore, it should take M² LR frames to construct a super resolution mosaic with the same size as a high resolution frame.
Let us now consider the case where the down-sampling factor is 2 (i.e., M=2) and then consider a set of four high resolution frames (Ht; t=1, 2, 3, 4).
Turning to
Further details relating to the steps involved in the pre-processing stage (e.g., as shown with respect to
1. Motion estimation: Let H1 be the reference frame. Estimate the motion from each frame Ht to the reference frame (
2. Sampling grid selection: For each frame Ht, the sampling grid St indicates the pixels that are taken from Ht in order to create the corresponding LR frame Lt. The grids St are chosen such that each frame provides complementary pixel information for the super resolution process in the post-processing stage (
3. Down-sampling: Using the selected grids St, each of the low resolution frames Lt is created. The low resolution frames are then compressed using an encoder and sent to the receiver. Information regarding the motion between the frames and the sampling grids used is also sent as metadata.
Each of the preceding steps will be further described hereinafter.
Motion Estimation
For illustrative purposes, we will now discuss one way of estimating the motion from each frame Ht in a given set to the reference frame of the set (
In order to estimate the motion from frame Hi to frame Hj, we first choose a parametric global motion model that describes the motion between frames. Using the data from Hi and Hj, the parameters θij of the model are then determined. Henceforth, we shall denote the transformation by Θij and its parameters by θij. The transformation Θij can then be used to align (or warp) Hi to Hj (or vice versa using the inverse model Θji=Θij⁻¹).
Global motion can be estimated using a variety of models and methods. One commonly used model is the projective transformation given as follows:

x′=(a1x+a2y+a3)/(c1x+c2y+1)

y′=(b1x+b2y+b3)/(c1x+c2y+1)
The above equations give the new position (x′, y′) in Hj to which the pixel at (x, y) in Hi has moved. Thus, the eight model parameters θij={a1, a2, a3, b1, b2, b3, c1, c2} describe the motion from Hi to Hj. The parameters are usually estimated by first determining a set of point correspondences between the two frames and then using a robust estimation framework such as RANdom SAmple Consensus (RANSAC) or its variants. Point correspondences between frames can be determined by a number of methods, e.g., extracting and matching Scale-Invariant Feature Transform (SIFT) features or using optical flow.
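By way of illustration, the following Python sketch shows one way such an estimation could be carried out. It is a minimal sketch, not the method prescribed herein: OpenCV's SIFT detector and RANSAC-based homography fit stand in for whichever feature detector and robust estimator an implementation adopts, and the function name estimate_global_motion is illustrative only.

    import cv2
    import numpy as np

    def estimate_global_motion(frame_i, frame_j):
        """Estimate the 8-parameter projective transformation from frame_i to frame_j."""
        sift = cv2.SIFT_create()
        kp_i, desc_i = sift.detectAndCompute(frame_i, None)  # expects 8-bit images
        kp_j, desc_j = sift.detectAndCompute(frame_j, None)

        # Match SIFT descriptors between the two frames.
        matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
        matches = matcher.match(desc_i, desc_j)

        pts_i = np.float32([kp_i[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
        pts_j = np.float32([kp_j[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

        # Robustly fit the projective model with RANSAC.
        theta_ij, inlier_mask = cv2.findHomography(pts_i, pts_j, cv2.RANSAC, 3.0)
        return theta_ij  # 3x3 matrix; bottom-right entry normalized to 1

The returned 3x3 matrix corresponds to the parameters θij above, with the bottom-right entry playing the role of the implicit 1 in the denominator.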
For the sampling-based super resolution procedure, the motion from each frame Ht to the reference frame (H1) has to be estimated. Hence, three sets of parameters are estimated: θ21; θ31; and θ41 (corresponding to transformations Θ21, Θ31 and Θ41, respectively). Each transformation is invertible, and the inverse model Θji=Θij⁻¹ describes the motion from Hj to Hi.
Sampling Grid Selection
For each high resolution frame Ht, a sampling grid St has to be selected in order to down-sample the frame and create the low resolution version Lt. A sampling grid indicates the pixels in the high resolution frame that are taken and packed into the corresponding low resolution frame. Turning to
Turning to
Let us constrain ourselves here to use only uniform sampling grids, i.e., those that have a uniform density of coverage across all portions of the high resolution frame. There are distinct advantages of using a uniform grid. First, it roughly preserves the spatial and temporal relationships present among pixels in the high resolution frames and this helps the encoder (e.g., encoder 115 in
The sampling grid selection process is posed as a problem of selecting, for each high resolution frame Ht, an appropriate sampling grid St from a candidate pool of grids G={gi; i=1, ..., NG}. In one embodiment, we choose from 12 candidate grids g1-g12 shown in
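For concreteness, a uniform grid can be represented as a boolean tile replicated across the frame; the pixel coordinates it selects follow directly from the tiled mask. The Python/NumPy sketch below is an assumption for illustration only: the four polyphase patterns shown stand in for the actual pool of 12 candidate grids, which is defined by the figure.

    import numpy as np

    # Each candidate grid is a 2x2 boolean tile (M = 2); True marks a sampled pixel.
    # Tiling the pattern over the frame yields a uniform grid of density 1/M^2.
    CANDIDATE_TILES = {
        "g1": np.array([[1, 0], [0, 0]], dtype=bool),  # top-left polyphase offset
        "g2": np.array([[0, 1], [0, 0]], dtype=bool),
        "g3": np.array([[0, 0], [1, 0]], dtype=bool),
        "g4": np.array([[0, 0], [0, 1]], dtype=bool),
    }

    def grid_positions(tile, height, width):
        """Return the (y, x) coordinates selected by tiling `tile` over a frame."""
        reps = (height // tile.shape[0] + 1, width // tile.shape[1] + 1)
        mask = np.tile(tile, reps)[:height, :width]
        ys, xs = np.nonzero(mask)
        return ys, xs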
The basic criterion we shall employ in selecting grids is to maximize the expected quality of the super resolution result (i.e., the super resolution mosaic) at the post-processing stage. In practice, this is achieved by choosing grids St such that each frame provides complementary pixel information for the super resolution process. The grid selection process proceeds by replicating part of the super resolution mosaic creation process. In one embodiment, the criterion used to select grids is the super resolution filling factor.
Turning to
The preceding method 900 for sampling grid selection may also be further described as follows (presuming a set of four frames, H1 being the reference frame):
1. Compute the motion transformations Θt1 from each frame Ht to the reference frame (H1).
2. Choose the sampling grid for the reference frame as S1=g1.
3. Initialize an “unfilled” super resolution frame (HSR) in the coordinates of the reference frame (i.e., assuming there is no motion between HSR and H1). “Fill” the pixels in HSR corresponding to pixel positions given by grid S1.
4. For each remaining HR frame Ht (t≠1), compute the filling factor of each possible candidate grid in G. The filling factor of a candidate grid gi is defined as the number of previously unfilled pixels in HSR that are filled when gi is selected for Ht. The grid gi* that results in the highest filling factor is then selected (i.e., St=gi*) and the corresponding pixels in HSR are filled (taking into account the motion transformation Θt1).
5. If all the frames Ht in the set have been processed, terminate. Otherwise, go back to step 4.
In step 4, the filling factor of a candidate grid gi is computed as follows. First consider each grid gi∈G in turn for Ht, transform (move) the pixels given by gi to HSR using Θt1 (rounding to the nearest pixel position in HSR), and compute the filling factor by recording how many previously unfilled pixel positions in HSR are filled by the transformed pixels. Thereafter, the grid gi* that results in the highest filling factor is selected (i.e., St=gi*). Note that the selected grids St and the resulting super resolution quality may depend on the order in which the frames Ht are processed. One ordering strategy is to consider frames in increasing order of their temporal distance from the reference frame. For example, if H2 is the reference frame, then the other frames are processed in the following order: H1; H3; and H4.
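A minimal sketch of this selection loop, reusing the hypothetical grid_positions helper and candidate tiles from the earlier example, and OpenCV's perspectiveTransform to apply Θt1, might read:

    import cv2
    import numpy as np

    def warped_mosaic_indices(tile, theta_t1, height, width):
        """Flat, deduplicated mosaic indices hit by a grid after warping by theta_t1."""
        ys, xs = grid_positions(tile, height, width)
        pts = np.float32(np.stack([xs, ys], axis=1)).reshape(-1, 1, 2)
        warped = cv2.perspectiveTransform(pts, theta_t1).reshape(-1, 2)
        xw = np.rint(warped[:, 0]).astype(int)   # round to nearest mosaic pixel
        yw = np.rint(warped[:, 1]).astype(int)
        ok = (xw >= 0) & (xw < width) & (yw >= 0) & (yw < height)
        return np.unique(yw[ok] * width + xw[ok])

    def select_grid(theta_t1, candidates, filled):
        """Pick the grid with the highest filling factor and mark its pixels filled."""
        height, width = filled.shape
        best_name, best_idx, best_factor = None, None, -1
        for name, tile in candidates.items():
            idx = warped_mosaic_indices(tile, theta_t1, height, width)
            factor = np.count_nonzero(~filled.flat[idx])  # previously unfilled pixels
            if factor > best_factor:
                best_name, best_idx, best_factor = name, idx, factor
        filled.flat[best_idx] = True
        return best_name

The reference frame's grid is committed first (S1=g1), and select_grid is then called for the remaining frames in the chosen processing order.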
Variations of the filling factor measure or entirely different metrics involving super resolution quality may be used as criteria for grid selection. For example, instead of declaring each pixel in HSR as being filled or unfilled, we could keep track of the number of grid pixels mapped to each pixel therein. Thereafter, the filling factor could be redefined as a measure of incremental information wherein grids that have greater incremental contribution to HSR score higher. Another criterion for grid selection could involve completely replicating the super resolution process (using the previously selected grids S1-St−1 and the current candidate grids for St) and choosing a grid St that results in the highest SR quality, e.g., based on PSNR with respect to the reference frame.
Down-Sampling High Resolution to Low Resolution
After the grid selection process, each high resolution frame Ht has a corresponding sampling grid St. Depending on the nature of St, Ht is down-sampled to the low resolution frame Lt as follows:
For uniform sampling grids with various other structures, a suitable packing strategy may be devised so as to form a rectangular low resolution frame using the pixels sampled from the high resolution frame.
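For the simplest polyphase grids, packing reduces to strided slicing. A brief sketch under that assumption (the remaining candidate grid structures would each need their own packing rule, as noted above):

    def down_sample(frame, dy, dx, M=2):
        """Pack the pixel at offset (dy, dx) of every MxM tile into a rectangular LR frame."""
        return frame[dy::M, dx::M]

    # For example, a grid keeping the top-left pixel of every 2x2 tile:
    # lr = down_sample(hr, 0, 0)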
The low resolution frames thus created are then compressed using a video encoder. The side information, including the estimated motion transformation parameters (θ21, θ31, θ41) and the selected sampling grids (S1, S2, S3, S4), is transmitted as metadata. Note here that it is sufficient to send the sampling grid indices instead of the grids themselves (i.e., if St=gi, send i). The grids are then known from a lookup table at the post-processing stage.
Post-Processing Stage of Sampling-Based SR
At the post-processing stage, we use the decoded low resolution frames and the metadata to reconstruct the corresponding high resolution frames, a process known as super resolution (SR). Turning to
Suppose that we have a set of decoded LR frames L̂t corresponding to the set of high resolution frames Ht (t=1, 2, 3, 4) at the pre-processing stage (
1. Creation of super resolution mosaic from low resolution frames: In this step, a high-resolution “SR” mosaic image ĤSR is created using the pixels from the set of decoded low resolution frames and the side-information. This will serve as a reference image from which the HR frames will be reconstructed. In further detail, a portion of each reconstructed HR frame will come from the SR mosaic and the remaining portions will be spatially interpolated from the corresponding LR frame pixels.
2. Reconstruction of high resolution frames: Each high resolution frame Ĥt in the set is reconstructed using the super resolution mosaic image ĤSR and the low resolution frame L̂t, with the side-information guiding the process.
These steps are further explained herein below.
Creation of Super Resolution Mosaic from Low Resolution Frames
In this step, a high-resolution super resolution mosaic image ĤSR is constructed using the set of decoded low resolution frames L̂t (t=1, 2, 3, 4) and the associated metadata, which comprises the grids St used to create the low resolution frames Lt and the transformations Θt1 from each frame to the reference frame in the set (Frame at t=1 in
1. For the time being, consider ĤSR to be a continuous 2-D pixel space wherein non-integer pixel positions may exist, e.g., ĤSR (1.44, 2.35)=128.
2. Fill in the pixel positions in ĤSR given by the transformed grid positions Θt1(St) with the corresponding pixel values in the decoded low resolution frame L̂t. Do this for each decoded low resolution frame in the set (t=1, 2, 3, 4). Note that Θ11=I (identity transformation) since there is no motion between ĤSR and Ĥ1.
3. Finally, an image ĤSR is constructed by interpolating the pixel values at all integer pixel positions where sufficient (e.g., as determined using a threshold) data is available, from the surrounding pixel values at each of those positions. A variety of (non-uniform) spatial interpolation methods are available for this operation. These methods take a set of pixel positions and corresponding values, and output interpolated values at any number of other positions. The griddata function of MATLAB can be used to carry out this interpolation.
The result of the above steps is the super resolution mosaic image ĤSR. In addition, a validity map may be computed to determine which pixels of ĤSR include reliable information so that only these pixels are used in the reconstruction of high resolution frames. A measure of validity may be computed at each pixel of the mosaic image based on the samples (e.g., the number or density of the samples) in a neighborhood around the pixel. Thereafter, a pixel in the mosaic is used in the reconstruction process only if its validity value is high enough (e.g., above a given threshold).
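In Python, scipy.interpolate.griddata plays the role of MATLAB's griddata. The sketch below builds the mosaic and a validity map from the scattered warped samples; linear interpolation and the simple sample-count validity rule with a tunable threshold are assumptions for illustration:

    import numpy as np
    from scipy.interpolate import griddata

    def build_mosaic(sample_xy, sample_vals, height, width, min_count=1):
        """Interpolate scattered samples onto the integer pixel grid of the mosaic.

        sample_xy:   (N, 2) array of (x, y) mosaic positions, possibly non-integer,
                     gathered from all warped decoded LR frames in the set.
        sample_vals: (N,) array of the corresponding pixel values.
        """
        gx, gy = np.meshgrid(np.arange(width), np.arange(height))
        mosaic = griddata(sample_xy, sample_vals, (gx, gy), method="linear")

        # Validity map: count samples landing near each integer position and
        # keep only pixels supported by at least min_count samples.
        counts = np.zeros((height, width))
        xi = np.clip(np.rint(sample_xy[:, 0]).astype(int), 0, width - 1)
        yi = np.clip(np.rint(sample_xy[:, 1]).astype(int), 0, height - 1)
        np.add.at(counts, (yi, xi), 1)
        valid = (counts >= min_count) & ~np.isnan(mosaic)
        return mosaic, valid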
Reconstruction of High Resolution Frames
Now each high resolution frame Ĥt (t=1, 2, 3, 4) is reconstructed as follows:
1. For the time being, consider Ĥt to be a continuous 2-D pixel space wherein non-integer pixel positions may exist. Fill in the pixel positions in Ĥt given by the grid St with the corresponding pixel values in L̂t.
2. Transform the pixel positions in ĤSR using the motion transformation Θ1t. Note that Θ1t is the inverse transformation of Θt1. If an integer pixel position x in ĤSR maps to a position y in the Ĥt space after transformation [i.e., y=Θ1t(x)], then fill in y with the corresponding value in ĤSR, i.e., Ĥt (y)=ĤSR(x).
3. Finally, the high resolution frame Ĥt is reconstructed by interpolating the pixel values at all integer pixel positions in the frame from the surrounding pixel values at each of those positions. This is handled using a spatial interpolation method as described in the previous section (step 3). Pixels outside the frame boundaries are not determined.
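A compact sketch of this reconstruction, assuming the build_mosaic routine above for the final interpolation step and a plain NumPy application of the 3x3 transformation Θ1t (the helper names are illustrative, not prescribed):

    import numpy as np

    def apply_homography(theta, xy):
        """Map (x, y) positions through a 3x3 projective transformation."""
        homog = np.hstack([xy, np.ones((xy.shape[0], 1))]) @ theta.T
        return homog[:, :2] / homog[:, 2:3]

    def reconstruct_frame(lr, grid_ys, grid_xs, mosaic, valid, theta_1t, height, width):
        """Gather scattered samples for one HR frame, then interpolate (steps 1-3)."""
        # Step 1: grid pixels come straight from the decoded LR frame; the LR
        # pixels are assumed to be in row-major order matching (grid_ys, grid_xs).
        xy = [np.stack([grid_xs, grid_ys], axis=1).astype(float)]
        vals = [lr.ravel()]
        # Step 2: reliable mosaic pixels are moved into this frame's coordinates.
        ys, xs = np.nonzero(valid)
        xy.append(apply_homography(theta_1t, np.stack([xs, ys], axis=1).astype(float)))
        vals.append(mosaic[ys, xs])
        # Step 3: interpolate all integer positions from the scattered samples.
        hr, _ = build_mosaic(np.vstack(xy), np.concatenate(vals), height, width)
        return hr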
Handling Foreground Objects
So far, we have assumed that the motion between frames is fully described by a global motion model, i.e., all pixels adhere to this motion model. We now present a strategy to handle foreground objects. Foreground objects are defined as objects (or regions) that do not follow the global motion between frames. In other words, these objects have motions that are different from the global motion between frames. Turning to
Suppose we have obtained a binary mask Ft (as shown in
In addition to the above, other criteria using the foreground information may be used to improve the quality of the result.
Foreground Mask Estimation
It is a difficult problem to extract a clean and reliable foreground mask from frames with independently moving regions. Errors in global motion estimation along with the noise in the pixel values complicate the process. Furthermore, there is also the issue of compactly representing and transmitting the foreground information as metadata to the decoder.
One method for extracting foreground masks Ft for each high resolution frame Ht is now described. This takes place in the pre-processing stage where the high resolution frames are available. The following are the steps in the process.
1. For frame H1, the mask F1 is filled with zeros. In other words, all pixels are considered as background.
2. To extract Ft, the frame Ht is compared with H1t=Θ1t(H1), i.e., H1 is transformed to the coordinates of Ht. A normalized correlation metric Nt1(x) is computed between each pixel x in Ht and the corresponding pixel in H1t, considering a small neighborhood around the pixels. If there is no corresponding pixel in H1t (i.e., Θt1(x) lies outside the boundaries of H1), then Ft(x) is set to 1. Otherwise, if Nt1(x)>T, where T is a chosen threshold, then Ft(x)=0. Otherwise, Ft(x)=1.
Other methods including variations of the above may be used instead.
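As an illustrative sketch of step 2, the following Python routine computes the per-pixel normalized correlation over a small window. The window size and threshold T are tunable assumptions, and NaN is assumed to mark positions of the warped reference H1t that received no pixel:

    import numpy as np

    def foreground_mask(frame_t, warped_ref, T=0.9, win=5):
        """Binary mask Ft: 1 = foreground, 0 = background."""
        h, w = frame_t.shape
        r = win // 2
        mask = np.ones((h, w), dtype=np.uint8)  # default to foreground
        for y in range(r, h - r):
            for x in range(r, w - r):
                b = warped_ref[y - r:y + r + 1, x - r:x + r + 1].astype(float)
                if np.isnan(b).any():
                    continue  # no corresponding pixel in H1t: stays foreground
                a = frame_t[y - r:y + r + 1, x - r:x + r + 1].astype(float)
                a -= a.mean()
                b -= b.mean()
                denom = np.sqrt((a * a).sum() * (b * b).sum())
                if denom > 0 and (a * b).sum() / denom > T:
                    mask[y, x] = 0  # correlates with the background model
        return mask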
If the masks are computed at the pre-processing stage, they have to be transmitted as side-information to the receiver. It may not be necessary to transmit a high resolution version of the foreground masks. The masks may be down-sampled to low resolution using the same strategy as used to create the low resolution frames Lt from Ht, and then up-sampled at the post-processing stage. The masks may also be compressed (e.g., using ZIP, the MPEG-4 AVC Standard, and/or any other data compression scheme) prior to transmission. Alternatively, transmitting the masks may be entirely avoided by computing them at the receiver side using the decoded low resolution frames and the metadata. However, it is a difficult problem to compute a reliable mask at the receiver.
We note the following possible variations that may be employed in one or more embodiments of the present principles and still remain within the scope of the present invention, as would be apparent to one skilled in the art.
A description will now be given of some of the many attendant advantages/features of the present invention, some of which have been mentioned above. For example, one advantage/feature is an apparatus having a down-sampler and metadata generator and at least one encoder. The down-sampler and metadata generator is for receiving high resolution pictures and generating low resolution pictures and metadata therefrom. The metadata is for guiding post-decoding post-processing of the low resolution pictures and the metadata. The at least one encoder (152) is for encoding the low resolution pictures and the metadata.
Another advantage/feature is the apparatus having the down-sampler and metadata generator and the at least one encoder as described above, wherein the metadata includes motion transformation information and sampling grid information.
Yet another advantage/feature is the apparatus having the down-sampler and metadata generator and the at least one encoder wherein the metadata includes motion transformation information and sampling grid information as described above, wherein the motion transformation information comprises global motion transformation information relating to global motion between two or more of the high resolution pictures.
Still another advantage/feature is the apparatus having the down-sampler and metadata generator and the at least one encoder wherein the metadata includes motion transformation information and sampling grid information as described above, wherein the sampling grid information comprises sampling grid indices for indicating each respective one of a plurality of down-sampling grids used to generate the low resolution pictures from the high resolution pictures by down-sampling.
A further advantage/feature is the apparatus having the down-sampler and metadata generator and the at least one encoder as described above, wherein the high resolution pictures include at least one reference picture and one or more non-reference pictures, and the down-sampler and metadata generator generates the low resolution pictures by estimating motion from a reference picture to each of the one or more non-reference pictures, selecting one or more down-sampling grids from a plurality of candidate down-sampling grids for use in down-sampling the high resolution pictures based on the motion information, and down-sampling the high resolution pictures using the one or more down-sampling grids.
Moreover, another advantage/feature is the apparatus having the down-sampler and metadata generator and the at least one encoder wherein the high resolution pictures include at least one reference picture and one or more non-reference pictures, and the down-sampler and metadata generator generates the low resolution pictures by estimating motion from a reference picture to each of the one or more non-reference pictures, selecting one or more down-sampling grids from a plurality of candidate down-sampling grids for use in down-sampling the high resolution pictures based on the motion information, and down-sampling the high resolution pictures using the one or more down-sampling grids as described above, wherein the one or more down-sampling grids are selected based on the motion information such that each of the high resolution pictures, when down-sampled using the one or more down-sampling grids, provides complementary pixel information for the post-decoding post-processing of the low resolution pictures.
Further, another advantage/feature is the apparatus having the down-sampler and metadata generator and the at least one encoder wherein the one or more down-sampling grids are selected based on the motion information such that each of the high resolution pictures, when down-sampled using the one or more down-sampling grids, provides complementary pixel information for the post-decoding post-processing of the low resolution pictures as described above, wherein the grids are further selected based upon a filling factor that indicates a number of previously unfilled pixels in a super resolution picture generated using a particular one of the one or more down-sampling grids, the super resolution picture corresponding to an output provided by the post-decoding post-processing of the low resolution pictures and the metadata.
Also, another advantage/feature is the apparatus having the down-sampler and metadata generator and the at least one encoder wherein the one or more down-sampling grids are selected based on the motion information such that each of the high resolution pictures, when down-sampled using the one or more down-sampling grids, provides complementary pixel information for the post-decoding post-processing of the low resolution pictures as described above, wherein the grids are further selected based upon a distortion measure.
Additionally, another advantage/feature is the apparatus having the down-sampler and metadata generator and the at least one encoder wherein the high resolution pictures include at least one reference picture and one or more non-reference pictures, and the down-sampler and metadata generator generates the low resolution pictures by estimating motion from a reference picture to each of the one or more non-reference pictures, selecting one or more down-sampling grids from a plurality of candidate down-sampling grids for use in down-sampling the high resolution pictures based on the motion information, and down-sampling the high resolution pictures using the one or more down-sampling grids as described above, wherein different ones of the plurality of down-sampling grids are used to down-sample different portions of a particular one of at least one of the high resolution pictures.
Moreover, another advantage/feature is the apparatus having the down-sampler and metadata generator and the at least one encoder wherein the high resolution pictures include at least one reference picture and one or more non-reference pictures, and the down-sampler and metadata generator generates the low resolution pictures by estimating motion from a reference picture to each of the one or more non-reference pictures, selecting one or more down-sampling grids from a plurality of candidate down-sampling grids for use in down-sampling the high resolution pictures based on the motion information, and down-sampling the high resolution pictures using the one or more down-sampling grids as described above, wherein a respective binary mask is constructed for each of the high resolution pictures, the binary mask indicating respective locations of foreground pixels in the high resolution pictures.
These and other features and advantages of the present principles may be readily ascertained by one of ordinary skill in the pertinent art based on the teachings herein. It is to be understood that the teachings of the present principles may be implemented in various forms of hardware, software, firmware, special purpose processors, or combinations thereof.
Most preferably, the teachings of the present principles are implemented as a combination of hardware and software. Moreover, the software may be implemented as an application program tangibly embodied on a program storage unit. The application program may be uploaded to, and executed by, a machine comprising any suitable architecture. Preferably, the machine is implemented on a computer platform having hardware such as one or more central processing units (“CPU”), a random access memory (“RAM”), and input/output (“I/O”) interfaces. The computer platform may also include an operating system and microinstruction code. The various processes and functions described herein may be either part of the microinstruction code or part of the application program, or any combination thereof, which may be executed by a CPU. In addition, various other peripheral units may be connected to the computer platform such as an additional data storage unit and a printing unit.
It is to be further understood that, because some of the constituent system components and methods depicted in the accompanying drawings are preferably implemented in software, the actual connections between the system components or the process function blocks may differ depending upon the manner in which the present principles are programmed. Given the teachings herein, one of ordinary skill in the pertinent art will be able to contemplate these and similar implementations or configurations of the present principles.
Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the present principles are not limited to those precise embodiments, and that various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the scope or spirit of the present principles. All such changes and modifications are intended to be included within the scope of the present principles as set forth in the appended claims.
This application claims the benefit, under 35 U.S.C. §365, of International Application PCT/US2011/000107, filed Jan. 20, 2011, which was published in accordance with PCT Article 21(2) on Jul. 28, 2011, in English, and which claims the benefit of United States Provisional Patent Application Ser. No. 61/297,320, filed on Jan. 22, 2010, in English, which are incorporated by reference in their respective entireties.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US2011/000107 | 1/20/2011 | WO | 00 | 7/20/2012 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2011/090790 | 7/28/2011 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
845751 | Brenzinger | Mar 1907 | A |
5446806 | Ran et al. | Aug 1995 | A |
5537155 | O'Connell et al. | Jul 1996 | A |
5557684 | Wang et al. | Sep 1996 | A |
5754236 | Lee | May 1998 | A |
5764374 | Seroussi et al. | Jun 1998 | A |
5768434 | Ran | Jun 1998 | A |
5784491 | Koga | Jul 1998 | A |
5822465 | Normile et al. | Oct 1998 | A |
5862342 | Winter et al. | Jan 1999 | A |
6043838 | Chen | Mar 2000 | A |
6173089 | Van Lerberghe | Jan 2001 | B1 |
6278446 | Liou et al. | Aug 2001 | B1 |
6397166 | Leung et al. | May 2002 | B1 |
6526183 | Bonnet et al. | Feb 2003 | B1 |
6795578 | Kotani et al. | Sep 2004 | B1 |
6798834 | Murakami et al. | Sep 2004 | B1 |
7386049 | Garrido et al. | Jun 2008 | B2 |
7433526 | Apostolopoulos et al. | Oct 2008 | B2 |
7447337 | Zhang et al. | Nov 2008 | B2 |
7623706 | Maurer | Nov 2009 | B1 |
7643690 | Suzuki et al. | Jan 2010 | B2 |
7671894 | Yea et al. | Mar 2010 | B2 |
7715658 | Cho et al. | May 2010 | B2 |
8340463 | Cho et al. | Dec 2012 | B1 |
8831107 | Zheng et al. | Sep 2014 | B2 |
9031130 | Suzuki et al. | May 2015 | B2 |
20010055340 | Kim et al. | Dec 2001 | A1 |
20020009230 | Sun et al. | Jan 2002 | A1 |
20020036705 | Lee et al. | Mar 2002 | A1 |
20020172434 | Freeman et al. | Nov 2002 | A1 |
20030005258 | Modha et al. | Jan 2003 | A1 |
20030021343 | Trovato | Jan 2003 | A1 |
20030058943 | Zakhor et al. | Mar 2003 | A1 |
20040001705 | Soupliotis et al. | Jan 2004 | A1 |
20040017852 | Garrido et al. | Jan 2004 | A1 |
20040170330 | Fogg | Sep 2004 | A1 |
20040213345 | Holcomb et al. | Oct 2004 | A1 |
20040218834 | Bishop et al. | Nov 2004 | A1 |
20040258148 | Kerbiriou et al. | Dec 2004 | A1 |
20050015259 | Thumpudi et al. | Jan 2005 | A1 |
20050019000 | Lim et al. | Jan 2005 | A1 |
20050225553 | Chi | Oct 2005 | A1 |
20050243921 | Au et al. | Nov 2005 | A1 |
20060013303 | Nguyen et al. | Jan 2006 | A1 |
20060039617 | Makai et al. | Feb 2006 | A1 |
20060088191 | Zhang et al. | Apr 2006 | A1 |
20060126960 | Zhou et al. | Jun 2006 | A1 |
20060239345 | Taubman | Oct 2006 | A1 |
20060245502 | Cheng et al. | Nov 2006 | A1 |
20060269149 | Song | Nov 2006 | A1 |
20070014354 | Murakami et al. | Jan 2007 | A1 |
20070041663 | Cho et al. | Feb 2007 | A1 |
20070118376 | Mukerjee | May 2007 | A1 |
20070223808 | Kerr | Sep 2007 | A1 |
20070223825 | Ye et al. | Sep 2007 | A1 |
20070248272 | Sun et al. | Oct 2007 | A1 |
20080107346 | Zhang et al. | May 2008 | A1 |
20080117975 | Sasai et al. | May 2008 | A1 |
20080131000 | Tsai et al. | Jun 2008 | A1 |
20080152243 | Min et al. | Jun 2008 | A1 |
20080159401 | Lee et al. | Jul 2008 | A1 |
20080172379 | Uehara et al. | Jul 2008 | A1 |
20080187305 | Raskar et al. | Aug 2008 | A1 |
20090002379 | Baeza et al. | Jan 2009 | A1 |
20090003443 | Guo et al. | Jan 2009 | A1 |
20090041367 | Mansour | Feb 2009 | A1 |
20090080804 | Hamada et al. | Mar 2009 | A1 |
20090097564 | Chen et al. | Apr 2009 | A1 |
20090097756 | Kato | Apr 2009 | A1 |
20090116759 | Suzuki et al. | May 2009 | A1 |
20090175538 | Bronstein et al. | Jul 2009 | A1 |
20090180538 | Visharam et al. | Jul 2009 | A1 |
20090185747 | Segall et al. | Jul 2009 | A1 |
20090196350 | Xiong | Aug 2009 | A1 |
20090232215 | Park et al. | Sep 2009 | A1 |
20090245587 | Holcomb et al. | Oct 2009 | A1 |
20090252431 | Lu et al. | Oct 2009 | A1 |
20090274377 | Kweon et al. | Nov 2009 | A1 |
20100046845 | Wedi et al. | Feb 2010 | A1 |
20100054338 | Suzuki et al. | Mar 2010 | A1 |
20100074549 | Zhang et al. | Mar 2010 | A1 |
20100091846 | Suzuki et al. | Apr 2010 | A1 |
20100104184 | Bronstein et al. | Apr 2010 | A1 |
20100150394 | Bloom et al. | Jun 2010 | A1 |
20100196721 | Ogawa | Aug 2010 | A1 |
20100208814 | Xiong et al. | Aug 2010 | A1 |
20100272184 | Fishbain et al. | Oct 2010 | A1 |
20110007800 | Zheng et al. | Jan 2011 | A1 |
20110047163 | Chechik et al. | Feb 2011 | A1 |
20110142330 | Min et al. | Jun 2011 | A1 |
20110170615 | Vo et al. | Jul 2011 | A1 |
20110210960 | Touma et al. | Sep 2011 | A1 |
20110261886 | Suzuki et al. | Oct 2011 | A1 |
20120106862 | Sato | May 2012 | A1 |
20120155766 | Zhang et al. | Jun 2012 | A1 |
20120201475 | Carmel et al. | Aug 2012 | A1 |
20120320983 | Zheng et al. | Dec 2012 | A1 |
20130163676 | Zhang et al. | Jun 2013 | A1 |
20130163679 | Zhang et al. | Jun 2013 | A1 |
20130170558 | Zhang | Jul 2013 | A1 |
20130170746 | Zhang et al. | Jul 2013 | A1 |
20140036054 | Zouridakis | Feb 2014 | A1 |
20140056518 | Yano et al. | Feb 2014 | A1 |
Number | Date | Country |
---|---|---|
1128097 | Jul 1996 | CN |
1276946 | Dec 2000 | CN |
1495636 | May 2004 | CN |
1777287 | May 2006 | CN |
1863272 | Nov 2006 | CN |
101048799 | Oct 2007 | CN |
101389021 | Mar 2009 | CN |
101459842 | Jun 2009 | CN |
101551903 | Oct 2009 | CN |
101556690 | Oct 2009 | CN |
1401211 | Mar 2004 | EP |
1659532 | May 2006 | EP |
2941581 | Jul 2010 | FR |
3027670 | Feb 1991 | JP |
7231444 | Aug 1995 | JP |
H7-222145 | Aug 1995 | JP |
8502865 | Mar 1996 | JP |
H8-336134 | Dec 1996 | JP |
2000-215318 | Aug 2000 | JP |
2003018398 | Jan 2003 | JP |
2004222218 | Aug 2004 | JP |
2004266794 | Sep 2004 | JP |
200520761 | Jan 2005 | JP |
2006203744 | Aug 2006 | JP |
2006519533 | Aug 2006 | JP |
2008148119 | Jun 2008 | JP |
2008289005 | Nov 2008 | JP |
200977189 | Apr 2009 | JP |
2009239686 | Oct 2009 | JP |
2009267710 | Nov 2009 | JP |
2010514325 | Apr 2010 | JP |
2011501542 | Jan 2011 | JP |
2013528309 | Jul 2013 | JP |
0169662 | Oct 1998 | KR |
WO9406099 | Mar 1994 | WO |
WO9819450 | Jul 1998 | WO |
WO03084238 | Oct 2003 | WO |
WO03102868 | Dec 2003 | WO |
WO2005043882 | May 2005 | WO |
WO2006025339 | Mar 2006 | WO |
WO2007111966 | Oct 2007 | WO |
WO2008066025 | Jun 2008 | WO |
WO2009052742 | Apr 2009 | WO |
WO2009087641 | Jul 2009 | WO |
WO2009091080 | Jul 2009 | WO |
WO2009094036 | Jul 2009 | WO |
WO2009157904 | Dec 2009 | WO |
WO2010033151 | Mar 2010 | WO |
WO2011090798 | Jul 2011 | WO |
WO2011154127 | Dec 2011 | WO |
Entry |
---|
Smolic, A. et al., “Improved Video Coding Using Long-Term Global Motion Compensation”, Proceedings of SPIE, SPIE, USA, vol. 5308, No. 1, Jan. 22, 2004, pp. 343-354, XP008046986. |
Park, S.C. et al., “Super-Resolution Image Reconstruction: A Technical Overview”, IEEE Signal Processing Magazine, IEEE Service Center, Piscataway, NJ, US, vol. 20, No. 3, May 1, 2003, pp. 21-36, XP011097476. |
Segall, C. A. et al., “High-Resolution Images from Low-Resolution Compressed Video”, IEEE Signal Processing Magazine, IEEE Service Center, Piscataway, NJ, US, vol. 20, No. 3, May 1, 2003, pp. 37-48, XP011097477. |
Zhu, C. et al., “Video Coding With Spatio-Temporal Texture Synthesis”, IEEE International Conference on Multimedia and Expo (ICME), 2007, University of Science and Technology of China, Hefei, 230027, China, Microsoft Research Asia, Beijing, 100080, China, pp. 112-115. |
Zhu, C. et al., “Video Coding With Spatio-Temporal Texture Synthesis and Edge-Based Inpainting”, IEEE International Conference on Multimedia and Expo (ICME), 2008, University of Science and Technology of China, Hefei, 230027, China, Microsoft Research Asia, Beijing, 100080, China, pp. 813-816. |
Vo, D. T. et al., “Data Pruning-Based Compression Using High Order Edge-Directed Interpolation”, Thomson Research Technical Report, submitted to the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009, Video Processing Laboratory, UC San Diego, CA 92092, Thomson Inc. Corporate Research, Princeton, NJ USA. |
Ben-Ezra, M. et al., “Video Super-Resolution Using Controlled Subpixel Detector Shifts”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, No. 6, Jun. 2005, pp. 977-987. |
Ndjiki-Nya, P. et al., “A Generic and Automatic Content-Based Approach for Improved H.264/MPEG4-AVC Video Coding”, IEEE International Conference on Image Processing (ICIP), 2005, Image Processing Department, FhG Heinrich-Hertz-Institut (HHI), Berlin, Germany. |
ITU-T, H.264, “Series H: Audiovisual and Multimedia Systems, Infrastructure of audiovisual services—Coding of moving video”, “Advanced Video Coding for Generic Audiovisual Services”, ITU-T Recommendation H.264, Mar. 2005, 343 pages. |
Barreto, D. et al., “Region-Based Super-Resolution for Compression”, Multidimensional Systems and Signal Processing, Special Issue on papers presented at the I International Conference in Super Resolution (Hong Kong, 2006), vol. 18, No. 2-3, pp. 59-81, Sep. 2007. |
Sawhney, H. et al., “Hybrid Stereo Camera: An IBR Approach for Synthesis of Very High Resolution Stereoscopic Image Sequences”, Proc. SIGGRAPH, pp. 451-460, 2001, Vision Technologies Lab., Sarnoff Corp. |
Torr, P. et al., “MLESAC: A New Robust Estimator with Application to Estimating Image Geometry”, Journal of Computer Vision and Image Understanding, vol. 78, No. 1, 2000, pp. 138-156. |
Fischler, M. et al., “Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography”, Communications of the ACM, Jun. 1981, vol. 24, No. 6, pp. 381-395. |
Black, M. et al., “The Robust Estimation of Multiple Motions: Parametric and Piecewise-Smooth Flow Fields”, Computer Vision and Image Understanding, vol. 63, No. 1, 1996, pp. 75-104. |
Lowe, D., “Distinctive Image Features from Scale-Invariant Keypoints”, International Journal of Computer Vision, vol. 60, No. 2, 2004, pp. 91-110. |
PCT International Search Report Mailed: Apr. 20, 2011. |
Li et al., “Example-Based Image Super-Resolution with Class-Specific Predictors”, Journal of Visual Communication and Image Representation, vol. 20, No. 5, Jul. 1, 2009, pp. 312-322. |
Lee et al., “Robust Frame Synchronization for Low Signal-to-Noise Ratio Channels Using Energy-Corrected Differential Correlation”, EURASIP Journal on Wireless Communications and Networking, vol. 2009 (2009), Article ID 345989, online May 17, 2010, 8 pages. |
Cheng et al., “Reduced Resolution Residual Coding for H.264-based Compression System,” Proceedings of the 2006 IEEE Int'l. Symposium on Circuits and Systems (ISCAS 2006), May 21, 2006, pp. 3486-3489. |
Moffat et al., “Chapter 3. Static Codes,” Compression and Coding Algorithms, Feb. 2002, pp. 29-50. |
Zhang et al., “A Pattern-based Lossy Compression Scheme for Document Images,” Electronic Publishing, vol. 8, No. 2-3, Sep. 24, 1995, pp. 221-233. |
Bishop et al., “Super-resolution Enhancement of Video,” Proceedings of the 9th Int'l. Workshop on Artificial Intelligence and Statistics, Jan. 3, 2003, pp. 1-8, Society for Artificial Intelligence and Statistics, Key West, Florida. |
Bertalmio et al., “Image Inpainting”, Proceedings of SIGGRAPH 2000, New Orleans, USA, Jul. 2000, pp. 1-8. |
Bhagavathy et al., “A Data Pruning Approach for Video Compression Using Motion-Guided Down-Sampling and Super-Resolution”, submitted to ICIP 2010, pp. 1-4. |
Dorr et al., “Clustering Sequences by Overlap”, International Journal Data Mining and Bioinformatics, vol. 3, No. 3, 2009, pp. 260-279. |
Dumitras et al., “An Encoder-Decoder Texture Replacement Method with Application to Content-Based Movie Coding”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, No. 6, Jun. 2004, pp. 825-840. |
Dumitras et al., “A Texture Replacement Method at the Encoder for Bit-Rate Reduction of Compressed Video”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, No. 2, Feb. 2003, pp. 163-175. |
Freeman et al., “Example-based Super-Resolution”, IEEE Computer Graphics and Applications, Mar./Apr. 2002, pp. 56-65. |
Han et al., “Rank-based Image Transformation for Entropy Coding Efficiently”, Proceedings of the Fourth Annual ACIS International Conference on Computer and Information Science (ICIS'05), IEEE 2005. |
Symes, “Digital Video Compression,” McGraw-Hill, ISBN 0-07-142487, pp. 116-121 and 242-243. |
Komodakis et al., “Image Completion Using Efficient Belief Propagation Via Priority Scheduling and Dynamic Pruning”, IEEE Transactions on Image Processing, vol. 16, No. 11, Nov. 1, 2007, pp. 2649-2661. |
Krutz et al., Windowed Image Registration for Robust Mosaicing of Scenes with Large Background Occlusions, ICIP 2006, vol 1-7, IEEE, 2006, pp. 353-356. |
Liu et al., “Intra Prediction via Edge-Based Inpainting”, IEEE 2008 Data Compression Conference, Mar. 25-27, 2008, pp. 282-291. |
Porikli et al., “Compressed Domain Video Object Segmentation”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 20, No. 1, Jan. 2010, pp. 1-14. |
Schuster et al., “An Optimal Polygonal Boundary Encoding Scheme in the Rate Distortion Sense”, IEEE Transactions on Image Processing, vol. 7, No. 1, Jan. 1998, pp. 13-26. |
Sermadevi et al., “Efficient Bit Allocation for Dependent Video Coding”, Proceedings of the Data Compression Conference (DCC'04), IEEE, 2004. |
Shen et al., “Optimal Pruning Quad-Tree Block-Based Binary Shape Coding”, IEEE Proceedings 2007, International Conference on Image Processing, ICIP, 2007, pp. V1-437-V1-440. |
Sun et al., “Classified Patch Learning for Spatially Scalable Video Coding”, Proceedings of the 16th IEEE International Conference on Image Processing, Nov. 7, 2009, pp. 2301-2304. |
Vu et al., “Efficient Pruning Schemes for Distance-Based Outlier Detection”, Springer Verlag, Proceedings European Conference 2009, pp. 160-175. |
Wiegand et al., “Overview of the H.264/AVC Video Coding Standard”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, No. 7, Jul. 2003, pp. 560-576. |
Wu et al., Image Compression by Visual Pattern Vector Quantization (VPVQ), Proceedings of the 2008 Data Compression Conference, Mar. 25, 2008, pp. 123-131. |
Xiong et al., “Block-Based Image Compression with Parameter-Assistant Inpainting”, IEEE Transactions on Image Processing, vol. 19, No. 6, Jun. 2010, pp. 1651-1657. |
Xu et al., Probability Updating-based Adaptive Hybrid Coding (PUAHC), ISCAS 2006, IEEE 2006, pp. 361-364. |
Yap et al., “Unsupervised Texture Segmentation Using Dominant Image Modulations”, IEEE Conference Recordings of the 34th Asilomar Conference on Signals, Systems and Computers, IEEE 2000, pp. 911-915. |
Zhang et al., “Segmentation for Extended Target in Complex Backgrounds Based on Clustering and Fractal”, Optics and Precision Engineering, vol. 17, No. 7, Jul. 2009, pp. 1665-1671. |
Zheng et al., “Intra Prediction Using Template Matching with Adaptive Illumination Compensation”, ICIP 2008, IEEE 2008, pp. 125-128. |
Zhang et al., “Method and Apparatus for Data Pruning for Video Compression Using Example-Based Super-Resolution” Invention Disclosure, Apr. 2010. |
Zhang et al., “Example-Based Data Pruning for Improving Video Compression Efficiency”, Invention Disclosure, Apr. 2010. |
Zhang et al, “Video Decoding Using Blocked-Based Mixed-Resolution”, Invention Disclosure, Mar. 2010. |
Zhang et al, “Video Decoding Using Block-based Mixed-Resolution Data Pruning”, Invention Disclosure, Mar. 2010. |
International Search Report for Corresponding International Appln. PCT/US2011/050921 dated Jan. 4, 2012. |
International Search Report for Corresponding International Appln. PCT/US2011/050923 dated Jan. 5, 2012. |
International Search Report for Corresponding Appln. PCT/US2011/050925 dated Jan. 6, 2012. |
International Search Report for Corresponding Appln. PCT/US2011/050915 dated Jul. 30, 2012. |
International Search Report for Corresponding Appln. PCT/US2011/050922 dated Jan. 4, 2012. |
International Search Report for International Application PCT/US11/050924 dated Jan. 5, 2012. |
US Office Action for Related U.S. Appl. No. 13/821,078 Dated Jun. 5, 2015. |
US Office Action for Related U.S. Appl. No. 13/821,436 Dated Jun. 18, 2015. |
US Office Action for Related U.S. Appl. No. 13/821,130 Dated Jun. 16, 2015. |
US Office Action for Related U.S. Appl. No. 13/821,393 Dated Jul. 10, 2015. |
ISR for related International Application No. PCT/US2011/000107 dated Apr. 20, 2011. |
ISR for related International Application No. PCT/US2011/050917 dated Jan. 5, 2012. |
Non-Final Office Action for related U.S. Appl. No. 13/820,901 dated May 5, 2015. |
ISR for related International Application No. PCT/US2011/050913 dated Jul. 30, 2012. |
Non-Final Office Action for related U.S. Appl. No. 13/522,024 dated Mar. 27, 2015. |
ISR for related International Application No. PCT/US2011/000117 dated Apr. 29, 2011. |
ISR for related International Patent Application No. PCT/US2011/050918 dated Jan. 5, 2012. |
ISR for related International Application No. PCT/US2011/050920 dated Jan. 4, 2012. |
ISR for related International Application PCT/US2011/050919 dated Jan. 4, 2012. |
Non-Final US Office Action for related U.S. Appl. No. 13/821,357 dated Aug. 13, 2015. |
Non-Final Office Action for related U.S. Appl. No. 13/821,257 dated Aug. 19, 2015. |
Non-Final Office Action for related U.S. Appl. No. 13/821,283 dated Aug. 17, 2015. |
Non-Final Office Action for related U.S. Appl. No. 13/821,083 dated Jul. 16, 2015. |
Non-Final Office Action for related U.S. Appl. No. 13/821,270 dated Jul. 16, 2015. |
US Office Action for Related U.S. Appl. No. 13/821,436 dated Nov. 25, 2015. |
CN Search Report for Related CN Application No. 2011800432758 dated Sep. 23, 2015. |
CN Search Report for Related CN Application No. 201180006921.3 dated Nov. 21, 2014. |
CN Search Report for Related CN Application No. 2011800435953 dated Aug. 18, 2015. |
CN Search Report for Related CN Application No. 2011800153355 dated Nov. 22, 2014. |
CN Search Report for Related CN Application 2011800437234 dated Sep. 16, 2015. |
CN Search Report for Related CN Application 201180054419X dated Sep. 8, 2015. |
CN Search Report for Related CN Application 2011800432940 dated Jul. 28, 2015. |
CN Search Report for Related CN Application 201180053976.X dated Sep. 23, 2015. |
US Office Action for Related U.S. Appl. No. 13/820,901 dated Dec. 18, 2015. |
US Office Action for Related U.S. Appl. No. 13/821,257 dated Dec. 21, 2015. |
US Office Action for Related U.S. Appl. No. 13/821,130 dated Jan. 14, 2016. |
US Office Action for Related U.S. Appl. No. 13/821,357 dated Dec. 21, 2015. |
US Office Action for Related U.S. Appl. No. 13/821,393 dated Dec. 11, 2015. |
US Office Action for Related U.S. Appl. No. 13/821,078 dated Jan. 13, 2016. |
US Office Action for Related U.S. Appl. No. 13/821,283 dated Dec. 22, 2015. |
US Office Action for Related U.S. Appl. No. 13/821,083 dated Jan. 29, 2016. |
CN Search report for Related CN Application No. 201180054405.8 dated Nov. 30, 2015. |
US Notice of Allowance for U.S. Appl. No. 13/522,024 dated Mar. 14, 2016. |
US Notice of Allowance for U.S. Appl. No. 13/821,424 dated Mar. 14, 2016. |
Shimauchi, et al., “JPEG Based Image Compression Using Adaptive Multi Resolution Conversion,” The 17th Workshop On Circuits and Systems in Karuizawa. The Institute of Electronics, Information and Communication Engineers, pp. 147-152, Apr. 27, 2004. |
Notice of Allowance for U.S. Appl. No. 13/821,393 Dated Mar. 18, 2016. |
US Non-Final Office Action for U.S. Appl. No. 13/821,130 Dated Jul. 11, 2016. |
US Non-Final Office Action for U.S. Appl. No. 13/821,436 Dated Jul. 11, 2016. |
US Non-Final Office Action for U.S. Appl. No. 13/820,901 Dated May 18, 2016. |
US Final Office for U.S. Appl. No. 13/821,270 Dated Feb. 26, 2016. |
Number | Date | Country | |
---|---|---|---|
20120294369 A1 | Nov 2012 | US |
Number | Date | Country | |
---|---|---|---|
61297320 | Jan 2010 | US |