Super resolution techniques generate higher resolution images so that original, lower resolution image data can be displayed on higher resolution display systems. One class of such techniques is multi-frame super resolution (MFSR), which fuses several low resolution (LR) frames. MFSR can deliver rich details and make the video more stable in the temporal domain. As a byproduct, it can also reduce noise to some extent.
However, if the number of low resolution frames is relatively small, such as five or fewer, the noise reduction performance is poor, so other means of noise reduction are needed. Typically, one of two kinds of noise reduction is used. The first applies noise reduction to all of the input low resolution frames before the MFSR process. This increases the frame delay while the future low resolution frames have noise reduction applied to them.
An alternative approach applies noise reduction to the super resolution results of the MFSR process. This method requires more hardware resources because the noise reduction operates on higher resolution frames; it also requires more memory and more processing because more data is involved.
The above noise reduction techniques may be motion adaptive or motion compensated, but they are independent of the MFSR process and do not utilize any information from the super resolution process.
Image data of an original resolution is received at the image processing device, possibly at an input buffer. The original resolution may be referred to as low resolution (LR) because it is lower than the resolution of the resulting super resolution image. The image data may consist of the instant image data, as well as image data that has been stored or delayed, allowing data from three different timing intervals (previous, current, and future) to be used. The frame of image data undergoing processing is referred to here as the current frame, CF. The frames of previous image data may consist of one frame, P1, the most recent frame, or of two frames, P1 and P2, where P2 is the next most recent frame. Similarly, the image data just after the current frame being processed is referred to as the first future frame, F1, and the frame after that as the second future frame, F2. In the following, the suffix _LR indicates that the image data is at low resolution, and a prime (′) indicates that the image data has undergone noise reduction.
The image data then passes to a motion vector module 18 and a motion compensation based noise reduction module 20. The motion vectors for the current frame, at least one of the future frames, and at least one of the previous frames also pass to the motion compensation noise reduction module 20. The super resolution module then receives the noise reduced current frame (CF_LR′), the motion vectors, the image data of the noise reduced previous frames (P1_LR′, P2_LR′), and the unprocessed future frames (F1_LR, F2_LR) to produce a current frame of super resolution data.
The super resolution motion vector calculations may use one or more methods including, but not limited to: phase plane correlation (PPC), in which a sub-block calculates the motion vectors by performing phase plane correlation analysis on frames of low resolution image data; 3D recursive search; and optical flow. In the preferred embodiment, a combination of 3D recursive and PPC motion vector calculation is used.
The motion compensated noise reduction module 20 then uses the motion vectors and the image data to produce a noise reduced version of the current frame of low resolution data. This then undergoes an initial interpolation process at initial super resolution module 22, such as low angle interpolation, and scaling to generate a first estimate of the current frame of super resolution image data, CF_SR_Init. This estimate, the noise reduced current frame CF_LR′, and the four other frames of low resolution image data (P2_LR′, P1_LR′, F1_LR, F2_LR) feed into a super resolution core that then generates a current frame of super resolution data.
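A minimal sketch of this data flow, using hypothetical placeholder types and function names for modules 18, 20, 22 and the super resolution core, might look as follows; only the ordering of the steps is taken from the description above.

typedef struct Frame Frame;              /* one frame of image data (opaque here) */
typedef struct MotionField MotionField;  /* per-block motion vectors (opaque here) */

/* Placeholder declarations standing in for the modules named in the text. */
MotionField *motion_vector_module_18(const Frame *cf, const Frame *p1, const Frame *f1);
Frame *mc_noise_reduction_module_20(const Frame *cf, const Frame *p1, const Frame *f1,
                                    const MotionField *mv);
Frame *initial_super_resolution_22(const Frame *cf_nr);     /* produces CF_SR_Init */
Frame *sr_core(const Frame *cf_sr_init, const Frame *cf_nr, const Frame *p2_nr,
               const Frame *p1_nr, const Frame *f1, const Frame *f2,
               const MotionField *mv);

/* One pass of the pipeline for the current frame CF_LR. */
Frame *process_current_frame(const Frame *p2_nr, const Frame *p1_nr, const Frame *cf,
                             const Frame *f1, const Frame *f2)
{
    MotionField *mv   = motion_vector_module_18(cf, p1_nr, f1);
    Frame *cf_nr      = mc_noise_reduction_module_20(cf, p1_nr, f1, mv);   /* CF_LR' */
    Frame *cf_sr_init = initial_super_resolution_22(cf_nr);                /* CF_SR_Init */
    return sr_core(cf_sr_init, cf_nr, p2_nr, p1_nr, f1, f2, mv);           /* super resolution frame */
}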
The adjusted block alpha is calculated from the motion vector confidences blk_alpha_sad and blk_alpha_peak as follows:
diff_alpha = blk_alpha_peak − blk_alpha_sad;
if diff_alpha > 0, adj_blk_alpha = blk_alpha_sad + min(diff_alpha, Thr_Alpha_P);
if diff_alpha <= 0, adj_blk_alpha = blk_alpha_sad + max(diff_alpha, −Thr_Alpha_N).
Here, Thr_Alpha_P and Thr_Alpha_N are thresholds that control the adjustment level. The pixel alpha at 30 and 40 is decomposed from these adjusted block alphas in a 3×3 neighborhood of 4×4 blocks according to the pixel position in an 8×8 block. This 8×8 block results from extending a 4×4 block, such as block e, as shown in the corresponding figure.
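A minimal C sketch of this block alpha adjustment, assuming the confidences and thresholds are available as floating point values and using names taken from the formulas above, is:

/* Adjust blk_alpha_sad toward blk_alpha_peak, bounded by Thr_Alpha_P / Thr_Alpha_N. */
static float adjust_block_alpha(float blk_alpha_sad, float blk_alpha_peak,
                                float thr_alpha_p, float thr_alpha_n)
{
    float diff_alpha = blk_alpha_peak - blk_alpha_sad;
    if (diff_alpha > 0.0f)
        return blk_alpha_sad + (diff_alpha < thr_alpha_p ? diff_alpha : thr_alpha_p);
    else
        return blk_alpha_sad + (diff_alpha > -thr_alpha_n ? diff_alpha : -thr_alpha_n);
}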
The pixel motion vector magnitude at 32 and 38 is used to evaluate whether a pixel has motion. It is decomposed from the magnitudes of four block motion vectors, in one example using bilinear interpolation. All of these block motion vectors come from a 3×3 neighborhood of 4×4 blocks, where the central block is the block containing the current pixel. The magnitude of one block motion vector, blk_mv, is found as blk_mv_mag = max(abs(blk_mv.x), abs(blk_mv.y)).
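As an illustration, a sketch of this decomposition is given below; the selection of the four neighboring block magnitudes and the interpolation weights wx and wy, derived from the pixel position relative to the block grid, is assumed rather than specified here.

/* Magnitude of one block motion vector: max(abs(x), abs(y)). */
static float block_mv_magnitude(float mv_x, float mv_y)
{
    float ax = mv_x < 0.0f ? -mv_x : mv_x;
    float ay = mv_y < 0.0f ? -mv_y : mv_y;
    return ax > ay ? ax : ay;
}

/* Bilinear interpolation of the four surrounding block magnitudes,
   with weights wx, wy in [0, 1]. */
static float pixel_mv_magnitude(float m00, float m01, float m10, float m11,
                                float wx, float wy)
{
    float top    = m00 + wx * (m01 - m00);
    float bottom = m10 + wx * (m11 - m10);
    return top + wy * (bottom - top);
}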
The block motion vector has sub-pixel accuracy. Bilinear interpolation is therefore applied to get the pixel, pix_blk_mv, pointed to by the motion vector of the block containing the current pixel in CF_LR. The sample error, samp_err, is the minimal absolute difference between the interpolated pixel and the four integer pixels (p0, p1, p2, p3) around this sub-pixel position. The interpolated pixel is calculated as pix_blk_mv = bilinear(p0, p1, p2, p3, dx, dy), where dx and dy are shown in the corresponding figure.
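A sketch of the sample error calculation, assuming p0 and p1 are the top pair and p2 and p3 the bottom pair of integer pixels around the sub-pixel position, is:

/* Sample error: minimal absolute difference between the bilinearly
   interpolated pixel and its four integer neighbors. */
static float sample_error(float p0, float p1, float p2, float p3, float dx, float dy)
{
    float top        = p0 + dx * (p1 - p0);
    float bottom     = p2 + dx * (p3 - p2);
    float pix_blk_mv = top + dy * (bottom - top);     /* bilinear(p0, p1, p2, p3, dx, dy) */

    float p[4] = { p0, p1, p2, p3 };
    float samp_err = 1e30f;
    for (int i = 0; i < 4; i++) {
        float d = pix_blk_mv - p[i];
        if (d < 0.0f) d = -d;
        if (d < samp_err) samp_err = d;
    }
    return samp_err;
}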
The high frequency calculation at 62 calculates the high frequency component in a 3×3 neighborhood of the current pixel in CF_LR. The difference between the interpolated pixel pointed to by the block motion vector and the current pixel in CF_LR is often larger in a high detail region than in a smooth region, even when the motion vector is very close to correct. It is therefore necessary to refine this difference by subtracting the high frequency component multiplied by a gain HFQ_K; the refined difference reflects the motion error of the motion vector. The pixels in the 3×3 neighborhood are pix_i, i = 0, 1, 2, . . . , 8, arranged from top left to bottom right, so pix_4 is the current pixel in CF_LR. The calculation is as follows: h_hfq = max(abs(pix_4−pix_3), abs(pix_4−pix_5), abs(pix_3−pix_5)); v_hfq = max(abs(pix_4−pix_1), abs(pix_4−pix_7), abs(pix_1−pix_7)); raw_hfq_cf = (h_hfq+v_hfq)/2. Here, raw_hfq_cf is the raw high frequency component, which is further refined by removing the noise as follows: hfq_cf = max(raw_hfq_cf−noise_level, 0). The hfq_cf is the output high frequency.
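Because the formulas above are self-contained, they translate directly into a small C sketch; 8-bit pixel values held as ints are assumed.

static int iabs(int v) { return v < 0 ? -v : v; }
static int imax3(int a, int b, int c) { int m = a > b ? a : b; return m > c ? m : c; }

/* High frequency component of the 3x3 neighborhood pix[0..8] (top left to
   bottom right, pix[4] is the current pixel), with the noise level removed. */
static int high_frequency(const int pix[9], int noise_level)
{
    int h_hfq = imax3(iabs(pix[4] - pix[3]), iabs(pix[4] - pix[5]), iabs(pix[3] - pix[5]));
    int v_hfq = imax3(iabs(pix[4] - pix[1]), iabs(pix[4] - pix[7]), iabs(pix[1] - pix[7]));
    int raw_hfq_cf = (h_hfq + v_hfq) / 2;
    int hfq_cf = raw_hfq_cf - noise_level;
    return hfq_cf > 0 ? hfq_cf : 0;        /* hfq_cf = max(raw_hfq_cf - noise_level, 0) */
}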
The sample error based motion adjustment at 66 first calculates the motion vector offset, which evaluates how far the interpolated pixel pointed to by the current motion vector is from its nearest four integer pixels. A graphical representation of this is shown in the corresponding figure.
The basic rule is that the bigger the sample error and the MV offset are, the more the pixel motion is increased. That is to say, the motion vector is not considered valid for the current pixel because it would reduce details or cause other artifacts.
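The text only states this rule qualitatively; one possible reading, in which the increase is a simple product of the sample error and the MV offset scaled by an assumed gain Motion_Adj_K, is sketched below. The actual adjustment function is not specified here.

/* Hypothetical sketch: larger sample error and larger MV offset increase
   the pixel motion.  The product form, the gain motion_adj_k and the
   clamp to [0, 1] are assumptions, not taken from the description. */
static float adjust_pixel_motion(float pix_motion, float samp_err,
                                 float mv_offset, float motion_adj_k)
{
    float adjusted = pix_motion + motion_adj_k * samp_err * mv_offset;
    return adjusted > 1.0f ? 1.0f : adjusted;
}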
Finally, the min and low pass filtering (lpf) operations are applied, as shown in the corresponding figure.
For texture regions, it is better to maintain the fusion weight based only on the blk_alpha_sad of the future frames, in order to fetch enough details from them. Fortunately, there is another motion vector confidence, blk_alpha_peak, which correlates well with texture regions and only has a large value when the current block has some details. A new motion vector confidence for future frames is therefore generated from blk_alpha_sad and blk_alpha_peak as follows: new_blk_alpha = blk_alpha_sad − max(blk_alpha_sad−blk_alpha_peak, 0)*Smth_Adj_K. Here, Smth_Adj_K is a gain used to adjust blk_alpha_sad and is not more than 0.5. The fusion weight for future frames can then be calculated based on new_blk_alpha.
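This can be written directly as a small sketch, with smth_adj_k assumed to be at most 0.5 as stated:

/* New motion vector confidence for future frames:
   new_blk_alpha = blk_alpha_sad - max(blk_alpha_sad - blk_alpha_peak, 0) * Smth_Adj_K */
static float new_block_alpha(float blk_alpha_sad, float blk_alpha_peak, float smth_adj_k)
{
    float drop = blk_alpha_sad - blk_alpha_peak;
    if (drop < 0.0f) drop = 0.0f;
    return blk_alpha_sad - drop * smth_adj_k;
}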
It is necessary to estimate the noise level for noise reduction. Noise statistics are used, and the noise is assumed to obey a normal distribution. The noise distribution is analyzed between CF_LR and P1_LR′, and between CF_LR and F1_LR. In one embodiment, it is implemented by a noise histogram with 64 bins at 100. Because noise is highly related to brightness, the brightness distribution is captured by a brightness histogram with 64 bins at 98.
The pixel brightness is the average of a pixel in CF_LR and its corresponding pixel in P1_LR′, the one pointed to by the block motion vector. If the block motion vector is reliable, that is, the block alpha is large, the calculated pixel noise is considered effective and is accumulated into the noise histogram as follows.
noise_hist_p1[min(pix_noise,63)]+=1.
If the block motion vector is reliable, the process accumulates the pixel brightness into the brightness histogram as follows.
brgt_hist_p1[min(pix_brgt/4,63)]+=1.
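Taken together, the accumulation can be sketched as follows for the backward (CF_LR versus P1_LR′) direction; 8-bit pixels are assumed, and pix_noise is taken to be the absolute difference between the current pixel and its motion compensated counterpart, consistent with the noise definition used below.

/* Accumulate one pixel into the backward noise and brightness histograms,
   only when the block motion vector is reliable (block alpha is large). */
static void accumulate_histograms(int cf_pix, int mc_pix, int alpha_is_reliable,
                                  int noise_hist_p1[64], int brgt_hist_p1[64])
{
    if (!alpha_is_reliable)
        return;
    int pix_noise = cf_pix > mc_pix ? cf_pix - mc_pix : mc_pix - cf_pix;
    int pix_brgt  = (cf_pix + mc_pix) / 2;
    noise_hist_p1[pix_noise < 63 ? pix_noise : 63] += 1;           /* min(pix_noise, 63)  */
    brgt_hist_p1[(pix_brgt / 4) < 63 ? (pix_brgt / 4) : 63] += 1;  /* min(pix_brgt/4, 63) */
}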
Noise estimation is done by fusing forward and backward noise information. Because there are occlusion regions, that is, regions that appear in the previous frame but disappear in future ones, or disappear in the previous frame but appear in future ones, it is necessary to use information from both directions, as shown in the corresponding figure.
Based on this rule, the process finds the noise level from the noise histogram. The process defines noise_hist_sum as the sum of all bins of the noise histogram. Starting from bin 0, it accumulates the bin values. Because the noise is defined as the absolute value of the difference between pixels in two frames, the distribution represents the sum of the right and left hand sides of the normal distribution. When the accumulated value reaches about 0.68*noise_hist_sum, the bin index is the raw noise level. Because CF_LR and F1_LR are both noisy images, the estimated noise level between them is sqrt(2) times the true noise level of CF_LR, so the process divides this noise level by sqrt(2).
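A sketch of this estimation from a 64 bin histogram follows; the 68% point corresponds to one standard deviation of the assumed normal distribution.

#include <math.h>

/* Raw noise level: the bin index at which the accumulated histogram count
   reaches about 0.68 * noise_hist_sum, divided by sqrt(2) because both
   frames contributing to the difference are noisy. */
static float noise_level_from_histogram(const int noise_hist[64])
{
    long noise_hist_sum = 0;
    for (int i = 0; i < 64; i++)
        noise_hist_sum += noise_hist[i];

    long acc = 0;
    int bin = 63;
    for (int i = 0; i < 64; i++) {
        acc += noise_hist[i];
        if (acc >= (long)(0.68 * noise_hist_sum)) { bin = i; break; }
    }
    return (float)bin / sqrtf(2.0f);
}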
The shape of the noise distribution, relative to a normal distribution, is shown in the corresponding figure.
If the noise actually obeys a normal distribution and the maximal brightness is small, it is better to increase the noise level. The noise level is refined as: noise_level = noise_level_raw + min(max(Bright_Thr−max_bright, 0)*Bright_K, min(noise_level_raw, Noise_Adj_Thr)). Here, Bright_Thr is a threshold on the maximal brightness, Bright_K is a gain for the maximal brightness, Noise_Adj_Thr is a threshold for the noise adjustment based on maximal brightness, and noise_level_raw is the noise level calculated from the noise distribution.
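A direct sketch of this refinement, with all parameters assumed to be in the same pixel value units as the noise level, is:

/* Refinement for dark content:
   noise_level = noise_level_raw
               + min(max(Bright_Thr - max_bright, 0) * Bright_K,
                     min(noise_level_raw, Noise_Adj_Thr)) */
static float refine_noise_level(float noise_level_raw, float max_bright,
                                float bright_thr, float bright_k, float noise_adj_thr)
{
    float dark = bright_thr - max_bright;
    if (dark < 0.0f) dark = 0.0f;
    float cap = noise_level_raw < noise_adj_thr ? noise_level_raw : noise_adj_thr;
    float adj = dark * bright_k;
    if (adj > cap) adj = cap;
    return noise_level_raw + adj;
}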
If the noise does not obey a normal distribution, as in curve 122, the process should decrease the noise level in order to avoid a wrong estimation. At the same time, the process decreases noise_hist_sum to get adj_noise_hist_sum, i.e., the count of pixels that contribute to noise estimation is decreased.
Because the noise is estimated in both forward and backward directions, it is necessary to fuse these two estimations. The backward noise level between CF_LR and P1_LR′ is noise_level_p1. The backward noise histogram accumulation sum is adj_noise_hist_sum_p1. The forward noise level between CF_LR and F1_LR is noise_level_f1. The forward noise histogram accumulation sum is adj_noise_hist_sum_f1.
The fused noise level is calculated as follows.
wgt=adj_noise_hist_sum_p1/(adj_noise_hist_sum_p1+adj_noise_hist_sum_f1);
tmp_noise_level=noise_level_f1+wgt*(noise_level_p1−noise_level_f1).
In order to keep the noise estimation consistent and stable in the temporal domain, it is necessary to smooth between the current noise level and the previous noise level. The logic is as follows.
tpr_noise_level=(pre_noise_level+3*tmp_noise_level)/4;
noise_delta=tpr_noise_level−pre_noise_level;
if noise_delta>0, noise_level=pre_noise_level+min(noise_delta,Noise_Adj_Thr).
else, noise_level=pre_noise_level+noise_delta.
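Combining the fusion and the temporal smoothing above gives the following sketch; the histogram sums are assumed to be positive so that the weight is well defined.

/* Fuse the backward and forward noise levels by their histogram sums,
   then smooth against the previous frame's level, limiting upward jumps
   to Noise_Adj_Thr. */
static float fuse_and_smooth_noise_level(float noise_level_p1, float adj_noise_hist_sum_p1,
                                         float noise_level_f1, float adj_noise_hist_sum_f1,
                                         float pre_noise_level, float noise_adj_thr)
{
    float wgt = adj_noise_hist_sum_p1 / (adj_noise_hist_sum_p1 + adj_noise_hist_sum_f1);
    float tmp_noise_level = noise_level_f1 + wgt * (noise_level_p1 - noise_level_f1);

    float tpr_noise_level = (pre_noise_level + 3.0f * tmp_noise_level) / 4.0f;
    float noise_delta = tpr_noise_level - pre_noise_level;
    if (noise_delta > 0.0f)
        return pre_noise_level + (noise_delta < noise_adj_thr ? noise_delta : noise_adj_thr);
    return pre_noise_level + noise_delta;
}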
In this manner, the motion compensated noise reduction process can be combined with the super resolution process to produce super resolution image data with reduced noise, using less hardware and less time.
It will be appreciated that several of the above-disclosed and other features and functions, or alternatives thereof, may be desirably combined into many other different systems or applications. Various presently unforeseen or unanticipated alternatives, modifications, variations, or improvements therein may also be subsequently made by those skilled in the art, and these are also intended to be encompassed by the following claims.
This application is a continuation of and claims priority to U.S. patent application Ser. No. 14/107,975, filed Dec. 16, 2013, which is incorporated herein by reference in its entirety.