The invention relates to a method of using low-frequency information to generate the motion vectors for block matching in motion estimation. The block matching process is applied only to the low-frequency components, rather than to the whole frequency range, so as to reduce the computational complexity of digital-animation calculation.
For digital-animation processing on the screens of computers, TVs, mobile phones and the like, digital-animation compression technologies have been used to reduce the memory space or the transmission bandwidth. Digital-animation compression technology has multiple formats, including MPEG-2, MPEG-4, AVS and H.264. All these formats use “motion estimation” to compress data in the temporal dimension. Normally, a consecutive animation should be played at 20-30 frames per second so as to keep the frames running smoothly, and the motion relationship between two frames is determined by motion estimation.
One of the motion estimation methods is to divide the frame into MBs (Macro-Blocks) of 16×16=256 pixels (or of different sizes in other protocols), and then to find, for each of the MBs, an optimal motion vector relative to the previous frame. With reference to
To calculate the motion vector of a given MB in frame A, the pixels of a candidate MB in frame B are subtracted from the corresponding pixels of the MB in frame A, and the 256 absolute differences are added together to give a “sum of absolute differences” (SAD). In the full search method, one SAD is produced for every candidate MB position in frame B, and the location of the comparative point corresponding to the minimum SAD is the target point. The location difference of the target point relative to the comparative point in frame A is the so-called “motion vector”. To reduce the calculation workload, a small searching range is defined initially; if a SAD found in the small searching range is less than a preset value (threshold value), the corresponding location difference is taken as the motion vector.
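As a minimal sketch of the SAD calculation and the full search described above, the following Python code compares one block of frame A against every candidate position of frame B inside a small searching range. The function names (`sad`, `get_block`, `full_search`), the default search range, and the frame representation (a list of rows of integer pixel values) are assumptions made for illustration only; the threshold shortcut is omitted for brevity.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def get_block(frame, top, left, size):
    """Cut a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def full_search(frame_a, frame_b, top, left, size=16, search_range=4):
    """Return ((dx, dy), min_sad) for the block of frame_a at (top, left).

    Every displacement within +/- search_range is tested (full search);
    candidates that would fall outside frame_b are skipped.
    """
    height, width = len(frame_b), len(frame_b[0])
    current = get_block(frame_a, top, left, size)
    best_vector, best_sad = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            cost = sad(current, get_block(frame_b, y, x, size))
            if best_sad is None or cost < best_sad:
                best_vector, best_sad = (dx, dy), cost
    return best_vector, best_sad
```

With 16×16 blocks, each call to `sad` performs the 256 subtractions, absolute values and additions mentioned above, and the number of such calls grows with the square of the search range, which is why the full search is costly.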
Referring to
A frame of 720×480 pixels can be divided into 1350 MBs. In this case, a total of 2.99×10^8 (1350×221663) operations is needed to finish the motion vector calculation for this frame. A consecutive animation is usually played at 22 frames per second; the total operation rate is therefore about 6.58×10^9 operations per second (22×2.99×10^8).
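The arithmetic above can be checked with a few lines of Python. The per-MB operation count of 221663 is taken from the text as given; the variable names are illustrative only.

```python
FRAME_WIDTH, FRAME_HEIGHT = 720, 480  # pixels per frame
MB_SIZE = 16                          # 16 x 16 macro-block
OPS_PER_MB = 221663                   # per-MB operation count quoted in the text
FRAMES_PER_SECOND = 22

mbs_per_frame = (FRAME_WIDTH // MB_SIZE) * (FRAME_HEIGHT // MB_SIZE)
ops_per_frame = mbs_per_frame * OPS_PER_MB
ops_per_second = ops_per_frame * FRAMES_PER_SECOND

print(mbs_per_frame)            # 1350
print(f"{ops_per_frame:.2e}")   # 2.99e+08
print(f"{ops_per_second:.2e}")  # 6.58e+09
```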
From the above description, it is clear that motion estimation needs huge computation power. The system must be equipped with a high system clock and a large DSP; accordingly, the power consumption is high, the battery of a portable electronic instrument is unable to support the load, and the cost is increased. Many solutions have therefore been developed, which fall into two categories: first, reducing the number of comparative points; second, reducing the number of operations. Both approaches can be applied at the same time so as to minimize the calculation workload.
Many solutions can be used to reduce the number of comparative points, including “three step search” (TSS) and “four step search” (FSS). These methods evaluate several points in a preset searching range, find the point with the minimum MAD value, and then search the region around that minimum.
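As a rough illustration of how such a method reduces the number of comparative points, the sketch below follows the commonly described form of the three step search: eight neighbours of the current best point are tested at a step of 4, then 2, then 1. The function name `three_step_search` and the `cost` callable (a stand-in for the MAD or SAD of the block at displacement `(dx, dy)`) are assumptions for illustration, not part of the text above.

```python
def three_step_search(cost, start_step=4):
    """Coarse-to-fine search over displacements (dx, dy).

    `cost` is a hypothetical callable returning the matching cost of a
    displacement. The center is evaluated once, then 8 neighbours are
    evaluated per step while the step is halved (4, 2, 1).
    """
    best = (0, 0)
    best_cost = cost(0, 0)
    step = start_step
    while step >= 1:
        center = best  # neighbours are placed around the best point so far
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dx == 0 and dy == 0:
                    continue  # center cost is already known
                candidate = (center[0] + dx, center[1] + dy)
                candidate_cost = cost(*candidate)
                if candidate_cost < best_cost:
                    best, best_cost = candidate, candidate_cost
        step //= 2
    return best, best_cost
```

With a start step of 4 the search covers displacements of up to ±7 using 1 + 3×8 = 25 cost evaluations, against 15×15 = 225 for a full search of the same range. The trade-off is that the coarse first step can settle on a local minimum when the cost surface is not smooth.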
Solutions used to reduce the operations are relatively few. The inequality shown below is one of them.
SUM(ABS(a−b))>=ABS(SUM(a)−SUM(b))
wherein “a” and “b” represent the pixel values of the corresponding points of the two MBs. The inequality states that the sum of the absolute differences between the corresponding pixel values of two MBs (the MAD calculation) is greater than or equal to the absolute difference between the respective sums of the pixel values of the two MBs (called the rough calculation).
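One way this inequality can save operations, sketched here as an assumption rather than as the specific procedure of any cited method, is to use the rough calculation as a lower bound: a candidate MB whose rough value is already no smaller than the best full result found so far cannot be a better match, so its full calculation can be skipped. The names `sad_1d`, `rough_bound` and `best_match` are illustrative, and each block is represented as a flat list of pixel values.

```python
def sad_1d(a, b):
    """Full calculation: SUM(ABS(a - b)) over corresponding pixels."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rough_bound(a, b):
    """Rough calculation: ABS(SUM(a) - SUM(b)), never larger than sad_1d."""
    return abs(sum(a) - sum(b))


def best_match(current, candidates):
    """Return (index, cost, full_count) of the best candidate block.

    full_count is the number of candidates that needed the full
    calculation; the others were rejected by the rough calculation.
    """
    best_index, best_cost, full_count = None, None, 0
    for index, candidate in enumerate(candidates):
        if best_cost is not None and rough_bound(current, candidate) >= best_cost:
            continue  # lower bound already too large, skip the full sum
        full_count += 1
        cost = sad_1d(current, candidate)
        if best_cost is None or cost < best_cost:
            best_index, best_cost = index, cost
    return best_index, best_cost, full_count
```

In practice the block sums would be computed once and reused, so the rough calculation costs far less than the 256 absolute differences of the full calculation.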
All of the above-mentioned methods are applied in the time domain. However, after a time-domain to frequency-domain transformation, the block matching algorithm can be further improved.
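To illustrate what a time-domain to frequency-domain transformation does to a block of pixels, the sketch below applies an orthonormal two-dimensional discrete cosine transform (DCT-II), a transform widely used in video coding, and measures how much of the block's energy lands in the lowest-frequency coefficients. The function names `dct_2d` and `low_frequency_energy_fraction`, the direct (unoptimized) summation, and the choice of test blocks are assumptions for illustration; the sketch shows only the energy concentration, not the method of the invention.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block (direct O(n^4) summation)."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (
                        block[y][x]
                        * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                        * math.cos((2 * x + 1) * v * math.pi / (2 * n))
                    )
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs


def low_frequency_energy_fraction(coeffs, k):
    """Share of the total energy held by the top-left k x k coefficients."""
    total = sum(c * c for row in coeffs for c in row)
    low = sum(coeffs[u][v] ** 2 for u in range(k) for v in range(k))
    return low / total


# A smooth 8 x 8 block: brightness rises gently from corner to corner.
ramp = [[10 * x + 5 * y for x in range(8)] for y in range(8)]
print(low_frequency_energy_fraction(dct_2d(ramp), 2))
```

For smooth image content, nearly all of the energy is carried by the dc term and a few low-frequency coefficients, which is why a comparison restricted to those coefficients can involve far fewer values than the 256 pixels of a full MB.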
Most of the current video standards use different algorithms to compress data. Since human eyes are less sensitive in the high frequency range than in the low frequency range, most video compression standards use a DCT (Discrete Cosine Transform) process to transform an image input from the time domain to the frequency domain; then order the data from dc, through low frequency, to high frequency; apply quantization to reduce the high frequency redundancies; use VLC (Variable Length Coding) to reduce the redundancies in the coding space; and finally use motion estimation to reduce the redundancies between pictures. Please refer to
Referring to
Referring to
Using existing blocks/algorithms, this invention changes the order of the processing sequence, thereby reducing the computation bandwidth.
The spirit and scope of the present invention depend only upon the following claims, and are not limited by the above embodiment.