1. Field of the Invention
The present invention relates to a method for compressing the workload of digital-animation calculation, and more particularly to a method that divides each frame of the digital animation into small blocks of less than 16×16 pixels, uses RAM to temporarily save the calculation results of those blocks, and uses the saved results repeatedly, so as to reduce the workload of digital-animation calculation.
2. Description of the Prior Art
For digital-animation processing on the screens of computers, TVs, mobile phones and the like, digital-animation compression technologies have been used to reduce the required memory space or transmission bandwidth. Digital-animation compression has multiple formats, including MPEG-2, MPEG-4, AVS and H.264, all of which use “motion estimation” to compress data. Normally, a consecutive animation should be played at 20-30 frames per second to keep the frames running smoothly, and the motion relationship between two consecutive frames must be determined by motion estimation.
One of the motion estimation methods is to divide the frame into macroblocks (MBs) of 16×16=256 pixels, and then to find, for each MB, an optimal motion vector relative to the previous frame. With reference to
When calculating the motion vector of a certain MB in frame A, each pixel of a candidate MB in frame B is subtracted from the corresponding pixel of the MB in frame A (full search), and the 256 absolute differences are added together to obtain a “sum of absolute differences” (SAD). Many SADs are produced when all the candidate MBs in frame B are calculated, and the location of the comparative point corresponding to the minimum SAD is the target point. The location difference of the target point relative to the comparative point in frame A is the so-called “motion vector”. To reduce the calculation workload, a small searching range is defined initially; if the SAD found in the small searching range is less than a preset value, the location difference to the comparative point is taken as the motion vector.
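The full search described above can be illustrated with a minimal Python sketch. The function names (`sad`, `crop`, `full_search`), the list-of-rows frame layout and the `radius` parameter are assumptions made for illustration and do not come from the specification.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def crop(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def full_search(cur, ref, top, left, size, radius):
    """Compare the block of `cur` at (top, left) with every candidate block of
    `ref` within +/- radius pixels; return (min_sad, (dy, dx))."""
    target = crop(cur, top, left, size)
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            cost = sad(target, crop(ref, y, x, size))
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    return best
```

Every candidate costs a complete SAD here, which is the workload that the later sections try to avoid.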
Referring to
If a frame has 720×480 pixels, it can be divided into 1350 MBs. The MBs are closely adjacent to each other without overlap, but their searching ranges overlap. Nevertheless, each MB has to be calculated anew. In this case, a total of 2.99×10^8 (1350×221,663) operations is needed to finish the motion vector calculation of this frame. A consecutive animation is usually played at 22 frames/second, so the total operation rate is about 6.58×10^9 operations/second (22×2.99×10^8).
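The figures above can be checked with a few lines of Python. The variable names are illustrative only; the per-MB figure of 221,663 operations is taken from the text.

```python
FRAME_W, FRAME_H = 720, 480        # frame size in pixels (from the text)
MB = 16                            # macroblock edge length in pixels
OPS_PER_MB_FULL_SEARCH = 221_663   # full-search operations per MB (from the text)
FPS = 22                           # frames per second (from the text)

mbs_per_frame = (FRAME_W // MB) * (FRAME_H // MB)
ops_per_frame = mbs_per_frame * OPS_PER_MB_FULL_SEARCH
ops_per_second = ops_per_frame * FPS

print(mbs_per_frame)               # 1350
print(f"{ops_per_frame:.2e}")      # 2.99e+08
print(f"{ops_per_second:.2e}")     # 6.58e+09
```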
Thereby, the full-search calculation is too complicated, and the system must be equipped with a high system clock and a large DSP. Accordingly, the power consumption is high, the battery of a portable electronic instrument is unable to support the load, and the cost is increased. Thus, many new solutions have been developed, which fall into two categories: first, reducing the number of comparative points; second, reducing the number of operations. Both kinds of solution can be used at the same time to reduce the calculation workload as far as possible.
Many solutions can be used to reduce the number of comparative points, including “three-step search” (TSS), “four-step search” (FSS), etc. These methods evaluate several points in a preset searching range, find the point with the minimum MAD value, and then perform a regional calculation around that point.
Solutions used to reduce the number of operations are relatively few. The inequality shown below is one of them.
SUM(ABS(a−b))>=ABS(SUM(a)−SUM(b))
Here, a and b represent the pixel values of the corresponding points of two MBs. The inequality means that the sum of absolute differences between the corresponding pixel values of two MBs (MAD calculation) is greater than or equal to the absolute difference between the respective sums of the pixel values of the two MBs (called the rough calculation).
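The inequality can be checked numerically. The following Python sketch, with the hypothetical helper names `mad` and `rough`, evaluates both sides on randomly generated 256-pixel blocks; it illustrates the inequality and is not a proof.

```python
import random


def mad(a, b):
    """Left side of the inequality: SUM(ABS(a - b)) over corresponding pixels."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rough(a, b):
    """Right side of the inequality: ABS(SUM(a) - SUM(b))."""
    return abs(sum(a) - sum(b))


rng = random.Random(0)  # fixed seed so the check is repeatable
trials = [([rng.randrange(256) for _ in range(256)],
           [rng.randrange(256) for _ in range(256)])
          for _ in range(1000)]
holds = all(mad(a, b) >= rough(a, b) for a, b in trials)
print(holds)
```

The rough calculation can be far below the MAD value: for the two-pixel blocks (1, 5) and (4, 2) the MAD value is 6 while the rough calculation value is 0, which is why a small rough value alone decides nothing.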
By taking advantage of this inequality, we can take an arbitrary point in the searching range as a first comparative point and perform a MAD calculation (the left side of the above-mentioned inequality). This MAD value is taken as a “temporary minimum reference value”. A second point is then chosen, and the right side of the inequality (rough calculation) is calculated for it. If the temporary minimum reference value is the real minimum value in the searching range, the MAD value of the second point should be greater than the temporary minimum reference value. If the rough calculation value of the second point is already greater than the temporary minimum reference value, then, because the inequality guarantees that the MAD value of the second point is greater than or equal to its rough calculation value, the MAD value must also be greater than the temporary minimum reference value, and the temporary minimum reference value can be retained. If the rough calculation value is less than the temporary minimum reference value, it is uncertain whether the MAD value of the second point is less than the temporary minimum reference value. In this case, the MAD calculation of the second point must be performed (the left side of the above-mentioned inequality) and its result compared with the temporary minimum reference value. If the MAD value of the second point is indeed less than the temporary minimum reference value, it is taken as the new temporary minimum reference value.
The above-mentioned procedure is repeated until the comparisons of the 289 points in the searching range are finished. At each comparison, the temporary minimum reference value is registered in memory.
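The procedure can be sketched in Python as follows. The function name `pruned_search`, the flat-list block representation and the returned evaluation counter are assumptions for illustration. A candidate whose rough calculation value equals the reference is also skipped here, since its MAD value cannot be strictly smaller.

```python
def mad(a, b):
    """Precise calculation: sum of absolute differences of two flat blocks."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pruned_search(target, candidates):
    """Return (best_index, best_mad, full_evaluations).

    The first candidate supplies the temporary minimum reference value; the
    precise calculation runs only when the rough calculation cannot rule a
    candidate out.
    """
    target_sum = sum(target)
    best_index, best_mad = 0, mad(target, candidates[0])
    full_evaluations = 1
    for index in range(1, len(candidates)):
        candidate = candidates[index]
        # Rough calculation: right side of the inequality.
        if abs(target_sum - sum(candidate)) >= best_mad:
            continue  # MAD >= rough >= reference, so the reference is retained
        full_evaluations += 1
        value = mad(target, candidate)
        if value < best_mad:
            best_index, best_mad = index, value
    return best_index, best_mad, full_evaluations


target = [10, 10, 10, 10]
candidates = [[12, 12, 12, 12], [50, 50, 50, 50], [10, 11, 10, 10], [11, 9, 11, 9]]
print(pruned_search(target, candidates))  # (2, 1, 3)
```

In this small example the second candidate is rejected by the rough calculation alone, so only three of the four candidates need the precise calculation.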
Referring to
The above-mentioned inequality can substantially reduce the calculation workload; however, we found that it can be further improved.
The present invention has arisen to mitigate and/or obviate the afore-described disadvantages of the conventional calculation method for compressing workload of digital-animation calculation.
The primary object of the present invention is to provide a calculation method for compressing workload of digital-animation calculation, in which the frame of the digital animation is divided into small blocks whose size is less than 16×16 pixels, and the sum of the pixel values of each small block is calculated and stored in memory. The method takes advantage of the inequality stating that the sum of absolute differences between the corresponding pixel values of two MBs (MAD calculation) is greater than or equal to the absolute difference between the respective sums of the pixel values of the two MBs (rough calculation). The MAD value of an arbitrary point in the searching range of an MB is first calculated, taken as a temporary minimum reference value and registered in memory. The rough calculation values of the remaining points in the searching range are then found using the small block as the unit. If the rough calculation value of a point is greater than or equal to the temporary minimum reference value, the temporary minimum reference value is retained; otherwise, the MAD value of that point is calculated. If that MAD value is greater than or equal to the temporary minimum reference value, the temporary minimum reference value is retained; otherwise, it is replaced by the MAD value of that point.
The present invention will become more obvious from the following description when taken in connection with the accompanying drawings, which show, for purposes of illustration only, the preferred embodiments in accordance with the present invention.
Referring to
According to the full search of motion estimation, if the searching range is 32×32 pixels and the size of an MB is 16×16 pixels, 221,663 operations are needed to find the motion vector for each MB. With the above-mentioned inequality method, 75,326 operations are needed.
The calculation method in accordance with the present invention is shown in
Suppose that the first comparative point P1,1 at the upper left corner corresponds to the minimum value. The MAD method must be used when matching this first point, so as to find a “temporary minimum reference value” in this searching range, which needs 767 operations (16×16=256 subtractions, 256 absolute-value operations and 255 summations; 767=256+256+255, the same as in the above-mentioned full search method).
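The operation counts quoted so far can be reproduced with the short Python sketch below. The expression for the number of comparative points, (32-16+1)×(32-16+1)=289, is an assumption that is consistent with the 289 points and the 17 points per row mentioned in the text.

```python
MB = 16       # macroblock edge length in pixels
SEARCH = 32   # searching range edge length in pixels

pixels = MB * MB
# One precise (MAD) calculation: subtractions + absolute values + summations.
ops_per_point = pixels + pixels + (pixels - 1)
# Assumed count of comparative points: every offset at which the MB fits.
points = (SEARCH - MB + 1) ** 2
full_search_ops = points * ops_per_point

print(ops_per_point, points, full_search_ops)  # 767 289 221663
```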
Comparisons between the point P1,1 and the other points are performed based on the rough calculation on the right side of the above-mentioned inequality. The rough calculation is made with a small block of 2×2 pixels as the unit, each small-block having 4 pixels. First, 3 operations are needed to add the values of the 4 pixels together, and the calculation result of each small-block is temporarily stored in the Data Memory (RAM) of the DSP/ALU in
Since the load and store operations of memory access are processed in parallel with general operation instructions, they are omitted from the following calculations for the time being.
The first comparative point P1,1 takes 255 operations (3 summations×64+63) to get the sum of its own 64 small-blocks, and the second comparative point P1,2 also needs 255 operations (3 summations×64+63) to get the sum of its own 64 small-blocks. However, each of the 3rd, 4th, ..., 17th comparative points P1,3˜P1,17 in the first row needs only 87 operations (3×8+63), because only 8 new small-blocks need to be calculated and the values of the remaining 56 small-blocks were already stored in memory during the calculation of earlier points such as P1,1. The operations for calculating the sums of the comparative points (P2,1˜P2,17) in the second row are the same as those of the first row (as shown in
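The reuse of stored small-block sums can be sketched as follows. The function name `mb_sum_with_cache`, the dictionary used as a stand-in for the Data Memory, and the 0-based coordinates (P1,1 is position (0, 0)) are assumptions for illustration. Memory loads and stores are not counted, as in the text.

```python
def mb_sum_with_cache(frame, top, left, cache, mb=16, sb=2):
    """Return (pixel sum of the MB at (top, left), additions spent).

    Sums of sb x sb small-blocks are kept in `cache`, keyed by their top-left
    coordinate, so a small-block that was already summed costs nothing.
    """
    additions = 0
    total = 0
    blocks = 0
    for y in range(top, top + mb, sb):
        for x in range(left, left + mb, sb):
            if (y, x) not in cache:
                cache[(y, x)] = sum(frame[y + i][x + j]
                                    for i in range(sb) for j in range(sb))
                additions += sb * sb - 1   # 3 additions for 4 pixels
            total += cache[(y, x)]
            blocks += 1
    additions += blocks - 1                # 63 additions to combine 64 sums
    return total, additions


frame = [[(y * 32 + x) % 256 for x in range(32)] for y in range(32)]
cache = {}
costs = [mb_sum_with_cache(frame, 0, left, cache)[1] for left in range(17)]
print(costs[:4])  # [255, 255, 87, 87]
```

The first two positions use different 2×2 grids (even and odd columns), so each needs all 64 small-block sums; every later position in the row adds only one new column of 8 small-blocks.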
The precise calculation (MAD) for a comparative point is performed only when the result of the rough calculation is less than the “temporary minimum reference value”. If the result of the MAD is less than the “temporary minimum reference value”, it replaces the “temporary minimum reference value” and is stored in memory. If the result of the rough calculation is greater than the “temporary minimum reference value”, this comparative point is obviously not the target, and the rough calculation for the next comparative point is performed. These procedures are repeated until the calculations for all 289 comparative points have been done. (The possibly necessary MAD calculations have been omitted from the above calculations, since the first comparative point is supposed to have the optimum value. In real operation, some methods have been found that can effectively find a first comparative point with the optimum value; however, they are not discussed in the present invention.)
To summarize the above-mentioned methods, if the searching range is 32×32 and the MB is 16×16, the calculation workload will be 22,721 operations, wherein:
If a frame has 720×480 pixels, it can be divided into 1350 MBs, which are adjacent to each other without overlap. However, because each 16×16 MB has a searching range of 32×32, the searching range of each MB largely overlaps those of its neighboring MBs, so the calculation results of the small-blocks can be reused across MBs. To finish the motion estimation of a frame, the total calculation workload is less than 3.07×10^7 operations (1,350×22,721). At a running rate of 22 frames per second, the calculation workload is less than 6.75×10^8 operations per second (3.07×10^7×22). Thereby, the total calculation workload in accordance with the present invention is only 30.2% of that of the inequality method.
According to the specifications of MPEG-2, MPEG-4, AVS and H.264, all the MBs are closely adjacent to each other; therefore, the searching ranges of the respective MBs overlap. By making good use of this feature, when the resolution is increased, only the MBs along the top edge and the leftmost edge of a frame carry a relatively heavy calculation workload, while each of the remaining MBs needs only about 20,000 operations. Thereby, the calculation method in accordance with the present invention is capable of further reducing the calculation workload.
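The frame-level figures and the 30.2% ratio can be verified with the following Python sketch. The variable names are illustrative; the per-MB counts of 22,721 and 75,326 operations are taken from the text.

```python
MBS_PER_FRAME = 1350          # 720x480 frame split into 16x16 MBs (from the text)
FPS = 22                      # frames per second (from the text)
OPS_PER_MB_PROPOSED = 22_721  # small-block method, per MB (from the text)
OPS_PER_MB_INEQUALITY = 75_326  # inequality method alone, per MB (from the text)

ops_per_frame = MBS_PER_FRAME * OPS_PER_MB_PROPOSED
ops_per_second = ops_per_frame * FPS
ratio_percent = 100 * OPS_PER_MB_PROPOSED / OPS_PER_MB_INEQUALITY

print(f"{ops_per_frame:.2e}")    # 3.07e+07
print(f"{ops_per_second:.2e}")   # 6.75e+08
print(f"{ratio_percent:.1f}%")   # 30.2%
```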
While we have shown and described various embodiments in accordance with the present invention, it should be clear to those skilled in the art that further embodiments may be made without departing from the scope of the present invention.