The field of the invention relates to video special effects and more particularly to methods of using polynomial functions to create video special effects.
Video special effects of changing from a source video image to an output video image are generally known. Examples that are generally known include cutting out a portion of video inside a soft-edge heart shape, adding star highlighting into the video, adding words with changing gradient color on the character faces of the word, and rotating, shearing, and resizing the words, the star highlight and the soft-edge heart shape with its video insert.
In cutting out the portion of video inside a soft-edge heart shape, the pixels of the first image located inside the heart shape are identified and copied into the output video image. Those pixels of the first image located near the edge of the heart shape are identified, and the colors of these pixels are modified based on the value of a polynomial function evaluated at the pixel locations. Most pixels inside the heart shape are opaque. However, in heart shape cutouts, the closer a pixel is to the edge, the higher its transparency. This gives the heart-shaped cutout a soft-edge border.
In adding star highlighting to the output video image, the color of the pixels located in the area of the highlighting is mixed with white. The mixing ratio of a particular pixel is decided based on the value of a polynomial function evaluated at the pixel location. Near the center of the highlight, the mixing ratio is very high, so a maximum amount of white is used in the color mixing process. Near the edge of the highlight, a lower mixing ratio is used to give a slow fading of the highlight.
In adding words with gradient color on the character faces of the word to the output video image, the color of a particular pixel located inside the face of the characters of the word is decided based on the value of a polynomial function evaluated at the pixel location. The gradient color on the character faces may change as the parameters of the polynomial function change.
After the above special effects, the video image resulting from the special effects may be further processed to add a 3D look.
While known methods of rendering video images perform adequately, they are generally computationally intensive. Some rendering techniques perform complex calculations to obtain the needed high quality video images at the expense of requiring a high-power processor, or they avoid the complex calculations entirely at the expense of providing a lower quality image or a less capable special effect system. Other rendering techniques render video images slowly, as a separate rendering step before outputting the video special effect. Because of the importance of video processing, a need exists for a method of rendering video image special effects that is high quality and less complex.
The main object of this invention is to provide a method and a device that have a more efficient way to calculate polynomial functions by incremental evaluation, i.e. via addition operations only.
Another object of this invention is to provide a method and a device that have a more efficient way to calculate multiple polynomial functions, each in its own bounding box. This reduces computational requirements and reduces the hardware needed to evaluate many polynomial functions if the bounding boxes of these polynomials do not overlap. This is an improvement in efficiency compared with a device that evaluates each polynomial over the entire image and then combines them afterwards.
Another object of this invention is to provide a method and a device that give an extra level of flexibility in adding a 3-D look to polynomial-based video special effects. This is done by modifying the polynomial function itself. This modification process changes the initialization data of a polynomial function before its rendering begins. As a result, video special effects gain a 3-D look without any extra processing during rendering, and need no additional rendering hardware for it.
Another object of this invention is to provide a method and a device that evaluate different polynomials in different regions of an image. The different regions may be separated by dividing lines. Similar to the bounding box approach, this also reduces the computation requirement and the hardware needed to perform many video special effects. Therefore, this is also an improvement in efficiency compared with a device that evaluates each polynomial over the entire image and then combines them afterwards. However, this method also handles multiple regions whose bounding boxes do overlap.
Another object of this invention is to provide a more powerful and flexible device to evaluate multiple higher order polynomials. Instead of using dedicated hardware for each polynomial function, it uses one higher speed higher order polynomial engine and operates the polynomial engine sequentially to evaluate different polynomials.
Another object of this invention is to have a more efficient way to store polynomial parameters. The state memory stores the following uniformly:
Another object of this invention is to provide a method and a device that has a higher system throughput in evaluating polynomial functions. The invention uses multiple polynomial engines in a pipelined arrangement.
A method and apparatus are provided for rendering a video image to a destination image space from a plurality of source image spaces. The method includes the steps of generating a set of intermediate incremental values from one or more polynomials, and incrementally evaluating the polynomials within a loop processing area of a source or a destination image space along an inner and an outer processing loop based upon the generated set of intermediate incremental values. During the rendering of a video image, the method retrieves pixels from a source image space, creates a color pattern, or blends the pixels from two source image spaces based upon the evaluated polynomials from the inner and outer loops.
Selection of the type of special effect to be rendered may be accomplished through a man-machine interface (MMI) (e.g., a keyboard and monitor) 20.
In order to render an image, an operator (not shown) may select a video special effect from a list of available effects. Once selected, the video special effect defines several two-variable polynomial functions over the 2-dimensional image space, with one variable being the column position of a pixel, call it X, and the other variable being the row position of the pixel, call it Y. Each polynomial function has a value for each pixel of the image. One way to visualize a polynomial function is to picture it as a family of curves (or lines), one polynomial value per curve, with the family of curves forming a surface across the X-Y plane. The evaluation of the polynomial functions is performed by a polynomial processor, 26.
The video special effect also defines relationships between the polynomial functions and attributes of the output video. The attributes of the output video include, but are not limited to, texture mapping information, background color information, and color processing information (together referred to as special effect controlling information). This special effect controlling information is copied to registers, 46, and a memory, 32, inside the controller, 12. In addition, this information is sent to the polynomial processor, 26, on a timely basis.
“Texture mapping information” specifies how the polynomial functions relate pixel coordinates between the video source and the video output. The texture mapping function is performed by the Source Video Controller, 28. “Background color information” specifies how to generate background video source information, such as a gradient color, from polynomial functions. The background color generation function is performed via a Color Processor color input, 34. “Color processing information” specifies how to generate the output video color from multiple source videos under the control of polynomial functions (e.g. blending colors from two source videos). The color processing function is controlled via a Color Processor control input, 36.
In one example, four second-order polynomial functions and four first-order polynomial functions can be combined to create a rotated soft edged heart shape. In this example, the polynomial functions divide the output video image space into three distinct areas, in which the first area, inside the heart shape, is for video from a first video source, and the second area, outside of the heart shape, is for video from the second video source, and the third area, near the edge of the heart shape, is for video from both video sources. In the first area, the polynomial functions define texture mapping information even if the heart shape is rotated. In the third area, the polynomial functions also define the blending ratio used to combine the two video sources to produce the output video image.
In another example, two second-order polynomial functions can be combined to create a spot highlight effect. In this example, the larger value of the two polynomial functions is used to control the blending with the white color. The higher the value of the polynomial function, the more white is added to the color of a source pixel. The lower the value of the polynomial function, the less white is added to the color of a source pixel.
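As a minimal sketch of the spot highlight blending described above, the following Python fragment mixes a source color with white according to the larger of two second-order polynomial values. The function names, the spot centers and radii, and the clamping of the ratio to the range 0 to 1 are illustrative assumptions and are not taken from the embodiment.

```python
def blend_with_white(color, ratio):
    """Mix an (r, g, b) color with white; ratio 0.0 keeps the color, 1.0 gives white."""
    ratio = min(max(ratio, 0.0), 1.0)  # assumed clamp to a valid blending range
    return tuple(c + (255 - c) * ratio for c in color)


def spot_ratio(x, y, cx, cy, radius):
    """Hypothetical second-order polynomial: 1 at the spot center, 0 at `radius`."""
    return 1.0 - ((x - cx) ** 2 + (y - cy) ** 2) / radius ** 2


def highlight_pixel(color, x, y):
    """Blend one source pixel with white using the larger of two polynomial values."""
    f1 = spot_ratio(x, y, 10, 10, 8)  # illustrative spot 1
    f2 = spot_ratio(x, y, 20, 10, 4)  # illustrative spot 2
    return blend_with_white(color, max(f1, f2))
```

A pixel at the center of either spot becomes pure white, and a pixel far from both spots keeps its source color.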
In another example, one second-order polynomial function can be used to create a radial shape color gradient as the background video source for the output video.
Inner loop intermediate values are Δu, ΔuΔu, ΔuΔuΔu etc, and are referenced by numbers 202, 212, 222, 232 for first, second, third, and fourth order polynomial functions in
An arrow from A to B with a “+” sign indicates an incremental value A is added to an intermediate value B. The value B keeps the sum. An arrow from A to B with a “=” sign indicates value A is copied to value B.
At the beginning of the parallelogram shape loop processing sequence, 401 in
All intermediate values form a triangle shape data flow diagram. As the order of the polynomial function increases, the number of intermediate values increases, and the size of the triangle shaped data flow diagram increases. The arrangement of each intermediate value in the diagram is the same. Higher order polynomial functions can be evaluated by extending the inner loop, outer loop add, and outer loop transfer operations in their triangle shaped data flow diagrams.
Inner loop operations are performed at every output pixel. Outer loop operations are performed before the beginning of every output scan line. To perform polynomial evaluation at the video frame rate, the inner loop needs to be processed frequently and fast. The outer loop computation does not have the same constraint, and is performed during the time between the end of one scan line and the beginning of the next scan line of the output video.
Some intermediate values are constant values, and some intermediate values are incrementally computed from other intermediate values. For example, for third-order, ΔΔΔ intermediate values are all constant, ΔΔ intermediate values are first-order polynomial functions, and Δ intermediate values are second-order polynomial functions.
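The third-order case can be illustrated in one dimension with a short Python sketch. It assumes unit steps along a single loop and obtains the initial intermediate values by direct evaluation; the function names are hypothetical. Inside the loop only additions are used: the ΔΔΔ value stays constant while the ΔΔ and Δ values are themselves updated incrementally.

```python
def forward_differences(p, order=3):
    """Return [p(0), Δp(0), ΔΔp(0), ΔΔΔp(0)] for a polynomial p of the given order."""
    values = [p(i) for i in range(order + 1)]
    table = []
    while values:
        table.append(values[0])
        values = [b - a for a, b in zip(values, values[1:])]
    return table


def evaluate_incrementally(p, count, order=3):
    """Evaluate p(0), p(1), ..., p(count - 1) using additions only inside the loop."""
    s = forward_differences(p, order)
    out = []
    for _ in range(count):
        out.append(s[0])
        # Update from the lowest order upward so that each sum uses the
        # not-yet-updated value of the next higher order difference.
        for k in range(order):
            s[k] += s[k + 1]
    return out
```

For example, for p(x) = 2x³ − x² + 3x + 7 the constant ΔΔΔ value is 12, and the loop reproduces p at every integer step.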
If a loop processing area in the source video is identical to the rectangle output video as shown in
If a loop processing area in the source video is a parallelogram shape as shown in
These equations are used to compute the initialization values before the rendering. In addition, initialization values only need to be computed once for each output video image.
For video special effect applications, output pixels produced for the output video are always in line order, which means one horizontal scan line at a time. For every output pixel in the output video, there is a corresponding sample point in the parallelogram shaped loop processing area in the source video. As the output video finishes one scan line, the parallelogram loop processing area finishes one inner loop. When the output video finishes the last line of the video frame, the parallelogram loop processing area finishes the last inner loop. In this way, every output pixel has a polynomial value evaluated at the coordinate of the corresponding sample point in the parallelogram loop processing area inside the source video.
Most video special effects use several polynomial functions to control rendering attributes including texture mapping. The incremental evaluations for all polynomial functions are performed synchronously throughout the inner loops and outer loop. Source video may be scanned by parallelogram shaped scanning sequences consisting of outer loop and inner loops, as illustrated in
In one type of video special effect as illustrated in
As a result, the values of polynomial F at the four edges of the source video, Qs1, Qs2, Qs4, Qs3, 608 in
There is a one-to-one relationship between inner loops inside the virtual parallelogram area and scan lines inside the output video. And the values of polynomial function F across an inner loop of the virtual parallelogram are used as the values of polynomial function G across a scan line of the output video.
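The relationship between F and G can be sketched in Python for a second-order polynomial. The sketch assumes that the sample point for output pixel (i, j) is the starting corner plus i steps of the inner loop vector u and j steps of the outer loop vector v, and it computes the initialization values by direct evaluation of F; the function and variable names are hypothetical.

```python
def scan_parallelogram(F, ps1, u, v, width, height):
    """Return G[j][i] = F(ps1 + i*u + j*v) for a second-order F.

    Only additions are performed inside the inner and outer loops.
    """
    def sample(i, j):
        return F(ps1[0] + i * u[0] + j * v[0], ps1[1] + i * u[1] + j * v[1])

    # Initialization values, computed once per output image.
    start = sample(0, 0)
    du = sample(1, 0) - sample(0, 0)
    dv = sample(0, 1) - sample(0, 0)
    dudu = sample(2, 0) - 2 * sample(1, 0) + sample(0, 0)
    dvdv = sample(0, 2) - 2 * sample(0, 1) + sample(0, 0)
    dudv = sample(1, 1) - sample(1, 0) - sample(0, 1) + sample(0, 0)

    lines = []
    for _ in range(height):
        value, step = start, du  # outer loop transfer
        line = []
        for _ in range(width):   # inner loop: one output scan line
            line.append(value)
            value += step
            step += dudu
        lines.append(line)
        start += dv              # outer loop add
        dv += dvdv
        du += dudv
    return lines
```

With u = (1, 0) and v = (0, 1) the scan reduces to the rectangular case; any other choice of u and v changes only the initialization values, not the work done per pixel.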
The parallelogram shape loop processing area (i.e. virtual parallelogram area) for the bounding box is Ps1, Ps2, Ps4, Ps3, 702, in
Due to this one-to-one relationship, the four corners of the virtual parallelogram correspond to the four corners of the bounding box. The four edges of the virtual parallelogram correspond to the four edges of the bounding box.
The edge, 717, of the virtual parallelogram which corresponds to the top edge of the bounding box is the “inner loop” of the Loop Processing Area. Every pixel on the top edge of the bounding box in the output video has a corresponding sample point in the inner loop in the source video. The edge, 721, of the virtual parallelogram which corresponds to the left edge of the rectangle bounding box is the “outer loop” of the Loop Processing Area. Every scan line passing through the bounding box has a corresponding sample point in the outer loop. Each sample point in this outer loop is the starting point of an inner loop.
In
As the output video is rendered across its screen area, 740, the output video's pixel coordinate must be used to determine if this pixel is inside the bounding box 744. If the output pixel is outside the bounding box 744, the polynomial is not evaluated. Otherwise, the polynomial is evaluated.
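A minimal Python sketch of this bounding box test follows. It assumes inclusive box limits and uses a hypothetical `outside` placeholder for pixels where the polynomial is not evaluated; the evaluation counter is included only to show the saving.

```python
def render_with_bounding_box(poly, box, width, height, outside=None):
    """Evaluate poly(x, y) only for output pixels inside the bounding box."""
    x_min, y_min, x_max, y_max = box  # assumed inclusive limits
    frame = []
    evaluations = 0
    for y in range(height):
        row = []
        for x in range(width):
            if x_min <= x <= x_max and y_min <= y <= y_max:
                row.append(poly(x, y))
                evaluations += 1
            else:
                row.append(outside)  # polynomial is not evaluated here
        frame.append(row)
    return frame, evaluations
```

For an 8 by 4 frame with a 3 by 2 bounding box, only 6 of the 32 pixels trigger an evaluation.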
The parallelogram warping method allows all polynomial function-based video special effects to be warped into parallelogram shape in the output video space. For example, the heart shape briefly described earlier is created using four second-order polynomial functions. By applying parallelogram warping, it can be animated as a flying heart shape that may flip and turn, just as a piece of paper cut into a heart shape, if a user were to release it and let it fly in the wind.
In the first step, 902, the user selects a video special effect. The selected video special effect identifies two areas of interest: a parallelogram area, 750, in
In
From Qd1, Qd2, Qd3, Qd4, we can also find the rectangle bounding box Pd1, Pd2, Pd3, Pd4, 744, that covers the parallelogram area, 750. This is done by taking the minimum and maximum of the x coordinates of the 4 points Qd1, Qd2, Qd3, Qd4 and the minimum and maximum of the y coordinates of the 4 points Qd1, Qd2, Qd3, Qd4. This step is 908 in
The next step, 910, is to find the corner points Ps1, Ps2, Ps3, Ps4, of the parallelogram shape loop processing area (i.e. virtual parallelogram), 702 in
This step requires defining a mathematical formula for a texture mapping between the two areas identified in the first step. One way to create a texture mapping between any rectangle area and a parallelogram area is to use distances. This is illustrated in
From each of the four corner points of the bounding box Pd1, Pd2, Pd3, Pd4, we compute the two distances to the two lines L1, L2, by using the “distance between point and line” formula. The signed distance, d, between a point at (X, Y) and a line passing through (Xq, Yq) with angle θ is
d=(X−Xq)sin θ−(Y−Yq)cos θ
One side of the line will have positive distance, and the other side, negative distance. If reversed sign is desired, then,
d=−(X−Xq)sin θ+(Y−Yq)cos θ.
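The bounding box step and the signed distance formula can be written as a short Python sketch. The function names are hypothetical, and angles are assumed to be in radians.

```python
import math


def bounding_box(points):
    """Smallest axis-aligned rectangle covering the given corner points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def signed_distance(x, y, xq, yq, theta):
    """d = (X - Xq) sin(theta) - (Y - Yq) cos(theta).

    The sign tells which side of the line the point is on; negate the
    result if the reversed sign convention is wanted.
    """
    return (x - xq) * math.sin(theta) - (y - yq) * math.cos(theta)
```

For a line through the origin with θ = 0, a point at (5, 2) has distance −2 and a point at (5, −3) has distance +3, so the two sides of the line carry opposite signs.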
Source coordinates of Ps1, Ps2, Ps3 and Ps4 are calculated based on these distance equations as listed in
The resulting 4 points form another parallelogram Ps1, Ps2, Ps3, Ps4, 702.
In the next step, 912, we can compute the inner loop scanning vector: u, 704, and the outer loop scanning vector: v, 706, from Ps1, Ps2, Ps3, Ps4. The equations for u and v are listed in
In adding star highlighting to the output video image, the color of the pixels located in the area of the highlight is mixed with white. The mixing ratio of a particular pixel is decided based on the value of a polynomial function evaluated at the pixel location. Near the center of the highlight, the mixing ratio is very high, so a maximum amount of white is used in the color mixing process. Near the edge of the highlight, a lower mixing ratio is used to give a slow fading of the highlight.
The result 1007 is produced without parallelogram warping. If we apply parallelogram warping to the hyperbola 1005, 1006, the four-arm highlight 1007 can be rotated, sheared, and resized. Such a result may be obtained by simply changing the initialization values of the polynomial function and requires no extra processing during rendering of the output video.
When we compute a polynomial function for every pixel of a full screen video frame, incremental polynomial evaluation hardware may be used for the entire video frame. If polynomial functions A and B need to be evaluated for different and non-overlapping portions of the screen area, it is more efficient to use the same incremental polynomial evaluation hardware to evaluate each of them in its own screen area. Let's call this method of sharing polynomial evaluation hardware for non-overlapping screen areas the “multi-polynomial evaluation method”.
Many video special effects have different polynomial functions covering different screen areas, and their screen areas do not overlap. The bounding box-based approach illustrated in
If a video special effect is symmetrical across the vertical mirror line illustrated in
However, in case of an arbitrary parallelogram scanning order, this increment-decrement technique will not work, as illustrated in
In
Under the illustrated embodiment of the invention,
For every inner loop, there is a sample point called a “switch point”. The switch point is defined as the first sample point that evaluates polynomial B, so it is the start of the inner loop for polynomial B. For each inner loop, all sample points on the left side of the switch point evaluate polynomial A, and the switch point and all sample points on its right side evaluate polynomial B.
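The role of the switch point within one inner loop can be sketched in Python. The sketch assumes second-order polynomials expressed along the inner loop index and a single shared set of accumulators that is reloaded with polynomial B's values at the switch point; the names are hypothetical, and the reload by direct evaluation stands in for values that would be prepared by the outer loop.

```python
def init_state(p, i0):
    """Value, first difference and second difference of p at index i0."""
    return [p(i0), p(i0 + 1) - p(i0), p(i0 + 2) - 2 * p(i0 + 1) + p(i0)]


def inner_loop(poly_a, poly_b, switch_count, length):
    """One inner loop: polynomial A before the switch point, B from it onward."""
    state = init_state(poly_a, 0)
    out = []
    for i in range(length):
        if i == switch_count:
            state = init_state(poly_b, i)  # polynomial B's inner loop starts here
        out.append(state[0])
        state[0] += state[1]  # incremental evaluation, additions only
        state[1] += state[2]
    return out
```

A Switch Count of zero makes the whole inner loop evaluate B, and a Switch Count at or beyond the end of the loop makes it evaluate A only.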
Inner loop operations of polynomial function A start at the beginning of the inner loop, and follow the algorithm in
However, their outer loop calculations are different. Outer loop operation of polynomial function A follows the algorithm in
In general, when the outer loop of polynomial B follows an arbitrary dividing line, 1305, two alternative updates of the outer loop exist, and the choice depends on which of the two alternatives is closer to the line 1305. Let's call the two alternative coordinate changes of the inner loop starting point the v1 vector and the v2 vector. The v1 vector and v2 vector are each the sum of a single v vector and an integer multiple of the u vector. The integer multiples of the u vector for v1 and v2 always differ by one. Before an outer loop operation, one must decide which coordinate change to use during the outer loop operation.
Let's call Switch Count the sequence number of the switch point within the particular inner loop.
For the next inner loop, 1409, the switch point can be one of the two alternative points P1, 1406, or P2, 1407. Vector v1, 1403, represents the coordinate change between two switch points P and P1. Vector v2, 1404, represents the coordinate change between two switch points P and P2.
In this example,
v1=v+2u
v2=v+3u
Let's refer to the two alternative coordinate changes of the switch points as direction dir1 and direction dir2 respectively. The Switch Count of the sample point P1 is the sum of the current Switch Count N and ΔSWC1. The Switch Count of sample point P2 is the sum of the current Switch Count N and ΔSWC2. The decision of which switch point to use depends on the distance, d1, 1421, between the sample point P1 and the dividing line, and the distance, d2, 1422, between sample point P2 and the dividing line. The switch point closer to the dividing line (i.e. having the smaller distance) is the next switch point.
The computation of d1 and d2 for the next switch point of the next inner loop is done incrementally from the current switch point of the current inner loop. Δd1 and Δd2 are the incremental values for d1 and d2 respectively.
If P1 is closer to the dividing line 1410, the outer loop intermediate values associated with vector v1 should be used for the outer loop operation. Otherwise, the outer loop intermediate values associated with vector v2 should be used.
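The direction decision can be sketched in Python as follows. The sketch assumes v1 = v + k·u and v2 = v + (k + 1)·u, so that ΔSWC1 = k and ΔSWC2 = k + 1, takes the sine and cosine of the dividing line angle as inputs, and compares absolute values of the signed distances; the tie-breaking rule and all names are assumptions made for illustration.

```python
def line_delta(w, sin_t, cos_t):
    """Change in signed distance to the dividing line when moving by vector w."""
    return w[0] * sin_t - w[1] * cos_t


def track_switch_points(d0, n0, u, v, k, sin_t, cos_t, rows):
    """Follow the dividing line for `rows` outer loop steps.

    Returns a list of (Switch Count, signed distance) pairs, one per inner loop.
    """
    v1 = (v[0] + k * u[0], v[1] + k * u[1])
    v2 = (v[0] + (k + 1) * u[0], v[1] + (k + 1) * u[1])
    dd1 = line_delta(v1, sin_t, cos_t)  # Δd1, constant
    dd2 = line_delta(v2, sin_t, cos_t)  # Δd2, constant
    d, n = d0, n0
    history = [(n, d)]
    for _ in range(rows):
        d1, d2 = d + dd1, d + dd2       # incremental distance updates
        if abs(d1) <= abs(d2):          # dir1: P1 is closer (ties go to dir1)
            d, n = d1, n + k
        else:                           # dir2: P2 is closer
            d, n = d2, n + k + 1
        history.append((n, d))
    return history
```

With u = (2, 0), v = (−5, 1) and k = 2, the alternatives are v1 = v + 2u and v2 = v + 3u, and for a vertical dividing line the selected switch points stay within one unit of the line on every inner loop.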
Inner loops between 1504 and 1510 evaluate polynomial A for all samples from the start of the inner loop up to the dividing line, and evaluate polynomial B for the rest of the inner loop. Let's call these inner loops the “multi-polynomial area”, 1508.
Computing both A and B requires a u vector and a v vector for polynomial A, and the same u vector and three v vectors for polynomial B. The three v vectors are v, v1, 1403, and v2, 1404, and each has its own outer loop intermediate value computation.
For polynomial B inside the “multi-polynomial area”, all outer loops need to determine the distances from the two alternative switch points to the dividing line. Furthermore, due to the two alternative outer loop vectors, additional intermediate values are computed based on additional constant values.
For polynomial function A, its evaluation in
For polynomial function B outside the “multi-polynomial area”, its evaluation in
To evaluate polynomial B inside the “multi-polynomial area”, its inner loop is computed by 1604 based on the inner loop vector u. Due to the two alternative outer loop vectors v1, v2 inside the “multi-polynomial area”, it is necessary to compute additional outer loop intermediate values. If the next switch point follows direction dir1, its “outer loop add” is computed by 1605 based on outer loop vector v1. In this case, both Δv values are updated: Δv1 and Δv2 are incremented by “Δv1Δv1” and “Δv2Δv1” respectively. In addition, OldΔu is incremented by “ΔuΔv1”.
If the next switch point follows direction dir2, its “outer loop add” is computed by 1606 based on outer loop vector v2. In this case, both Δv values are also updated: Δv1 and Δv2 are incremented by “Δv1Δv2” and “Δv2Δv2” respectively. OldΔu is incremented by “ΔuΔv2”. “Δv1Δv1”, “Δv2Δv1”, “Δv1Δv2”, “Δv2Δv2”, “ΔuΔv1”, and “ΔuΔv2” are additional constant values needed for polynomial B inside the “multi-polynomial area”. The inner loop starting values are initialized by outer loop transfer, 1608.
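The two alternative “outer loop add” operations can be sketched in Python for a second-order polynomial B. The sketch assumes that each first difference is incremented by the second difference taken along the direction actually stepped, and it derives all constants by direct evaluation; the dictionary keys and function names are hypothetical. For a second-order polynomial the mixed constants “Δv1Δv2” and “Δv2Δv1” have the same numerical value, so a single entry is used for both.

```python
def init_tracker(B, p, u, v1, v2):
    """Intermediate values and constants of B at switch point p."""
    def at(a, b, c=0):
        # B sampled at p + a*v1 + b*v2 + c*u
        x = p[0] + a * v1[0] + b * v2[0] + c * u[0]
        y = p[1] + a * v1[1] + b * v2[1] + c * u[1]
        return B(x, y)

    base = at(0, 0)
    state = {
        "value": base,
        "dv1": at(1, 0) - base,
        "dv2": at(0, 1) - base,
        "du": at(0, 0, 1) - base,
    }
    consts = {
        "dv1dv1": at(2, 0) - 2 * at(1, 0) + base,
        "dv2dv2": at(0, 2) - 2 * at(0, 1) + base,
        "dv1dv2": at(1, 1) - at(1, 0) - at(0, 1) + base,  # equals dv2dv1
        "dudv1": at(1, 0, 1) - at(1, 0) - at(0, 0, 1) + base,
        "dudv2": at(0, 1, 1) - at(0, 1) - at(0, 0, 1) + base,
    }
    return state, consts


def outer_loop_add(state, consts, direction):
    """Advance the switch point along v1 (direction 1) or v2 (direction 2)."""
    if direction == 1:
        state["value"] += state["dv1"]
        state["dv1"] += consts["dv1dv1"]
        state["dv2"] += consts["dv1dv2"]
        state["du"] += consts["dudv1"]
    else:
        state["value"] += state["dv2"]
        state["dv1"] += consts["dv1dv2"]
        state["dv2"] += consts["dv2dv2"]
        state["du"] += consts["dudv2"]
```

Whichever direction is taken, both Δv values remain valid at the new switch point, so the following outer loop operation may again choose either direction.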
The incremental computation used to determine outer loop direction is illustrated in 1609 in
To meet the output video frame rate, all inner loop operations, 1601, 1604, are performed very rapidly using the algorithm in
In
In
In
If test 1722 is true, polynomial A and B are both evaluated for different parts of the inner loop. The method illustrated in
If the direction of the dividing line is closer to dir1, polynomial B performs outer loop operations, 1734, which are listed in 1605 and 1608. If the direction of the dividing line is closer to dir2, polynomial B performs outer loop operations, 1736, which are listed in 1606 and 1608.
In cutting out the portion of video inside a soft-edge heart shape, the pixels of the source video located inside the heart shape are identified, and the colors of these pixels are modified based on the value of a polynomial function evaluated at the pixel locations. Most pixels inside the heart shape are opaque. However, in heart shape cutouts, the closer a pixel is to the edge, the higher its transparency. This gives the heart-shaped cutout a soft-edge border.
Since the heart shape may be rotated by parallelogram warping, its mirror-reflected polynomial function pairs (such as the two-ellipses pair or the two-parabolic-cylinders pair) should be rendered via the multi-polynomial evaluation method. In both cases, the mirror line is the dividing line separating the screen into two areas, each having a different polynomial function. As a result, the two ellipses share the same polynomial computation hardware for the output video. In addition, the two parabolic cylinders share another set of polynomial computation hardware for the output video. Without the multi-polynomial evaluation method, each ellipse or parabolic cylinder needs its own polynomial computation hardware for the output video.
The method to evaluate the outer loops in the “multi-polynomial” area defined in
In
In
Video special effects usually involve several polynomial functions such as the 4-arm highlight example in
In addition, under the illustrated embodiment of the invention, the polynomial processing system uses “state memory” to store polynomial intermediate values, or state of computation. The state memory stores the following:
In order to process inner loop operations of higher order polynomials in one cycle, eight intermediate values are stored in parallel in the state memory. Let's call them state values, S[0] to S[7], 1908.
The Polynomial Engine, 1907, performs all incremental operations during the inner loop and outer loop. The Polynomial Engine and state memory are operated at a rate several times faster than the pixel rate of the output video. As a result, the Polynomial Engine and state memory are capable of processing several inner loop operations of different polynomials sequentially during the processing time of one pixel (one pixel time).
During the inner loop, the Polynomial Engine sequentially receives the inner loop intermediate values for multiple polynomials stored in the state memory through the bus connection, 1903. It sequentially performs the inner loop incremental operations, and returns the results back to the state memory through the bus connection, 1906. During this time, the Address Generator, 1909, generates read addresses and write addresses of inner loop intermediate values and sends the addresses to the state memory. The Address Generator makes sure that only those polynomials whose bounding box contains the current output pixel position are sent to the Polynomial Engine for inner loop operation. The Opcode Generator generates the inner loop opcode. All inner loop operations are performed during one pixel time.
During the non-rendering time between scan lines, a single outer loop operation is performed for each of the rendered polynomials. The Polynomial Engine retrieves outer loop intermediate values for the polynomials stored in the state memory through the connection, 1903. It sequentially performs the outer loop incremental operations and transfer operations, and returns the results back to the state memory through the connection, 1906. During this time, the Address Generator, 1909, generates read addresses and write addresses of intermediate values and sends the addresses to the state memory. The Address Generator makes sure that only those polynomials whose bounding box intersects the current scan line are sent to the Polynomial Engine for outer loop operation.
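The sequential sharing of one engine among several polynomials can be sketched in software. The Python fragment below is only a behavioral model under stated assumptions: second-order polynomials, inclusive bounding boxes, initialization by direct evaluation at the top-left corner of each box, and hypothetical class and method names. It does not model the bus connections, opcodes, or timing.

```python
class PolyState:
    """Intermediate values of one second-order polynomial inside its bounding box."""

    def __init__(self, f, box):
        self.box = box  # (x_min, y_min, x_max, y_max), assumed inclusive
        x0, y0 = box[0], box[1]
        g = lambda i, j: f(x0 + i, y0 + j)
        self.start = g(0, 0)
        self.du = g(1, 0) - g(0, 0)
        self.dv = g(0, 1) - g(0, 0)
        self.dudu = g(2, 0) - 2 * g(1, 0) + g(0, 0)
        self.dvdv = g(0, 2) - 2 * g(0, 1) + g(0, 0)
        self.dudv = g(1, 1) - g(1, 0) - g(0, 1) + g(0, 0)
        self.value = self.step = 0

    def on_line(self, y):
        return self.box[1] <= y <= self.box[3]

    def covers(self, x, y):
        return self.on_line(y) and self.box[0] <= x <= self.box[2]

    def transfer(self):      # outer loop transfer
        self.value, self.step = self.start, self.du

    def inner(self):         # inner loop operation
        out = self.value
        self.value += self.step
        self.step += self.dudu
        return out

    def outer_add(self):     # outer loop add
        self.start += self.dv
        self.dv += self.dvdv
        self.du += self.dudv


def run_engine(polys, width, height):
    """Run all polynomials through one shared engine, one after another."""
    results = {}
    for y in range(height):
        active = [k for k, p in enumerate(polys) if p.on_line(y)]
        for k in active:
            polys[k].transfer()
        for x in range(width):
            for k in active:  # sequential inner loop operations in one pixel time
                if polys[k].covers(x, y):
                    results[(k, x, y)] = polys[k].inner()
        for k in active:      # outer loop operations between scan lines
            polys[k].outer_add()
    return results
```

Polynomials whose bounding box does not contain the current pixel, or does not intersect the current scan line, consume no engine operations in this model.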
The Opcode Generator generates opcodes such as inner loop, outer loop, and memory copy. The results of the polynomial evaluation are available at 1932 during the inner loop operation. Additional outputs such as the direction flag and the Switch Count are available at 1928.
During the time between two video fields, the state memory is reloaded with new polynomial initialization data for the next video field. During the reloading, data is delivered via the input 1926. Output 1930 is used to observe the content of the state memory during testing.
Bounding box information does not change from pixel to pixel, and is stored in register, 1920. It is used by the Address Generator and the Opcode Generator to control inner loop and outer loop operations depending on whether the current pixel is in the bounding box or not.
S[0] to S[7], 2201, in
A “single state value copying” micro-operation from one memory location to another, or in short, a “transfer” micro-operation, is shown as an arrow, such as 2205. A “memory copying” micro-operation of all state values from one memory location to another in a single cycle is shown as a thick arrow, 2210.
A memory location called “inner loop shadow”, 2206, holds the inner loop starting values. Inner loop shadow memory location is used when performing single cycle re-initialization during rendering by a memory copying micro-operation. At the end of outer loop add operations, 2204, the inner loop shadow is updated by eight transfer micro-operations such as 2205 and 2215. Once updated, it is ready for a single cycle re-initialization, 2210.
Under the illustrated embodiment of this invention, a dividing line is used to partition the video screen into two areas, one for each polynomial function. As a result, near the dividing line, two alternative directions need to be explored to determine which scan point is the one dividing an inner loop between the two polynomial functions. For each of the two possible directions, a new distance and a new Switch Count need to be calculated. The adders 2246 and 2247 in
Once the direction is determined, both state values for storing the distance should include the corrected distance. In addition, both state values for storing the Switch Count would include the corrected Switch Count. In case of direction 1, this is done by two “transfer” micro-operations: (1) transferring the corrected distance from S[0] to S[2] and (2) transferring the corrected Switch Count from S[4] to S[6], as shown in 2224. In case of direction 2, this is also done by two “transfer” micro-operations: (1) transferring the corrected distance from S[2] to S[0] and (2) transferring the corrected Switch Count from S[6] to S[4], as shown in 2222. Each of these “transfer” micro-operations is performed the same way as 2205 in
When the adders are used for incremental computation, the two operands of each adder are aligned with an offset of several bit positions. This is necessary since a delta value is added to an accumulator value hundreds of times during the inner loop and outer loop. These delta values, labeled “A” on the inputs of the multiplexers 2020 to 2032 (for example, 2019), are typically much smaller than the accumulated sum values. The delta values need more bits to represent their fraction part and fewer bits to represent their integer part. Therefore, the higher bits of the delta value should be aligned to the lower bits of the accumulated sum values. This way, the portion of state memory that stores the delta values can have a different decimal point position than the portion of state memory that stores the accumulated sum values.
The adder 2004 can also be used to compare distances within a multi-polynomial area to determine the direction dir1 or direction dir2 as described in
This operand is usually either zero or a delta value. The two multiplexers 2034 and 2036 select the other operand of the adders 2004 and 2012. The multiplexers 2040 to 2054 determine what data is to be stored back to the state memory. The multiplexer 2068 and the Temp Register 2063 provide a data path to transfer any single state value from any location in the state memory for use with the state values of another memory location, by temporarily storing the state value in the Temp Register 2063.
An external input, 2064, is used to load the polynomial data during the state memory initialization. It inputs one state value at a time. An external output, 2066, is used to observe the content of the state memory during testing. The polynomial evaluation results are available at 2056 to 2062. Sometimes, other values are outputted, e.g. during the use of the multi-polynomial evaluation method. Switch Count output is available at 2058.
Multiplexers 2020 to 2032 in
The following explains in detail how the polynomial engine performs five operations.
1. To perform seventh-order polynomial inner loop micro-operation, 2202, or outer loop micro-operation, 2212, in
2. To perform seventh-order polynomial outer loop micro-operation, 2214, in
3. To perform “transfer” operation 2205 in
In the “From S[0]” step, the S[0] state value from memory location [i+5] is stored in the temporary register. This requires the coordinated operation of the multiplexers. Mux1, Mux2 and the adder ensure that S[0] reaches Mux3, 2082. Mux3 selects S[0] and stores it in the Temp Register, 2084.
At the following clock cycle, the “To S[4]” step is performed. The “Inner Loop Shadow” memory location is read; its state values are at the inputs of Mux1 and Mux2. In this step, Mux2, 2078, selects zero to ensure that S[0] to S[7] are not altered by any adder. Mux4, 2086, selects S[0] to S[7], i.e. the original state values, from the outputs of the seven adders, 2080, except for S[4]. To replace S[4], Mux4 selects the output of the Temp Register, 2084, which holds the S[0] state value of the memory location [i+5]. When the data is written back to the “Inner Loop Shadow” memory location, the transfer operation is complete.
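The two-step transfer just described can be sketched in Python as follows. This is only an illustrative model: the dictionary-based `memory`, the location names, and the function name `transfer` are hypothetical, and the multiplexers and adders are modeled as plain list operations rather than hardware.

```python
def transfer(memory, src_loc, src_idx, dst_loc, dst_idx):
    """Copy one state value between memory locations in two steps."""
    # "From" step: read the source location, add zero so every value passes
    # through the adders unchanged, and latch one value in the temp register.
    passed = [s + 0 for s in memory[src_loc]]
    temp_register = passed[src_idx]

    # "To" step: read the destination location, again add zero, replace a
    # single state value with the temp register, and write the row back.
    passed = [s + 0 for s in memory[dst_loc]]
    passed[dst_idx] = temp_register
    memory[dst_loc] = passed
    return memory
```

For example, calling `transfer(memory, "i+5", 0, "inner_loop_shadow", 4)` models the “From S[0]” and “To S[4]” steps, leaving all other state values at the destination unchanged.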
4. To perform “memory copying” operation, 2210, in
5. To determine the direction flag and the next Switch Count, 2220, in
Subtraction operation, 2250, finds the direction closer to the dividing line and stores the direction flag in a register. The adder 2004 in
The second stage, 2105 and 2106, performs a min or max operation after the first stage. The SC2 register, 2106, stores the current min or max value of the second stage. Its value is compared against the result of the first stage.
The shape combination module is controlled according to an equation specified by each video special effect, such as the one listed in
As the six polynomial functions are evaluated sequentially, their values arrive at the input, 2101, of the shape combination module in this order: A, B, C, D, E and F.
As A arrives at input, A is stored in SC1 register.
As B arrives at input, B is compared against SC1 register, and Max(A,B) is stored in SC1 register.
As C arrives at input, C is compared against SC1 register, and Max(A,B,C) is stored in SC1 register.
As D arrives at input, Max(A,B,C) is stored in SC2 register, and D is stored in SC1 register.
As E arrives at input, E is compared against SC1 register, and Max(D,E) is stored in SC1 register.
As F arrives at input, F is compared against SC1 register, and Max(D,E,F) is stored in SC1 register.
In the next cycle, Max(D,E,F) from the SC1 register is compared against the SC2 register. The output, 2109, is the smaller of the two: Min(Max(A,B,C), Max(D,E,F)).
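The register sequence above can be traced with a short Python sketch. The function name `shape_combine` and the use of ordinary variables for the SC1 and SC2 registers are illustrative assumptions; the sketch follows only the six-value example given here, not a general control scheme.

```python
def shape_combine(values):
    """Trace the SC1/SC2 register updates for six sequential inputs."""
    a, b, c, d, e, f = values
    sc1 = a                # A arrives: stored in SC1
    sc1 = max(sc1, b)      # B arrives: SC1 holds Max(A,B)
    sc1 = max(sc1, c)      # C arrives: SC1 holds Max(A,B,C)
    sc2 = sc1              # D arrives: Max(A,B,C) moves to SC2 ...
    sc1 = d                # ... and D is stored in SC1
    sc1 = max(sc1, e)      # E arrives: SC1 holds Max(D,E)
    sc1 = max(sc1, f)      # F arrives: SC1 holds Max(D,E,F)
    return min(sc1, sc2)   # next cycle: the smaller of SC1 and SC2
```

With inputs 1, 5, 3, 2, 4, 0, the first group yields 5 and the second yields 4, so the sketch returns 4.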
In the case of highlight example in
A gradient color is a smooth transition from one color to another color. The smooth variation of a polynomial function may be used to produce a gradient color. The color of a particular pixel located inside the face of the characters of the word is decided based on the value of a polynomial function evaluated at the pixel location. The gradient color on the character faces may change as the parameters of the polynomial function change over time.
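As a minimal sketch of this idea, the Python below maps a polynomial value at a pixel to a color between two end colors. The first-order polynomial, the clamping of the value to the range 0 to 1, and the linear color mix are all assumptions made for illustration; the names `linear_poly` and `gradient_color` are hypothetical.

```python
def linear_poly(x, y, coeffs):
    """Evaluate an assumed first-order polynomial a*x + b*y + c at a pixel."""
    a, b, c = coeffs
    return a * x + b * y + c


def gradient_color(value, color_a, color_b):
    """Mix two RGB colors using a polynomial value clamped to [0, 1]."""
    t = min(max(value, 0.0), 1.0)
    return tuple(round((1 - t) * ca + t * cb) for ca, cb in zip(color_a, color_b))
```

Changing the coefficients passed to `linear_poly` from frame to frame would shift or rotate the gradient across the character faces over time.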
In this illustration, the face pixels of the word are identified by a “face bitmap” stored in video source N, 18 in
Adding an animated word filled with a gradient color to the output video may be done by evaluating two sets of polynomials. The first set of polynomials defines the texture mapping for the face bitmap and applies parallelogram warping to it to produce animation. The second set of polynomials defines the gradient color. Instead of obtaining color from a source video, the polynomial processing system in
The controller 12 in
Scan lines in the range 2324 evaluate polynomials for both the highlight effect and the heart shape effect. Near the edge of the heart shaped object, 2304, its polynomial value is used to blend the source video inside the heart shape, 2317, with the background video, 2315.
Since the star highlight, 2302, is at the top layer, its polynomial value is used to blend white color with the heart shaped object, 2304, wherever they overlap. In other areas, the white color is blended with the background video, 2315.
Scan lines in the range 2328 evaluate polynomials for both the heart shape effect and the gradient colored word effect. Since the gradient colored word effect, 2306, is at the layer above the heart shaped object, only the gradient colored word is rendered wherever they overlap.
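The layering of the heart shape and the star highlight can be sketched per pixel in Python. This is a simplified illustration: `heart_opacity` and `highlight_ratio` stand in for values derived from the respective polynomials and are assumed to lie between 0 and 1, and the linear mixing formula and function names are assumptions rather than the described hardware behavior.

```python
WHITE = (255.0, 255.0, 255.0)


def blend(top, bottom, ratio):
    """Mix two RGB colors; ratio 1.0 gives only `top`, 0.0 only `bottom`."""
    return tuple(ratio * t + (1.0 - ratio) * b for t, b in zip(top, bottom))


def composite_pixel(background, heart_video, heart_opacity, highlight_ratio):
    """Blend the heart layer over the background, then the highlight on top."""
    # Lower layer: the heart-shaped video insert over the background video.
    heart_layer = blend(heart_video, background, heart_opacity)
    # Top layer: white highlight mixed with whatever lies beneath it.
    return blend(WHITE, heart_layer, highlight_ratio)
```

Outside the heart shape, `heart_opacity` would be zero, so the highlight is mixed directly with the background video.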
An alternative way to increase rendering efficiency is to apply the “multi-polynomial evaluation method” by using a dividing line such as 2316. The polynomials for the gradient colored word effect are evaluated on one side of this dividing line, while the polynomials for the highlight effect are evaluated on the other side of this dividing line.
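The selection by dividing line can be sketched in Python as below. This is a deliberately simplified model: the line is assumed to be given by coefficients of a*x + b*y + c, the side is chosen by the sign of that expression, and the function name and callable polynomials are hypothetical. It does not model the direction flag or Switch Count mechanism.

```python
def evaluate_by_side(x, y, line, poly_positive, poly_negative):
    """Evaluate only the polynomial assigned to the pixel's side of the line."""
    a, b, c = line
    side = a * x + b * y + c
    if side >= 0:
        return poly_positive(x, y)
    return poly_negative(x, y)
```

Only one polynomial is evaluated per pixel in this sketch, which is the source of the efficiency gain when the two effects never appear on the same side of the line.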
If we add more polynomial functions inside several isolated bounding boxes such as 2340, 2342, 2344, no additional polynomial evaluation hardware is needed. These additional highlights, 2340, 2342 and 2344, can share the same polynomial evaluation hardware with the existing polynomials for the heart shape, 2304, the large highlight, 2302, and the word, 2306.
To add all the video special effects to the output video in
In this illustration, the pipelined polynomial processing system consists of three polynomial engines, 2410, 2414, 2418, with pipeline registers, 2412, 2416, between the polynomial engines. During the inner loop operation, the polynomial's inner loop intermediate values are retrieved from the state memory and are used to evaluate the polynomial function for three consecutive pixel positions before being written back to the state memory. Three shape combination modules are used. Each is responsible for combining the values of different polynomials sequentially for a single pixel. As a result, the pipelined polynomial processing system increases the throughput of the polynomial computation by generating three sets of “shape combined” polynomial values simultaneously.
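The effect of chaining three engines between one state memory read and one write can be sketched in Python. The sketch assumes each engine performs one incremental update in which every state value is increased by the next higher state value; that update rule, the function names, and the read counter are illustrative assumptions.

```python
def engine_step(state):
    """One assumed incremental update: each value adds the next higher one."""
    return [state[k] + state[k + 1] for k in range(len(state) - 1)] + [state[-1]]


def pipelined_inner_loop(state, n_pixels, engines=3):
    """Produce pixel values, reading state memory once per group of engines."""
    outputs = []
    memory_reads = 0
    while len(outputs) < n_pixels:
        s = state              # read intermediate values from state memory
        memory_reads += 1
        for _ in range(engines):
            outputs.append(s[0])   # each engine outputs one pixel's value
            s = engine_step(s)     # and passes updated values to the next
        state = s              # write back after the last engine
    return outputs[:n_pixels], memory_reads
```

For the cubic x**3 with starting values [0, 1, 6, 6], six pixel values are produced with only two state memory reads in this model.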
A specific embodiment of a method and apparatus for rendering images has been described for the purpose of illustrating the manner in which the invention is made and used. It should be understood that the implementation of other variations and modifications of the invention and its various aspects will be apparent to one skilled in the art, and that the invention is not limited by the specific embodiments described. Therefore, it is contemplated to cover the present invention and any and all modifications, variations, or equivalents that fall within the true spirit and scope of the basic underlying principles disclosed and claimed herein.