The following detailed description may be best understood when taken in conjunction with the accompanying drawings, of which:
The following description relates to the rendering of surfaces directly in terms of their polynomial representations. By rendering surfaces directly, as opposed to with an approximating triangle mesh, the techniques described herein avoid tessellation artifacts and the need for LOD management. To perform such direct rendering, the computational strengths of the graphics hardware are leveraged by associating, as vertex attributes, the unique elements of a symmetric tensor that represents the coefficients of the polynomial that defines the surface to be rendered within a volume of space. Additionally, because the polynomial coefficients of each surface element are resolution-independent, as opposed to the more conventionally used triangle mesh, which is resolution-dependent, the memory and bandwidth requirements are reduced.
The techniques described herein address the rendering of shapes described by polynomials of any order. Polynomials up to fourth order admit straightforward mechanisms for identifying their zeros, while higher order polynomials can require more computationally expensive mechanisms. These shapes are considered within bounding Bézier tetrahedra and are rendered in a piecewise fashion, one tetrahedron at a time. The rendering techniques are able to model piecewise smooth surfaces by taking advantage of this restriction to a bounded tetrahedron combined with continuity conditions between adjacent tetrahedra.
Although not required, the description below will be in the general context of computer-executable instructions, such as program modules, being executed by a computing device. More specifically, the description will reference acts and symbolic representations of operations that are performed by one or more computing devices or peripherals, unless indicated otherwise. As such, it will be understood that such acts and operations, which are at times referred to as being computer-executed, include the manipulation by a processing unit of electrical signals representing data in a structured form. This manipulation transforms the data or maintains it at locations in memory, which reconfigures or otherwise alters the operation of the computing device or peripherals in a manner well understood by those skilled in the art. The data structures where data is maintained are physical locations that have particular properties defined by the format of the data.
Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the computing device need not be limited to conventional personal computers, and includes other computing configurations, including hand-held devices, multi-processor systems, microprocessor based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like. Similarly, the computing device need not be limited to a single computing device as the mechanisms may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
With reference to
Of relevance to the descriptions below, the computing device 100 also includes graphics hardware, including, but not limited to, a graphics hardware interface 190 and a display device 191. The graphics hardware interface 190 can be compatible with the system bus 121 for communication with the central processing unit 120, system memory 130, or any other component or peripheral of the computing device 100. In addition, the graphics hardware interface 190 can comply with one or more display interfaces for connection to the display device 191. Traditionally, the graphics hardware interface 190 is a graphics card installed within the computing device 100, though it can also be a graphics chipset, an external graphics adapter or any other type of interface. The display device 191 can be any type of display device compatible with the graphics hardware interface 190, including, but not limited to, a monitor, such as a cathode ray tube (CRT) monitor or a liquid crystal display (LCD) monitor, a projector, eyewear, such as virtual monitor eyewear or three dimensional imaging eyewear, or any other graphics display hardware. The structure and operation of the graphics hardware interface 190 will be further described below with reference to
The computing device 100 also typically includes computer readable media, which can include any available media that can be accessed by computing device 100 and includes both volatile and nonvolatile media and removable and non-removable media. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the computing device 100. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.
The system memory 130 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 131 and random access memory (RAM) 132. A basic input/output system 133 (BIOS), containing the basic routines that help to transfer information between elements within computing device 100, such as during start-up, is typically stored in ROM 131. RAM 132 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 120. By way of example, and not limitation,
The computing device 100 may also include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only,
The drives and their associated computer storage media discussed above and illustrated in
The computing device 100 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computing device 180. The remote computing device 180 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computing device 100. The logical connection depicted in
Turning to
The graphics memory 250 can comprise some of the programmatic routines, instructions and objects used by the graphics processing unit (GPU) 225. The graphics memory 250 can also comprise a buffer 260 for use in rendering three dimensional representations in a manner described further below, and can comprise a texture cache 240. The GPU 225 can utilize the graphics memory 250 to perform the appropriate processing of image data; the processing generally being componentized as illustrated in
Initially, display procedures 205, which describe an image to be rendered, are input into vertex shader units 210, which generate representations of the image within the context of a series of vertices of triangles comprising a triangle mesh that approximates the image to be rendered. The vertex shader units can utilize parallel computing techniques to more efficiently represent the image within the triangular mesh framework, as much of the computation involved can be performed in parallel. The vertex data is then input into a rasterizer, which interpolates the data between the vertices to develop a sample set of points in image space, which can then be shaded and have texture added to them. These points are then passed to a series of pixel shader units 230 which perform shading of the points, as well as adding and manipulating textures. Like the vertex shader units 210, the pixel shader units 230 can utilize parallel computing techniques to more efficiently perform the shading and texture manipulation, since much of the computation involved can be performed in parallel. Pixel shader unit computation is frequently performed under the control of pixel shader programs, which are GPU-executable programs written to take advantage of the pixel shader units.
Textures, which can be pre-loaded into graphics memory 250, are cached in the texture cache 240. In various implementations, textures can map directly to illustrated shapes, or can alternatively be used as canonical texture spaces, which are not necessarily directly tied to an image space. Once processing is complete, the image points can then be placed in a buffer 260, such as a frame buffer, which holds color values, or a z-buffer, which holds pixel depth values along the z-axis (which, as used herein, corresponds to the viewing direction). These buffers allow for resolution of hidden surfaces by enabling the determination of which surfaces are in front and which are behind. Thus, when colors are computed by the pixel shader units, along with a depth along the z-axis for that color, the newly-computed depth can be compared to a corresponding value in the z-buffer. Then, if the depth value is less than that in the z-buffer, the newly-computed color is stored in the proper frame buffer location. However, if the depth value is greater than that in the z-buffer, the newly-computed color is behind the color value already in the buffer, and it is, therefore, ignored. In an alternative implementation (not illustrated), the image points can be written to a buffer which is kept in the graphics memory in order to increase pixel-writing speed.
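By way of illustration only, the depth comparison described above can be sketched as follows. The function and variable names are hypothetical, the buffers are modeled as plain nested lists, and smaller depth values are assumed to be nearer the viewer; actual graphics hardware performs this test in fixed-function logic.

```python
def make_buffers(width, height, background=(0, 0, 0)):
    """Create a frame buffer of colors and a z-buffer initialized to 'infinitely far'."""
    frame = [[background] * width for _ in range(height)]
    zbuf = [[float("inf")] * width for _ in range(height)]
    return frame, zbuf


def depth_test_write(frame, zbuf, x, y, color, depth):
    """Store the color only if its depth is less than the depth already recorded."""
    if depth < zbuf[y][x]:
        zbuf[y][x] = depth
        frame[y][x] = color
        return True
    return False  # the new color lies behind the stored one and is ignored


frame, zbuf = make_buffers(2, 2)
depth_test_write(frame, zbuf, 0, 0, (255, 0, 0), 0.8)  # first color at this pixel: stored
depth_test_write(frame, zbuf, 0, 0, (0, 255, 0), 0.3)  # nearer: replaces the first color
depth_test_write(frame, zbuf, 0, 0, (0, 0, 255), 0.5)  # behind depth 0.3: ignored
```

After the three writes, the pixel holds the color whose depth was smallest, regardless of the order in which the colors were computed.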
Programmable graphics hardware enables third-party programmers to substitute their own code and routines for the default vertex shader units 210 and the default pixel shader units 230. Sufficiently advanced graphics hardware interfaces can accept display procedures 205 that specify an image to be rendered within the context of a collection of three-dimensional tetrahedra. In such a case, the procedures described below can be performed by the vertex shader units 210 as executed by the GPU 225. In an alternative embodiment, where the graphics hardware interface accepts display procedures 205 that specify an image to be rendered within the context of a collection of two-dimensional triangles, the procedures described below can be performed by vertex shader units as executed by the CPU 120 of the computing device 100. In such a case, the output of the vertex shader units can be passed to the graphics hardware interface 190, and thereby to the rasterizer 220, via the system bus 121 and the system bus interface 235.
Complex three-dimensional surfaces can be represented as a piecewise smooth collection of mathematically simpler surfaces within small contiguous regions. Turning to
Also illustrated are three tetrahedra 321, 322 and 323 that represent a small volume of space, a quantum of space, within which the surface 310 may be described in a mathematically simpler fashion. Thus, as shown, tetrahedra 321, 322 and 323 divide surface 310 into subsections 311, 312 and 313, respectively, each of which is bounded by the edges of a tetrahedron. The divided subsections do not need to be similarly shaped or sized.
Because the tetrahedra share common sides, and because the bounded surfaces can be defined to represent a contiguous surface over the tetrahedral boundaries, the entire surface appears whole when the three bounded surfaces are rendered together. Thus, for graphical display purposes, the rendering of subsections 311, 312 and 313 will result in the rendering of the surface 310. When rendering a shape as the composition of a plurality of tetrahedra, the shape may be divided after definition according to a partitioning method, or the shape may be defined at the outset as a plurality of different mathematically defined shapes, each bound in a tetrahedron, that are situated so as to appear to be a contiguous whole upon rendering.
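The rendering techniques described herein take advantage of tetrahedra partitioning by representing surfaces in Bézier tetrahedral form. A Bézier tetrahedral form of a surface is defined within a bounding tetrahedron. For example, if T is a tetrahedron with four vertices $v_i = [x_i\ y_i\ z_i\ 1]$ (for i = 0, 1, 2, 3), T can be encoded in a matrix whose rows are the vertices:
$$T = \begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} x_0 & y_0 & z_0 & 1 \\ x_1 & y_1 & z_1 & 1 \\ x_2 & y_2 & z_2 & 1 \\ x_3 & y_3 & z_3 & 1 \end{bmatrix}$$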
Given such a T, an algebraic surface of degree d can be defined by T as:
$$\sum_{i+j+k+l=d} b_{ijkl}\,\frac{d!}{i!\,j!\,k!\,l!}\,r^i s^j t^k u^l = 0 \qquad (1)$$
where $b_{ijkl}$ are scalar-valued Bézier coefficients, or “weights,” that control the shape of the surface and r, s, t and u are barycentric coordinates of a point in space with respect to the tetrahedron T, defined above. The tetrahedron T is generally called a “world space domain tetrahedron.” As will be known by those skilled in the art, barycentric coordinates sum to 1. In other words, r, s, t and u are barycentric coordinates in that r + s + t + u = 1.
When r, s, t and u are all positive and satisfy Equation (1), the surface to be rendered is inside the tetrahedron defined by T. Furthermore, each weight $b_{ijkl}$ is associated with a position in space, given by $(i\,v_0 + j\,v_1 + k\,v_2 + l\,v_3)/d$.
Turning to
The advantage of using a Bézier tetrahedral form for shape rendering is that solutions where r, s, t, u ∈ [0, 1] are guaranteed to lie within the convex hull of the tetrahedron T. This restriction to a tetrahedron has several benefits. For example, it has been shown that it is possible to state simple explicit continuity conditions between the weights of a pair of adjacent tetrahedra such that the composite surface is continuous or smooth. Dealing with tetrahedral elements in a graphics system also enables view frustum culling, as well as extent and interference testing.
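By way of illustration only, the following sketch computes the barycentric coordinates of a point with respect to a tetrahedron and evaluates a Bézier tetrahedral polynomial at that point. It assumes that the rows of T are the homogeneous vertices, so that [x y z 1] = [r s t u]·T, and that the polynomial is the usual Bernstein form, in which each weight is multiplied by d!/(i! j! k! l!) and the matching powers of r, s, t and u. All function names and sample data are hypothetical.

```python
from math import factorial


def solve(a, b):
    """Solve the square system a * y = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [[float(c) for c in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                f = m[row][col] / m[col][col]
                m[row] = [u - f * v for u, v in zip(m[row], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def barycentric(T, point):
    """Return [r, s, t, u] such that [x, y, z, 1] = [r, s, t, u] * T."""
    columns = [list(col) for col in zip(*T)]  # transpose: solve T^T r^T = x^T
    return solve(columns, list(point) + [1.0])


def bezier_tetra(weights, d, bary):
    """Evaluate the degree-d Bernstein form; weights maps (i, j, k, l) to b_ijkl."""
    r, s, t, u = bary
    total = 0.0
    for (i, j, k, l), b in weights.items():
        multinomial = factorial(d) // (
            factorial(i) * factorial(j) * factorial(k) * factorial(l))
        total += b * multinomial * r**i * s**j * t**k * u**l
    return total


# Hypothetical data: the unit tetrahedron and a degree-1 surface whose
# zero set is the plane x + y + z = 0.5.
T = [[0, 0, 0, 1], [1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1]]
plane = {(1, 0, 0, 0): -1.0, (0, 1, 0, 0): 1.0,
         (0, 0, 1, 0): 1.0, (0, 0, 0, 1): 1.0}
bary = barycentric(T, (0.2, 0.2, 0.1))
value = bezier_tetra(plane, 1, bary)  # approximately zero: the point is on the surface
```

Because every barycentric coordinate of the sample point lies in [0, 1], the point lies within the tetrahedron, and a value of approximately zero indicates that it also lies on the surface.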
Each Bézier tetrahedron has a “blossom,” or “polar form,” that is a symmetric multi-affine map unique to that Bézier tetrahedron. As will be known by those skilled in the art, mathematically, a second order algebraic surface is given by the equation:
$$x\,Q\,x^{T} = 0$$
where x = [x y z 1] is a point represented in homogeneous form and Q is a symmetric matrix. This is known as the “blossom,” or “polar form,” of the polynomial Q. For Bézier tetrahedra greater than second order, a blossom can be evaluated using a symmetric tensor. As will be understood by those skilled in the art, a tensor is a higher dimensional analog of a matrix, where the number of indices indicates the “rank” of the tensor. One advantage of using tensor notation is that blossoms can be evaluated in terms of dot products, which are native operations on many GPUs.
Tensor algebra generalizes the notion of dot product and matrix multiplication to “tensor contraction,” which can be represented using Einstein index notation where contravariant indices are indicated as superscripts and covariant indices as subscripts. An expression that has the same symbol, typically a Greek letter, in a superscript and a subscript indicates that an implied summation is performed over that index.
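By way of illustration only, the implied summation can be sketched with nested lists standing in for tensors. The helper below is hypothetical; it sums over the last index of a tensor, which, for the symmetric tensors considered herein, gives the same result as summing over any other index. The sample matrix is an assumed second order surface, namely a unit sphere.

```python
def contract(tensor, vector):
    """Sum one index of a nested-list tensor against a vector.

    A rank-1 tensor reduces to an ordinary dot product; a higher rank
    tensor is reduced by one rank.
    """
    if isinstance(tensor[0], (int, float)):
        return sum(t * v for t, v in zip(tensor, vector))
    return [contract(sub, vector) for sub in tensor]


# Assumed example: the unit sphere x^2 + y^2 + z^2 - 1 = 0 as a symmetric matrix.
sphere = [[1, 0, 0, 0],
          [0, 1, 0, 0],
          [0, 0, 1, 0],
          [0, 0, 0, -1]]

x = [0.6, 0.8, 0.0, 1.0]                          # a point on the sphere
on_surface = contract(contract(sphere, x), x)     # two contractions: approximately 0
center = [0.0, 0.0, 0.0, 1.0]
at_center = contract(contract(sphere, center), center)
```

Each call removes one index, so a rank d tensor requires d contractions to produce a scalar, and each contraction is itself a collection of dot products.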
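Thus, in tensor notation, Equation (1) can be represented as d contractions
$$r^{\alpha_1} \cdots r^{\alpha_d}\, B_{\alpha_1 \cdots \alpha_d} = 0$$
where B is a symmetric rank d tensor containing the Bézier weights, and r = [r s t u]. The elements of the tensor B are assigned Bézier weights by $B_{\alpha_1 \cdots \alpha_d} = b_{ijkl}$, where i, j, k and l count how many of the indices $\alpha_1, \ldots, \alpha_d$ are equal to 0, 1, 2 and 3, respectively.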
To generate a three-dimensional surface using a display device that displays two-dimensional images, the three-dimensional surface is projected onto a two-dimensional plane. As indicated previously, the surface to be generated is divided into segments bounded by tetrahedra. Thus, a tetrahedron projected onto a two-dimensional plane can act as a basis for determining how to display the three-dimensional surface. Turning to
The plane onto which the tetrahedron is projected is termed “screen space,” and the pixel coordinates of the pixels of the screen are represented as an ordered pair of x and y in a conventional manner. Screen space can be thought of as a four dimensional projective space in which the pixel coordinates [x y] ∈ [−1, 1] × [−1, 1] and the depth z correspond to the value of the fourth dimensional axis, w, being equal to 1.
By the definition of barycentric coordinates, a point in world space, represented by the vector x, can be expressed as $x = r \cdot T$. The composite transform from barycentric coordinates to screen space, where the point in screen space is represented by the vector $\tilde{x}$, can be expressed as $\tilde{x} = r \cdot (T \cdot M)$. Consequently, the barycentric coordinates of a screen space point are $r = \tilde{x} \cdot (M^{-1} \cdot T^{-1}) = \tilde{x} \cdot W$.
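Given the above, the tensor weights of tensor B can be transformed to screen space, yielding a tensor $\tilde{B}$, as follows:
$$\tilde{B}_{\beta_1 \cdots \beta_d} = W^{\alpha_1}_{\beta_1} \cdots W^{\alpha_d}_{\beta_d}\, B_{\alpha_1 \cdots \alpha_d}$$
so that the surface is defined in screen space by
$$\tilde{x}^{\beta_1} \cdots \tilde{x}^{\beta_d}\, \tilde{B}_{\beta_1 \cdots \beta_d} = 0 \qquad (3)$$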
By transforming the bounding tetrahedron vertex matrix T and the weight tensor B into screen space, the viewing rays, for purposes of generating the three-dimensional surface on the display device, become parallel to the z axis. An alternative embodiment contemplates transforming the viewing rays into the barycentric coordinate system of each Bézier tetrahedron. In such a case a univariate equation in the space of the barycentric coordinate system of each Bézier tetrahedron would be solved to generate the three-dimensional surface on the display device.
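By way of illustration only, the transformation of the weight tensor into screen space can be sketched for a second order surface, where the rank 2 tensor is an ordinary symmetric matrix and the transformation reduces to the product W·B·Wᵀ. The sketch assumes the row-vector convention in which the barycentric coordinates of a screen space point are obtained by multiplying the point by W. The matrices below are arbitrary sample values rather than the product of an actual vertex matrix and viewing transform, and all names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def to_screen_space(B, W):
    """Rank-2 case of transforming every index of B by W: W * B * W^T."""
    return matmul(matmul(W, B), transpose(W))


def quadratic_form(v, B):
    """Contract the rank-2 tensor B twice with the vector v."""
    row = matmul([v], B)[0]
    return sum(r * c for r, c in zip(row, v))


# Arbitrary sample values (integers, so the comparison below is exact).
B = [[1, 2, 0, 0], [2, 0, 1, 0], [0, 1, -1, 3], [0, 0, 3, 2]]   # symmetric weights
W = [[1, 0, 0, 0], [0, 2, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1]]   # screen -> barycentric
x_tilde = [1, 2, 3, 1]                                          # a screen space point

B_tilde = to_screen_space(B, W)
r = matmul([x_tilde], W)[0]          # barycentric coordinates of the same point
```

Evaluating the original tensor at r and the transformed tensor at the screen space point yields the same value, so the zero set of the surface can be located entirely in screen space.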
Returning to
Turning to
The divided surface is received at step 610 and at step 615, a tetrahedron is selected for generating the subsection of the surface contained within that tetrahedron. In one implementation, the first tetrahedron may be chosen to be one that is further away from the eye point of a viewer than other tetrahedra. In another implementation, the tetrahedron may be chosen randomly or by other criteria.
Step 620 determines whether the selected tetrahedron is case a or case b, as illustrated in
Turning back to
The surface contained within the selected triangle is rendered at step 635, in a manner that will be described in more detail below with reference to
Turning back to step 635, in order to generate, within the selected triangle, the two-dimensional representation in screen space of a three-dimensional surface, the screen space pixels contained within the triangle are evaluated. For any pixel within the triangle, given by its standard [x y] position, the surface to be generated exists at that pixel when the vector $\tilde{x} = [x\ y\ z\ 1]$ satisfies Equation (3), given above. To determine the z-axis boundaries of the tetrahedron being projected onto the pixel at [x y], the minimum and maximum z values can be determined at the vertices and then interpolated across the projected tetrahedron.
Turning back to
For a particular pixel, having specific x and y values and a range of z values between the $z_{min}$ and $z_{max}$ for that pixel, the determination of whether vector $\tilde{x} = [x\ y\ z\ 1]$ satisfies Equation (3) becomes a determination of the roots of a degree d polynomial in z. To simplify the root-finding, the univariate polynomial can be expressed in Bézier form:
$$\sum_{i=0}^{d} \binom{d}{i} (1-v)^{d-i}\, v^{i}\, a_i = 0 \qquad (4)$$
To relate Equation (4) to Equation (3), given above, vectors p and q can be defined such that, for a pixel having a position [x y] and a pixel-specific $z_{min}$ and $z_{max}$,
$$p = [x\ y\ z_{min}\ 1] \quad \text{and} \quad q = [x\ y\ z_{max}\ 1]. \qquad (5)$$
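Consequently, a point in screen space, represented by the vector $\tilde{x}$, can be expressed as $\tilde{x} = (1-v)p + vq$. Plugging this equation into Equation (3), from above, the coefficients $a_i$ of Equation (4) can be written as:
$$a_i = p^{\beta_1} \cdots p^{\beta_{d-i}}\; q^{\beta_{d-i+1}} \cdots q^{\beta_d}\; \tilde{B}_{\beta_1 \cdots \beta_d} \qquad (6)$$
Every coefficient other than $a_d$ includes at least one contraction with p, so those coefficients share the partially collapsed tensor $p \cdot \tilde{B}$, a symmetric rank d−1 tensor having $\binom{d+2}{3}$ unique elements. Stated mathematically:
$$(p \cdot \tilde{B})_{\beta_2 \cdots \beta_d} = p^{\beta_1}\, \tilde{B}_{\beta_1 \beta_2 \cdots \beta_d}$$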
Turning to
The vertices considered by step 705 are the vertices of the screen space triangle selected by step 630, and each vertex can be represented as $w_j = [x_j\ y_j\ z_{min,j}\ 1]$. A vector p can be represented in the form $p = r w_0 + s w_1 + t w_2$, and, consequently, by the linearity of dot products, $p \cdot \tilde{B} = r(w_0 \cdot \tilde{B}) + s(w_1 \cdot \tilde{B}) + t(w_2 \cdot \tilde{B})$. The partially collapsed rank d−1 tensor $p \cdot \tilde{B}$, which is the common factor in each of the coefficients (with the exception of $a_d$) listed by Equation (6), can, therefore, be determined at each pixel by evaluating $w_j \cdot \tilde{B}$ at the vertices of the triangle and then interpolating across the triangle. Because the rasterizer 220 can be very efficient at interpolating, the $\binom{d+2}{3}$ unique elements of the symmetric tensor $w_j \cdot \tilde{B}$ can be associated with the relevant vertex (represented by the variable j) by being stored as vertex attribute data.
Other data which is to be interpolated across the triangle can likewise be stored as vertex attribute data. For example, as indicated previously, the maximum and minimum z values, $z_{min}$ and $z_{max}$, for the vertices are also stored as vertex attribute data so that the $z_{min}$ and $z_{max}$ values for any pixel can be obtained from the interpolation.
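By way of illustration only, the use of vertex attribute data can be sketched for a second order surface, for which the partially collapsed tensor at each vertex is a four-element vector. The sketch evaluates that vector at three hypothetical triangle vertices, blends the results with a pixel's barycentric weights, and compares the blend against a direct evaluation at the pixel. It uses plain linear interpolation and does not model the perspective-correct interpolation performed by an actual rasterizer; all names and sample values are assumptions.

```python
def vec_mat(v, B):
    """Row vector times matrix: one contraction of a rank-2 tensor."""
    return [sum(v[k] * B[k][j] for k in range(len(v))) for j in range(len(B[0]))]


def interpolate(weights, values):
    """Blend per-vertex attribute vectors with barycentric weights."""
    return [sum(w * val[c] for w, val in zip(weights, values))
            for c in range(len(values[0]))]


# Hypothetical screen space tensor and triangle vertices w_j = [x_j, y_j, zmin_j, 1].
B_tilde = [[1, 2, 0, 0], [2, 0, 1, 0], [0, 1, -1, 3], [0, 0, 3, 2]]
vertices = [[0, 0, 1, 1], [4, 0, 2, 1], [0, 4, 3, 1]]

# Per-vertex attributes: the partially collapsed tensor w_j . B~ at each vertex.
attributes = [vec_mat(w, B_tilde) for w in vertices]

# A pixel inside the triangle, described by barycentric weights that sum to 1.
weights = (0.25, 0.25, 0.5)
p = interpolate(weights, vertices)              # p = [x, y, z_min, 1] at the pixel
interpolated = interpolate(weights, attributes)
direct = vec_mat(p, B_tilde)
```

The interpolated attributes agree with the direct evaluation, which is why the per-vertex contraction need only be computed once per vertex rather than once per pixel.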
Once the vertex attribute data has been determined, such as in the manner described above, either by the CPU 120 or the GPU 225, it can be provided to the rasterizer 220 for interpolation across the triangle, as indicated by step 710. In interpolating the vertex attribute data, the rasterizer will, in a manner known to those skilled in the art, take into account the number of pixels, or other display quanta, relevant to the triangle and output interpolated attribute data for each pixel. The rasterizer will also, again in a manner known to those skilled in the art, interpolate the attribute data in a non-linear fashion to account for any transformations based on the point from which the surface to be rendered is being viewed. Such viewpoint transformations are common in the computer graphics arts and rasterizers are generally configured to interpolate in a “perspectively correct” manner to account for viewpoint transformations. The per-pixel data output by the rasterizer can then be used by the pixel shader 230 to determine the color and z-value of the relevant pixel.
Initially, the symmetric tensor $p \cdot \tilde{B}$ is reconstructed at each pixel by using the interpolated data and the points represented by the vectors p and q, as given by Equation (5). Subsequently, pixel coordinates x and y can be determined by reverse mapping the contents of a hardware register through the viewpoint transform that was used to change the point in space from which the surface being rendered is being viewed. Additionally, the maximum and minimum z values can be obtained from the interpolated data, as indicated above. The $a_0, \ldots, a_{d-1}$ coefficients of Equation (4) can then be computed as given by Equation (6). The final coefficient, $a_d$, can be determined by expressing the point given by the vector q as $q = p + \delta z$, where $\delta = (z_{max} - z_{min})$ and z = [0 0 1 0]. The coefficient $a_d$ can then be given as
$$a_d = (p + \delta z)^{\beta_1} \cdots (p + \delta z)^{\beta_d}\; \tilde{B}_{\beta_1 \cdots \beta_d}$$
which is a polynomial in δ.
Because the dot product of z with a tensor essentially selects the third element of that tensor, the expression for $a_d$ simplifies to a weighted sum of elements of tensors that have already been determined while computing the coefficients $a_0, \ldots, a_{d-1}$. The one exception is the value given by $z^{\beta_1} \cdots z^{\beta_d}\, \tilde{B}_{\beta_1 \cdots \beta_d}$, which involves no contraction with p.
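By way of illustration only, the computation of the Bézier coefficients can be sketched for a second order surface (d = 2), where the coefficients are p·B̃·p, p·B̃·q and q·B̃·q. The sketch derives the last two from the interpolated vector p·B̃, the depth range δ, and the single tensor element selected by two contractions with z, without ever forming q. The function name and the sample surface, a unit sphere in screen space, are assumptions.

```python
def bezier_coefficients_quadric(pB, p, z_max, B_zz):
    """Return (a0, a1, a2) for a second order surface at one pixel.

    pB    : the interpolated vector p . B~
    p     : [x, y, z_min, 1]
    z_max : far depth bound at this pixel
    B_zz  : the element of B~ selected by contracting twice with z = [0, 0, 1, 0]
    """
    delta = z_max - p[2]
    a0 = sum(c * v for c, v in zip(pB, p))        # p . B~ . p
    pBz = pB[2]                                    # z selects the third element
    a1 = a0 + delta * pBz                          # p . B~ . q with q = p + delta z
    a2 = a0 + 2 * delta * pBz + delta**2 * B_zz    # q . B~ . q, a polynomial in delta
    return a0, a1, a2


# Assumed sample: unit sphere, pixel at x = y = 0, depth range [-2, 2].
B_tilde = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
p = [0, 0, -2, 1]
pB = [sum(p[k] * B_tilde[k][j] for k in range(4)) for j in range(4)]
a0, a1, a2 = bezier_coefficients_quadric(pB, p, 2, B_tilde[2][2])
```

For this sample, the coefficients are 3, −5 and 3, which match direct evaluation with q = [0 0 2 1]; the change of sign indicates that the surface may intersect the depth range at this pixel.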
Once the coefficients of the univariate polynomial expressed in Bézier form given by Equation (4) have been determined, the roots of the polynomial can be calculated to determine whether the surface to be generated is within the triangle being considered. In one implementation, step 720 can be performed prior to the completion of step 715 by merely examining the values of the Bézier coefficients of Equation (4). If all of the $a_i$ have the same sign then, according to the convex hull property, there are no roots within the range [0, 1]. Stated differently, there is no z value for the particular pixel being considered for which Equation (4) will be satisfied. Consequently, there is no component of the surface visible at the pixel being considered, and the pixel shader need not perform any further computations. Thus, as shown by
If, however, the coefficients do not all have the same sign, then step 715 can proceed to determine the roots of Equation (4). In one embodiment, the roots of Equation (4) are determined iteratively in a manner known to those skilled in the art. Iterative root finding can be useful with polynomials of degree 5 or higher. In another embodiment, the roots of Equation (4) are determined analytically. For polynomials of degree 2, 3 or 4, relatively simple solutions exist when the polynomial is expressed in the power basis. However, the relationship between power and Bernstein basis is given by a homogeneous projective transformation in parameter space. Thus, while in the power basis, conventional polynomial root finding attempts to “depress” the polynomial by translating it in parameter space to generate a new polynomial with one coefficient being equal to 0, in Bernstein basis, the translation between an original space (nominated [x w] without reference to prior uses of those same variable names) and a depressed space [$\tilde{x}$ $\tilde{w}$] is given by a matrix product:
Given an arbitrary first row [p q], a matrix that can depress the polynomial can be expressed as:
$$\begin{bmatrix} p & q \\ -f_w(p,q) & f_x(p,q) \end{bmatrix}$$
where the subscripts on the term “f” indicate partial derivatives. The determinant of this transformation is d times f(p,q). Consequently, any [p q] that is not a root of the polynomial f can result in a transformation that is not singular. In one implementation, [0 1] can be used as [p q], generating the original translation. In another implementation, [1 0] can be used as [p q], effectively reversing the order of the coefficients and thereby solving for w/x instead of x/w, and subsequently inverting the results. In yet another implementation, [1 1] or [1 −1] can be used as [p q], performing a 45 degree rotation in parameter space. Selection among the various possibilities can be based on whichever will provide the largest value of |f (p,q)|. Additionally, to avoid singularities, at least one more [p q] selection than the number of roots of the polynomial can be used.
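By way of illustration only, the depressing transformation can be sketched as follows. The sketch assumes that the second row of the matrix is [−f_w(p,q)  f_x(p,q)], which is consistent with the stated determinant of d times f(p,q), and that original coordinates are obtained as [x w] = [x̃ w̃] multiplied by the matrix. The polynomial is held in the power basis as coefficients c[k] of x^k·w^(d−k); all names and the sample cubic are hypothetical.

```python
def poly_mul(a, b):
    """Multiply two univariate polynomials given as ascending coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_pow(a, n):
    out = [1]
    for _ in range(n):
        out = poly_mul(out, a)
    return out


def f_value(c, x, w):
    """Evaluate the homogeneous polynomial sum of c[k] * x^k * w^(d-k)."""
    d = len(c) - 1
    return sum(c[k] * x**k * w**(d - k) for k in range(d + 1))


def depress(c, p, q):
    """Return (new coefficients, determinant) for the first row [p, q]."""
    d = len(c) - 1
    fx = sum(k * c[k] * p**(k - 1) * q**(d - k) for k in range(1, d + 1))
    fw = sum((d - k) * c[k] * p**k * q**(d - k - 1) for k in range(d))
    s, v = -fw, fx                      # assumed second row of the matrix
    det = p * v - q * s
    # Substitute x = p*x~ + s*w~ and w = q*x~ + v*w~, then set w~ = 1 so the
    # result is a polynomial in x~ with ascending coefficients.
    new = [0] * (d + 1)
    for k, ck in enumerate(c):
        term = poly_mul(poly_pow([s, p], k), poly_pow([v, q], d - k))
        for n, tn in enumerate(term):
            new[n] += ck * tn
    return new, det


# Hypothetical cubic with roots x/w = 1, 2, 3: x^3 - 6x^2 w + 11 x w^2 - 6 w^3.
cubic = [-6, 11, -6, 1]
new_coeffs, det = depress(cubic, 0, 1)
```

For this sample, the determinant is −18, which is 3 times f(0, 1) = −6, and the second-highest coefficient of the transformed polynomial is zero, which is the defining property of a depressed polynomial. Choosing a first row that is a root, such as [1 1] for this cubic, yields a zero determinant, which illustrates why the selection among candidate rows matters.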
Subsequently, the roots of the polynomial can be determined by known techniques. The results can then be transformed back to the original [x w] space by the 2×2 transform. In one implementation, the conversion back to [v 1] space can be combined with the transformation to the original [x w] space, as that is another 2×2 transform.
Once step 715 has determined the roots of the polynomial given by Equation (4), step 720 can determine if there are any real roots, such that the surface to be generated is visible at the pixel being considered. If there are no real roots, the surface is not visible at that pixel, and the process can end as illustrated. However, if real roots exist at the pixel being considered, the surface normal can be computed at step 725.
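By way of illustration only, the early sign test and the subsequent root determination can be sketched for a second order surface. For simplicity, the sketch converts the Bézier coefficients to the power basis and applies the ordinary quadratic formula instead of the depressing transformation described above; the function name is hypothetical.

```python
import math


def smallest_root_in_unit_interval(a0, a1, a2):
    """Smallest v in [0, 1] with a0(1-v)^2 + 2 a1 (1-v) v + a2 v^2 = 0, or None."""
    # Convex hull property: coefficients of one strict sign mean no root in [0, 1].
    if (a0 > 0 and a1 > 0 and a2 > 0) or (a0 < 0 and a1 < 0 and a2 < 0):
        return None
    # Convert to the power basis c0 + c1 v + c2 v^2.
    c0, c1, c2 = a0, 2 * (a1 - a0), a0 - 2 * a1 + a2
    if c2 == 0:
        if c1 == 0:
            return None                 # constant polynomial: no isolated root
        roots = [-c0 / c1]
    else:
        disc = c1 * c1 - 4 * c2 * c0
        if disc < 0:
            return None                 # no real roots: surface not visible here
        sq = math.sqrt(disc)
        roots = [(-c1 - sq) / (2 * c2), (-c1 + sq) / (2 * c2)]
    hits = [v for v in roots if 0.0 <= v <= 1.0]
    return min(hits) if hits else None


hit = smallest_root_in_unit_interval(3, -5, 3)       # 0.25
early_out = smallest_root_in_unit_interval(1, 2, 3)  # None: all coefficients positive
no_real = smallest_root_in_unit_interval(1, -0.5, 1) # None: mixed signs, no real root
```

The third call shows that the sign test is conservative: mixed signs permit, but do not guarantee, a real root within the depth range, so the determination of whether real roots exist is still required.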
In one implementation, the smallest root v can be used to compute a surface normal at the surface point $\tilde{x} = (1-v)p + vq$ by collapsing the tensor $\tilde{B}$ down to the tangent plane using $\tilde{x}^{\beta_2} \cdots \tilde{x}^{\beta_d}\, \tilde{B}_{\beta_1 \beta_2 \cdots \beta_d}$.
As with Equation (8), above, the tangent plane is a weighted sum of previously computed tensors and a constant given by:
$$z^{\beta_2} \cdots z^{\beta_d}\, \tilde{B}_{\beta_1 \beta_2 \cdots \beta_d}$$
Once the tangent plane, or surface normal, is determined, a lighting evaluation can be performed, as illustrated by step 730, in a manner known to those skilled in the art. Finally, at block 735, the result of the lighting calculation, which is a combination of a color and a z-value, is sent to the buffer to be physically presented via the display device 191.
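By way of illustration only, the determination of a surface normal and a simple lighting evaluation can be sketched for a second order surface, for which a single contraction of the tensor with the surface point yields the tangent plane, and the first three elements of that plane give the normal direction. The sketch assumes a unit sphere and a basic diffuse lighting model, and it ignores any change of coordinate space that an actual lighting evaluation may require; all names are hypothetical.

```python
import math


def surface_normal_quadric(B, x):
    """Unit normal of the surface x B x^T = 0 at the homogeneous point x."""
    plane = [sum(x[k] * B[k][j] for k in range(4)) for j in range(4)]  # tangent plane
    n = plane[:3]
    length = math.sqrt(sum(c * c for c in n))
    return [c / length for c in n]


def diffuse(normal, to_light):
    """Diffuse intensity: the clamped dot product of two unit vectors."""
    return max(0.0, sum(n * l for n, l in zip(normal, to_light)))


# Assumed sample: unit sphere, nearest intersection along the z axis at z = -1.
sphere = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
normal = surface_normal_quadric(sphere, [0, 0, -1, 1])
intensity = diffuse(normal, [0.0, 0.0, -1.0])   # light placed behind the viewer
```

For this sample, the normal points back along the viewing direction toward the viewer, and the diffuse intensity is at its maximum.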
In view of the many possible variations of the subject matter described herein, we claim as our invention all such embodiments as may come within the scope of the following claims and equivalents thereto.