The present invention relates generally to rendering graphics models, and more particularly to rendering point-based 3D surface models with surface splatting in a graphics hardware rendering engine.
Point-based surface models define a surface of a 3D graphics object by a set of sample points. Point-based rendering generates a continuous image of the discrete sampled surface points. The points on the surface are commonly called surface elements or “surfels” to indicate their affinity with picture elements (pixels) and volume elements (voxels).
A point-based representation has advantages for graphics models with complex topologies in rendering applications where connectivity information is not required or available, or for fusion of data from multiple sources, see for example, Levoy et al., “The Use of Points as Display Primitives,” Technical Report TR 85-022, The University of North Carolina at Chapel Hill, Department of Computer Science, 1985, Zwicker et al., “Surface Splatting,” SIGGRAPH 2001 Proceedings, pp. 371–378, 2001, and U.S. Pat. No. 6,396,496 issued to Pfister et al. on May 28, 2002 “Method for modeling graphical objects represented as surface elements,” incorporated herein by reference.
Point-based models can be acquired directly using 3D scanning techniques, or by conversion from polygon models with textures, see Levoy et al., “The Digital Michelangelo Project: 3D Scanning of Large Statues,” SIGGRAPH 2000 Proceedings, pp. 131–144, 2000, and Pfister et al., “Surfels: Surface Elements as Rendering Primitives,” SIGGRAPH 2000 Proceedings, pp. 335–342, 2000.
Most prior art point-based rendering methods have focused on efficiency and speed. Some of those methods use OpenGL and hardware acceleration to achieve interactive rendering performance of two to five million points per second, see Rusinkiewicz et al., “QSplat: A Multiresolution Point Rendering System for Large Meshes,” SIGGRAPH 2000 Proceedings, pp. 343–352, 2000, and Stamminger et al., “Interactive Sampling and Rendering for Complex and Procedural Geometry,” Proceedings of the 12th Eurographics Workshop on Rendering, pp. 151–162, 2001.
However, none of those techniques supports anti-aliasing for models with complex surface textures. Recently, Zwicker et al. described elliptical weighted average (EWA) surface splatting, see Zwicker et al. “Surface Splatting,” SIGGRAPH 2001 Proceedings, pp. 371–378, 2001, and U.S. patent application Ser. No. 09/842,737 “Rendering Discrete Sample Points Projected to a Screen Space with a Continuous Resampling Filter,” filed by Zwicker et al., on Apr. 26, 2001, incorporated herein by reference.
Those methods use anisotropic texture filtering and an image-space formulation of an EWA texture filter adapted for irregular point samples, see Greene et al., “Creating Raster Omnimax Images from Multiple Perspective Views Using the Elliptical Weighted Average Filter,” IEEE Computer Graphics & Applications, 6(6):21–27, 1986, and Heckbert, “Fundamentals of Texture Mapping and Image Warping,” Master's Thesis, University of California at Berkeley, Department of Electrical Engineering and Computer Science, 1989.
However, a software implementation of EWA surface splatting only achieves a rendering performance of up to about 250,000 points per second.
Polygon and point primitives can also be combined into efficient rendering systems that select one or the other based on image space projection criteria, see Chen et al., “POP: A Hybrid Point and Polygon Rendering System for Large Data,” Proceedings of IEEE Visualization, pp. 45–52, 2001, and Cohen et al., “Hybrid Simplification: Combining Multi-Resolution Polygon and Point Rendering,” Proceedings of IEEE Visualization, pp. 37–44, 2001. Both of those systems make use of graphics hardware to achieve real-time performance for reasonably complex models. However, neither system handles surface textures, and the introduction of connectivity information further diminishes the advantages of pure point-sampled models.
Image Space EWA Splatting
In the image space EWA splatting framework as described by Zwicker et al., objects are represented by a set of irregularly spaced points {Pk} in three dimensional object space without connectivity information, in contrast with polygon or triangle models which do contain adjacency or connectivity information.
Each zero-dimensional point is associated with a location, a surface normal, a radially symmetric basis function $r_k$, and scalar coefficients $w^r_k$, $w^g_k$, $w^b_k$ that represent continuous functions for the red, green, and blue color components. The basis functions $r_k$ are reconstruction filters defined on locally parameterized domains. Hence, the functions define a continuous texture function on the model's surface as represented by the discrete points:
$$f_c(u) = \sum_{k} w_k\, r_k(u - u_k), \tag{1}$$
where $u_k$ is the local coordinate of each point $P_k$.
In the ideal resampling framework described by Heckbert et al., rendering the texture function fc(u) yields a continuous output function gc(x) in image space that respects the Nyquist criterion of the output pixel grid. Thus, aliasing artifacts are avoided. The rendering process includes the following steps.
First, the texture function $f_c(u)$ is warped to image space using a local affine mapping of the perspective projection at each point. Then the continuous image space signal is band-limited by convolving it with a prefilter $h$, yielding the output function $g_c(x)$, where $x$ are image space coordinates. After rearranging the mathematical expressions, the output function can be expressed as a weighted sum of image space resampling filters $\rho_k(x)$:
$$g_c(x) = \sum_{k} w_k\, \rho_k(x), \tag{2}$$
where
$$\rho_k(x) = (r'_k \otimes h)(x - m_k(u_k)). \tag{3}$$
Here, the resampling filter $\rho_k(x)$ is written as a convolution of a warped basis function $r'_k(x) = r_k(m^{-1}(x))$ and the prefilter $h(x)$.
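To simplify the evaluation of $\rho_k$ at each point $u_k$, a local affine approximation $x = m_k(u)$ of the projective mapping $x = m(u)$ is used, given by the Taylor expansion of $m$ at $u_k$, truncated at the linear term:
$$m_k(u) = x_k + J_k \cdot (u - u_k), \tag{4}$$
where $x_k = m(u_k)$ and the Jacobian $J_k = \frac{\partial m}{\partial u}(u_k) \in \mathbb{R}^{2 \times 2}$.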
In the EWA framework, elliptical Gaussians are selected as basis functions $r_k$ and as prefilters $h$ because of their unique properties. Gaussians are closed under affine mappings and convolution. Hence, the resampling filter $\rho_k$ can be expressed as an elliptical Gaussian, as described below. A 2D elliptical Gaussian $G_V(x)$ with variance matrix $V \in \mathbb{R}^{2 \times 2}$ is defined as
$$G_V(x) = \frac{1}{2\pi |V|^{1/2}}\, e^{-\frac{1}{2} x^T V^{-1} x},$$
where $|V|$ is the determinant of $V$. The variance matrices of the basis functions $r_k$ and the low-pass filter $h$ are denoted by $V^r_k$ and $V^h$, hence $r_k = G_{V^r_k}$ and $h = G_{V^h}$.
Note that a typical choice for the variance of the low-pass filter is the identity matrix $I$. By substituting the Gaussian basis function and prefilter in Equation (3), a Gaussian resampling filter
$$\rho_k(x) = \frac{1}{|J_k^{-1}|}\, G_{J_k V^r_k J_k^T + I}(x - m_k(u_k)) \tag{5}$$
can be obtained, which is called an image space EWA resampling filter; for additional details see Zwicker et al.
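As a minimal sketch of how such an image space EWA resampling filter could be evaluated in software, the following Python fragment assumes the Gaussian definition above, an identity prefilter variance, and the form ρk(x) = |Jk| · G(Jk Vk Jkᵀ + I)(x − xk). The function names and the tuple-of-tuples matrix layout are hypothetical and chosen only for illustration.

```python
import math

def gaussian_2d(v, x):
    """Elliptical Gaussian G_V(x) for a symmetric 2x2 variance v = ((a, b), (b, d))."""
    (a, b), (_, d) = v
    det = a * d - b * b
    # Quadratic form x^T V^{-1} x using the closed-form 2x2 inverse.
    q = (d * x[0] * x[0] - 2.0 * b * x[0] * x[1] + a * x[1] * x[1]) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

def image_space_variance(j, v):
    """J V J^T + I: warped basis variance plus an identity prefilter variance."""
    (j00, j01), (j10, j11) = j
    (a, b), (_, d) = v
    # First J V ...
    m00, m01 = j00 * a + j01 * b, j00 * b + j01 * d
    m10, m11 = j10 * a + j11 * b, j10 * b + j11 * d
    # ... then (J V) J^T + I.
    return ((m00 * j00 + m01 * j01 + 1.0, m00 * j10 + m01 * j11),
            (m10 * j00 + m11 * j01, m10 * j10 + m11 * j11 + 1.0))

def image_space_filter(j, v, x, xk):
    """rho_k(x) = |J| * G_{J V J^T + I}(x - x_k), using 1/|J^{-1}| = |J|."""
    det_j = abs(j[0][0] * j[1][1] - j[0][1] * j[1][0])  # assumption: absolute determinant
    dx = (x[0] - xk[0], x[1] - xk[1])
    return det_j * gaussian_2d(image_space_variance(j, v), dx)
```

Because Gaussians are closed under affine mappings and convolution, the whole filter reduces to one variance matrix and one exponential per sample.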
Image space filters are suitable for software implementations. However, hardware graphics engines cannot determine such filters directly in image space. Hence the rendering performance is severely degraded, e.g., by an order of magnitude or more, as indicated above.
Therefore, it is desired to provide a rendering system that interactively renders complex point-based models with arbitrary surface textures at the highest possible quality. The rendering system should take advantage of advances in PC graphics hardware, namely, the ever increasing performance of graphics processing units (GPUs), and programmable shading. The system should also provide anisotropic texture filtering.
Elliptical weighted average (EWA) surface splatting is a technique for high quality rendering of point sampled 3D objects. EWA surface splatting renders water-tight surfaces of complex point models with high quality, anisotropic texture filtering. The invention provides a method for EWA surface splatting using modern PC graphics hardware.
The invention provides an object space formulation of the EWA filter which is adapted for accelerated rendering by conventional triangle-based graphics rendering hardware. The object space EWA filter is rendered using programmable vertex and pixel shaders, fully exploiting the capabilities of today's graphics processing units (GPUs). The method according to the invention renders several million points per second on current PC graphics hardware, an order of magnitude more than a pure software implementation of EWA surface splatting of the prior art.
System Architecture
During a first pass, a first polygon 2, e.g., in the form of a quadrilateral (quad), is centered 10 on each point Pk 1 (or each vertex of a triangle in a polygon mesh). The plane of the quad 2 is perpendicular to the point's surface normal. The quad 2 is then offset 20 and translated 30 by a depth threshold zt 3 along a viewing ray 4 to avoid occlusions. The depth-offset and translated quads are then rendered 40 using hardware rasterization of a conventional graphics processing unit (GPU). The rendering accumulates 50 depth values of the opaque quads in a z-buffer as a depth image 5. The accumulation only retains depth values for pixels that are closest to the point of view, and discards all other (occluded) depth values. The depth image 5 is used later to render only those pixels that represent visible portions of the 3D model 1.
During a second pass, as shown in
It should be noted that the rendering can be done in a single pass when two depth buffers are used, one storing the actual depth values, and the other storing the offset depth values. It should also be noted that normalization can be performed in another pass. It should also be noted that the first and corresponding second polygon for each point can be the same.
The method and system are now described in greater detail.
Object Space EWA Splatting
In Equation (3), given above, the resampling filter is expressed as a function of image or screen space coordinates x, which is suitable for software implementations. However, hardware graphics engines cannot compute such a filter directly in image space. To make EWA splatting amenable to acceleration by graphics hardware, as desired by the invention, we formulate the resampling filter as a function on a parameterized surface in object space. Then, we can exploit graphics hardware to project the model's surface to image space, yielding the resampling filter in image space as in (3).
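To do this, we rearrange Equation (3) using a local affine approximation $m_k$:
$$\rho_k(x) = (r'_k \otimes h)(x - m_k(u_k)) = (r'_k \otimes h)(m_k(m_k^{-1}(x)) - m_k(u_k)) = (r'_k \otimes h)(J_k \cdot (m_k^{-1}(x) - u_k)) = (r_k \otimes h'_k)(u - u_k) = \rho'_k(u), \tag{6}$$
yielding an object space resampling filter $\rho'_k$ defined in coordinates $u = m_k^{-1}(x)$ of the local surface parameterization. Note that in contrast to the image space resampling filter, the object space resampling filter according to the invention includes a convolution of the original basis function $r_k(u)$, and a warped (low-pass) prefilter $h'_k(u) = |J_k|\, h(J_k u)$.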
As shown in
We use Gaussians as basis functions and prefilter in Equation (6). This yields an expression analogous to Equation (5), which we call an object space EWA resampling filter:
$$\rho'_k(u) = G_{V^r_k + J_k^{-1} J_k^{-1T}}(u - u_k). \tag{7}$$
Finally, we use Equations (6) and (7) to reformulate the continuous output function of Equation (2) as a weighted sum of object space EWA resampling filters:
$$g_c(x) = \sum_{k} w_k\, G_{V^r_k + J_k^{-1} J_k^{-1T}}(m_k^{-1}(x) - u_k). \tag{8}$$
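The relationship between the object space and image space filters can be checked numerically. The sketch below assumes that the object space variance is V + J⁻¹J⁻ᵀ (basis variance plus the warped identity prefilter) and verifies that warping it back with J reproduces the image space variance J V Jᵀ + I. All helper names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices stored as tuples of row tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def mat_add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def transpose(a):
    return ((a[0][0], a[1][0]), (a[0][1], a[1][1]))

def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return ((a[1][1] / det, -a[0][1] / det), (-a[1][0] / det, a[0][0] / det))

def object_space_variance(j, v):
    """V + J^{-1} J^{-T}: basis variance plus the prefilter warped to object space."""
    j_inv = inverse(j)
    return mat_add(v, mat_mul(j_inv, transpose(j_inv)))

def warp_to_image_space(j, m):
    """J M J^T: variance of a Gaussian after the affine mapping x = J u."""
    return mat_mul(mat_mul(j, m), transpose(j))
```

Since J (V + J⁻¹J⁻ᵀ) Jᵀ = J V Jᵀ + I, projecting the object space filter with the graphics hardware should give the same footprint as the image space formulation, up to the affine approximation.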
Hardware Accelerated Rendering
Our hardware accelerated surface splatting algorithm is based on Equation (8) and uses a two-pass approach, emulating an A-Buffer, see Carpenter, “The A-buffer, an Antialiased Hidden Surface Method,” Computer Graphics, Volume 18 of SIGGRAPH Proceedings, pages 103–108, 1984.
The first pass, described below in greater detail, performs visibility splatting by rendering an opaque polygon as depth (z) values for each point into the Z-buffer of depth image 5. The second pass performs the operations of Equation (8) as follows.
As shown in
During rasterization, we perform depth tests using the depth values in the Z-buffer or depth image 5 that was generated during the first rendering pass to determine whether the splats are visible. This ensures that for each pixel only those splats that represent the surface closest to the viewer are accumulated.
In contrast with the prior art, our point-based rendering uses semi-transparent splats with antialiasing, and a textured polygon to represent each point 1. Also, in the prior art approach, the position of each polygon in object space is static, i.e., determined before rendering. In contrast, we dynamically determine view dependent point positions during rendering to avoid aliasing, as described below.
Visibility Splatting
As shown in
To render a point based model of an object without artifacts, we must accumulate all the splats of the visible surface closest to the viewer while discarding all other splats. During rasterization of the splats, we decide for each pixel whether to discard or accumulate the current contribution by comparing the depth value of the splat with the depth image 5 that was generated as described above. However, to prevent contributions of the visible surface from being accidentally discarded, the depth image is translated away from the viewpoint by a small depth threshold zt.
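The interaction between the depth image and the accumulation step can be illustrated with a small software emulation. This is only a sketch under stated assumptions: fragments are hypothetical tuples (pixel, depth, weight, color), larger depth means farther from the viewer, the depth is taken as already measured along the viewing ray, and color is a single scalar channel.

```python
def visibility_pass(fragments, z_t):
    """Pass 1: keep the nearest depth per pixel, then push it back by z_t."""
    nearest = {}
    for pixel, depth, _weight, _color in fragments:
        nearest[pixel] = min(depth, nearest.get(pixel, float("inf")))
    return {pixel: depth + z_t for pixel, depth in nearest.items()}

def accumulation_pass(fragments, depth_image):
    """Pass 2: accumulate only splat fragments that pass the depth test."""
    accum = {}
    for pixel, depth, weight, color in fragments:
        if depth > depth_image[pixel]:
            continue  # fragment lies behind the visible surface: discard
        total_weight, total_color = accum.get(pixel, (0.0, 0.0))
        accum[pixel] = (total_weight + weight, total_color + weight * color)
    return accum
```

With a threshold of 0.05, for example, two fragments at depths 1.00 and 1.02 both contribute to the same pixel, while a fragment at depth 2.0 is rejected as belonging to an occluded surface.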
A simple solution is to translate the depth image 5 by zt along the z-axis in camera space, as described by Rusinkiewicz et al., “QSplat: A Multiresolution Point Rendering System for Large Meshes,” SIGGRAPH 2000 Proceedings, pages 343–352, 2000.
However, as shown in
Determining Object Space EWA Resampling Filter
We construct a local parameterization of the object surface around the point Pk 1 by approximating the surface with its tangent plane given by the normal nk. The parameterization is defined by selecting two orthogonal basis vectors $\tilde{s}$ and $\tilde{t}$ in this plane, attached to the position $\tilde{o}$ of the point Pk. Note that $\tilde{s}$, $\tilde{t}$, and $\tilde{o}$ are 3×1 vectors defined in object space. Hence, a point u with components u0 and u1 in local surface coordinates corresponds to a point $p_o(u) = \tilde{o} + u_0 \tilde{s} + u_1 \tilde{t}$ in object space.
If we assume that the transformation from object space to camera space only contains uniform scaling S, rotation R and translation T, then a point u corresponds to the following point $p_c(u)$ in camera space:
$$p_c(u) = R \cdot S \cdot p_o(u) + T = (R \cdot S \cdot \tilde{o} + T) + u_0\, R \cdot S \cdot \tilde{s} + u_1\, R \cdot S \cdot \tilde{t} = o + u_0 s + u_1 t, \tag{9}$$
where o is the point position in camera space, while s and t are the basis vectors defining the local surface parameterization in camera space.
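Next, we map the points from camera space to image or screen space. This includes the projection to the image plane by perspective division, followed by a scaling with a factor η to image coordinates, i.e., a viewport transformation. The scaling factor η is determined by the view frustum as follows:
$$\eta = \frac{v_h}{2\, z_{near} \tan(fov/2)},$$
where $v_h$ stands for the viewport height, $fov$ is the field of view of the viewing frustum, and $z_{near}$ specifies the near clipping plane. Hence, image space coordinates $x = (x_0, x_1)$ of the projected point $(u_0, u_1)$ are determined as
$$x_0 = \eta\, \frac{o_x + u_0 s_x + u_1 t_x}{o_z + u_0 s_z + u_1 t_z} + c_0, \qquad x_1 = -\eta\, \frac{o_y + u_0 s_y + u_1 t_y}{o_z + u_0 s_z + u_1 t_z} + c_1, \tag{10}$$
where $(c_0, c_1)$ denotes the center of the viewport.
So the Jacobian $J_k$, consisting of the partial derivatives of Equation (10) evaluated at $(u_0, u_1) = (0, 0)$, is
$$J_k = \eta\, \frac{1}{o_z^2} \begin{bmatrix} s_x o_z - s_z o_x & t_x o_z - t_z o_x \\ s_z o_y - s_y o_z & t_z o_y - t_y o_z \end{bmatrix}.$$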
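The Jacobian can be derived in closed form from the projection and checked against finite differences. The sketch below assumes a projection of the form x0 = η·px/pz + c0 and x1 = −η·py/pz + c1, with η built from the viewport height, field of view, and near plane named in the text. The exact form of η, the sign flip of the vertical axis, and all function names are assumptions made for illustration.

```python
import math

def viewport_scale(v_h, fov, z_near):
    """Assumed form of the scaling factor: eta = v_h / (2 * z_near * tan(fov / 2))."""
    return v_h / (2.0 * z_near * math.tan(fov / 2.0))

def project(o, s, t, u, eta, center=(0.0, 0.0)):
    """Project the camera space point o + u0*s + u1*t to image coordinates."""
    px, py, pz = (o[i] + u[0] * s[i] + u[1] * t[i] for i in range(3))
    return (eta * px / pz + center[0], -eta * py / pz + center[1])

def jacobian(o, s, t, eta):
    """Partial derivatives of project() with respect to (u0, u1) at u = (0, 0)."""
    k = eta / (o[2] * o[2])
    return ((k * (s[0] * o[2] - s[2] * o[0]), k * (t[0] * o[2] - t[2] * o[0])),
            (k * (s[2] * o[1] - s[1] * o[2]), k * (t[2] * o[1] - t[1] * o[2])))
```

Only the camera space position o and the two tangent vectors s and t enter the result, so this is a per-point computation that needs no neighborhood information.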
Determining Point Parallelogram Vertex Position
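After the Jacobian matrix is determined, the object space EWA resampling filter defined on the locally parameterized surface, with the point at the origin of its local parameterization, is written as:
$$\rho'_k(u) = G_{M_k}(u), \quad \text{where } M_k = V^r_k + J_k^{-1} J_k^{-1T}.$$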
As shown in
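$$M_k = \mathrm{Rot}(\theta) \cdot \Lambda \cdot \Lambda \cdot \mathrm{Rot}(\theta)^T, \tag{11}$$
where
$$\mathrm{Rot}(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \quad \text{and} \quad \Lambda = \begin{bmatrix} r_0 & 0 \\ 0 & r_1 \end{bmatrix}.$$
The rotation matrix $\mathrm{Rot}(\theta)$ includes the eigenvectors, and the scaling matrix $\Lambda$ includes the square roots of the eigenvalues of $M_k$. With a linear relationship
$$u = \mathrm{Rot}(\theta) \cdot \Lambda \cdot y, \tag{12}$$
we have $y^T y = u^T M_k^{-1} u$, and we can rewrite $G_{M_k}$ as
$$G_{M_k}(\mathrm{Rot}(\theta) \cdot \Lambda \cdot y) = \frac{1}{2\pi r_0 r_1}\, e^{-\frac{1}{2} y^T y}. \tag{13}$$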
Equation (13) represents a unit Gaussian in y, which is mapped to the elliptical Gaussian resampling filter using Equation (12) as shown in
Although the Gaussian resampling filter has infinite support in theory, in practice it is determined only for a limited range of the exponent $\beta(y) = \frac{1}{2} y^T y$. Hence, we select a cutoff radius c such that $\beta(y) \le c$, where a typical choice is c=1. Thus, the alpha texture actually encodes the unit Gaussian in the domain
$$y \in [-\sqrt{2c}, \sqrt{2c}] \times [-\sqrt{2c}, \sqrt{2c}].$$
Each vertex v 901 has texture coordinates {(0, 0), (0, 1), (1, 1), (1, 0)} to encode the vertex positions of the deformed quadrilateral and to perform texture mapping 910. Given vertex texture coordinates v=(v0, v1) 901, we determine the camera space position $p_c(u)$ 904 as shown in
First, we need to map 910 the texture coordinates v 901 to the coordinates y 902 in the domain of the unit Gaussian by scaling them according to the cutoff radius c:
$$\begin{bmatrix} y_0 \\ y_1 \end{bmatrix} = 2\sqrt{2c} \left( \begin{bmatrix} v_0 \\ v_1 \end{bmatrix} - \begin{bmatrix} 1/2 \\ 1/2 \end{bmatrix} \right). \tag{14}$$
Then, we deform 920 the textured quadrilateral using Equation (12), yielding coordinates u 903 of the local surface parameterization. With Equation (9), we finally determine 930 the vertex positions $p_c(u)$ 904 of the parallelogram in camera space.
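The three mapping steps (texture coordinates to the unit Gaussian domain, deformation by rotation and scaling, and placement in camera space) can be sketched as follows. The fragment assumes that the unit Gaussian domain is the square [−√(2c), √(2c)]², that the decomposition of Mk is already available as the radii r0, r1 and the angle θ, and that the function name and argument layout are hypothetical.

```python
import math

def quad_vertices(o, s, t, r0, r1, theta, c=1.0):
    """Camera space corners of the deformed splat quadrilateral for one point."""
    scale = 2.0 * math.sqrt(2.0 * c)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    vertices = []
    for v0, v1 in ((0, 0), (0, 1), (1, 1), (1, 0)):
        # Texture coordinates in [0, 1]^2 -> unit Gaussian domain coordinates y.
        y0, y1 = scale * (v0 - 0.5), scale * (v1 - 0.5)
        # u = Rot(theta) * Lambda * y: scale by the radii, then rotate.
        u0 = cos_t * r0 * y0 - sin_t * r1 * y1
        u1 = sin_t * r0 * y0 + cos_t * r1 * y1
        # p_c(u) = o + u0 * s + u1 * t in camera space.
        vertices.append(tuple(o[i] + u0 * s[i] + u1 * t[i] for i in range(3)))
    return vertices
```

In a vertex shader the same arithmetic would run once per vertex, which is why each point has to be submitted four times.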
Determining Optimal Texture Size
To make full use of the eight bit precision of the alpha texture, we encode the non-constant part from Equation (13) in each texel, i.e., $g(y) = e^{-\frac{1}{2} y^T y}$.
Although a larger texture size increases the precision of the discrete representation of the 2D Gaussian function, the quantization of the function values to eight bits leads to redundancy in high resolution alpha textures, because nearby texels can always map to the same quantized value.
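We use a square texture with resolution len×len. Because the unit Gaussian function is rotation invariant, we can represent $g(y)$ as $g'(r) = e^{-\frac{1}{2} r^2}$ in polar coordinates. So that neighboring texels differ by at most one eight-bit quantization level, the following condition should be satisfied for all $r \in [0, \sqrt{2c}]$:
$$\left| \frac{d\, g'(r)}{dr} \right| \cdot \frac{\sqrt{2c}}{len/2} = r\, e^{-\frac{1}{2} r^2} \cdot \frac{\sqrt{2c}}{len/2} \le \frac{1}{2^8 - 1}.$$
From this, it follows that
$$len \ge 510 \cdot \sqrt{2c} \cdot r\, e^{-\frac{1}{2} r^2}.$$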
The optimal texture resolution corresponds to the smallest value of len that satisfies the above condition. For the typical choice c=1, we find len=438.
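The stated resolution can be reproduced with a short calculation. The sketch assumes the condition that the change of g′(r) = exp(−r²/2) across one texel, of width √(2c)/(len/2), should not exceed one quantization step 1/(2⁸ − 1). The function name and the `bits` parameter are hypothetical.

```python
import math

def optimal_texture_size(c=1.0, bits=8):
    """Smallest len with len >= 2 * (2^bits - 1) * sqrt(2c) * max_r(r * exp(-r^2 / 2))."""
    r_max = math.sqrt(2.0 * c)
    levels = 2 ** bits - 1
    # r * exp(-r^2 / 2) increases on [0, 1] and decreases afterwards,
    # so its maximum over [0, r_max] is attained at min(1, r_max).
    r_peak = min(1.0, r_max)
    peak_slope = r_peak * math.exp(-0.5 * r_peak * r_peak)
    return math.ceil(2.0 * levels * r_max * peak_slope)

print(optimal_texture_size(1.0))  # prints 438
```

For c = 1 the bound evaluates to about 437.46, which rounds up to the value of 438 given in the text.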
Implementation
Programmable Shader Computations
Our hardware accelerated surface splatting method is implemented using programmable vertex and pixel shaders. Programmable shaders provide efficient GPU level computing, see Lindholm, “A User-Programmable Vertex Engine,” SIGGRAPH 2001 Proceedings, pages 149–158, 2001.
Vertex Shader Computations
During the first rendering pass, the depth offset along the view rays in camera space is determined using the vertex shader. In the second rendering pass, the vertex positions of the point polygon are also determined with the vertex shader. Due to the simple instruction set of the vertex shader, the implementation of the symmetric matrix decomposition in Equation (11) requires a careful design. We rely mainly on the two most powerful shader instructions, reciprocal square root (RSQ) and reciprocal (RCP). The details of these computations are described in Appendix A.
The constant part of Equation (13), $\frac{1}{2\pi r_0 r_1}$, is output to the alpha channel of a diffuse color register. In this way, it can be accessed later by the pixel shader. Because vertex shaders do not support generating new vertices, we need to perform the same per vertex computation four times for each quadrilateral.
Pixel Shader Computations:
The determination of $w_k \rho_k(x)$ in Equation (2) is performed using the per-fragment processing of the pixel shader. The colors $w^r_k$, $w^g_k$, $w^b_k$ are retrieved from the red, green, and blue channels of the input register for the diffuse color. Multiplying the texel alpha value by the constant $\frac{1}{2\pi r_0 r_1}$, which is stored in the alpha channel of the diffuse color register, yields $\rho_k(x)$. Finally, the accumulation of the EWA splats in Equation (2) to form the output image 605 is performed by additive alpha blending 620, see
Hierarchical Rendering
In our preferred embodiment, we use a point-based layered depth cube (LDC) tree for hierarchical rendering. However, other hierarchical data structures, such as a bounding sphere hierarchy, can be used.
While traversing the LDC tree from the lowest to the highest resolution blocks, we perform view-frustum culling of blocks, and backface culling using visibility cones. To select the appropriate octree level to be projected, we perform block bounding box warping. This enables us to estimate the number of projected points per pixel, facilitating progressive rendering.
For efficient hardware acceleration, the points of multiple LDC tree blocks are stored together in a number of large vertex buffers. This minimizes the switching of vertex buffers during rendering and enhances performance. The vertex buffers are allocated in the local memory of the graphics card or in accelerated graphics port (AGP) memory. Their sizes are selected to be optimal for the graphics rendering engine and to maximize performance. To access the vertex data of a block in the LDC tree, we store the corresponding vertex buffer ID and the start position of its vertex data in the vertex buffer.
Pre-Processing
Due to the irregular sampling of point-based models and the truncation of the Gaussian kernel, the basis functions rk in object space do not form a partition of unity in general. Neither do the resampling kernels in image space. To enforce a partition of unity, we could perform per-pixel normalization in the frame buffer after splatting. However, this post-processing operation is not supported by current graphics hardware. In addition, directly locking and accessing the frame buffer during rendering for per-pixel normalization slows down the rendering speed. But without normalization, the brightness of the final image varies with the accumulated filter weights, leading to visible artifacts. To solve this problem, we provide a pre-processing normalization step.
Point Normalization
If the basis functions $r_k$ in Equation (1) sum up to one everywhere, then applying a low-pass filter still guarantees that the resampling filters in image space form a partition of unity. Consequently, our pre-processing does not consider the prefiltering step during rendering and becomes a view independent process. The normalized view independent texture function in object space could be written as follows:
$$f_c(u) = \sum_{k} w_k\, \hat{r}_k(u - u_k) = \sum_{k} w_k\, \frac{r_k(u - u_k)}{\sum_{j} r_j(u - u_j)}. \tag{15}$$
Unfortunately, the above rational basis function $\hat{r}_k$ invalidates the derivation of a closed form resampling filter. Instead, we use the sum of the weights at each point to approximate the above formula, yielding
$$f_c(u) = \sum_{k} w_k\, s_k\, r_k(u - u_k), \tag{16}$$
where
$$s_k = \frac{1}{\sum_{j} r_j(u_k - u_j)}.$$
We call $s_k$ the point's normalization weight, which is acquired by a view independent process described below. Based on Equation (7), we adjust our object space EWA resampling filter with the weight $s_k$, yielding:
$$\rho'_k(u) = s_k\, G_{V^r_k + J_k^{-1} J_k^{-1T}}(u - u_k), \tag{17}$$
which is the resampling filter used by object space EWA surface splatting with per-point normalization.
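The idea behind the per-point weight can be illustrated directly in a flat 2D parameter domain. The sketch below assumes isotropic Gaussian basis functions that share one variance and one common parameterization, and it evaluates the weight as the reciprocal of the summed basis functions at each point. This is a simplification for illustration, with hypothetical function names; it is not the rendering-based capture process.

```python
import math

def basis(u, uk, variance=1.0):
    """Isotropic 2D Gaussian basis function r_k(u - u_k)."""
    d2 = (u[0] - uk[0]) ** 2 + (u[1] - uk[1]) ** 2
    return math.exp(-0.5 * d2 / variance) / (2.0 * math.pi * variance)

def normalization_weights(points, variance=1.0):
    """s_k = 1 / sum_j r_j(u_k - u_j), evaluated at every point position."""
    return [1.0 / sum(basis(pk, pj, variance) for pj in points) for pk in points]

def normalized_texture(u, points, coeffs, weights, variance=1.0):
    """f_c(u) = sum_k w_k * s_k * r_k(u - u_k)."""
    return sum(w * s * basis(u, pk, variance)
               for pk, w, s in zip(points, coeffs, weights))
```

Points in densely sampled regions receive smaller weights than isolated points, which compensates for the larger accumulated filter contribution there.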
Acquiring Point Normalization Weights
To acquire a point's normalization weight, the point-sampled model is first rendered using our two pass method without pre-filtering and per-point normalization. Then the Z-buffer and the frame buffer are read back to acquire the normalization weights. In the third pass, we traverse the LDC tree to determine the depth value and the projected position in image space of the center point of each polygon.
Based on the Z-buffer information, the visible points are detected. After rendering, the alpha channel of each frame buffer pixel stores the sum of the accumulated contributions S from all EWA splats projected to that pixel. Hence, the visible point's normalization weight is sk=1/S. To capture the normalization weights for points invisible from one view, multiple-view weight capture is applied, which can be performed automatically or interactively.
For automatic capture, a bounding sphere is constructed around the point-based model. Then point weights are acquired from different view points which are uniformly distributed on the surface of the sphere. For interactive capture, the user manually specifies a part of the point model to be acquired. In both methods, the normalization weight of the same point can be acquired several times. To get rid of noise, we select the median value as the final normalization weight.
Per-point normalization assumes that the normalization weight is the same in the small neighborhood covered by the point's polygon. For each point, the normalization weight captured at the center of the point quadrilateral is copied to its polygon vertices during rendering. The above assumption is not true, however, at the edges of the point model. In this case, we acquire the normalization weight for each vertex of the point polygon. Thus, point quadrilaterals at edges have different normalization weights for each vertex.
In the acquisition process, direct rendering of point-based models can cause overflow in the alpha channel of frame buffer pixels where the accumulation of contributions from different splats is greater than one. In this case, the point normalization weight is incorrectly computed due to clamping in the alpha channel. To solve the problem, we use a global parameter γ to avoid overflow. In our implementation, the weight capture process uses the following object space texture function:
$$f_c(u) = \sum_{k} \gamma\, w_k\, r_k(u - u_k). \tag{18}$$
By setting γ to a suitable value less than one, the accumulated contribution of the splats in a pixel stays small enough that it is not clamped. Consequently, the image rendered during normalization weight capture is darker. A typical choice is γ=0.73, which works for most point models. For a normalization weight $s'_k$ and a global parameter γ, the final point normalization weight is $s_k = s'_k \gamma$.
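The weight capture arithmetic can be sketched as follows, assuming that the alpha value read back for a visible point is the accumulated sum rendered with the global parameter γ, that the captured weight is its reciprocal, and that the final weight is the captured weight multiplied by γ. The function names are hypothetical, and the median over several views follows the noise handling described for multiple-view capture.

```python
from statistics import median

def captured_weight(alpha_sum, gamma):
    """Final weight s_k = s'_k * gamma, with s'_k = 1 / alpha_sum read from the frame buffer."""
    return gamma / alpha_sum

def final_weight(alpha_sums, gamma=0.73):
    """Median of the weights captured for the same point from several views."""
    return median(captured_weight(a, gamma) for a in alpha_sums)
```

For instance, a point whose unscaled accumulated contribution would be 1.2 overflows the alpha channel, whereas with γ = 0.73 the stored value is 0.876 and the recovered weight 0.73/0.876 equals 1/1.2.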
Effect of the Invention
Object space EWA surface splatting with per-point normalization is a desirable rendering method for high quality and interactive point rendering. It can handle several million points per second when object level culling is enabled. To improve the rendering quality further, we can combine per-point normalization and per-pixel normalization during progressive rendering. The point model is rendered by per-point normalization during user interaction, and refined by per-pixel normalization afterwards.
We also compare the performance of object space EWA surface splatting with a software implementation of image space EWA surface splatting. For a 512 output resolution, our method can render approximately 1.5 million antialiased splats per second. On the same PC, the software-only implementation of image space EWA surface splatting only renders up to 200,000 splats per second. The software renderer is also more sensitive to the output image resolution. When the image resolution is higher, the performance of the software method decreases linearly. In contrast, hardware accelerated object space EWA surface splatting is less sensitive to the output resolution.
The invention provides an object space formulation of EWA surface splatting for irregular point samples. We also provide a multi-pass approach to efficiently implement the method using vertex and pixel shaders of modern PC graphics hardware, and a pre-processing method for proper normalization of the EWA splats.
Besides increased performance, there are other advantages of using GPUs for point based rendering. While CPUs double in speed every two years, GPUs increased their performance by a factor of 11 in the last nine months. Undoubtedly, GPU performance will continue to increase faster than CPU speed in the near future. Due to their fixed-function processing, there is more room for parallelization.
Because each point is processed independently, this will linearly increase the performance of our method. Furthermore, the performance of a software implementation of EWA surface splatting drops with increased output resolution, an effect that is not nearly as serious for our hardware based implementation. Finally, using the GPU leaves the CPU free for other tasks, such as coordinated audio processing.
Although the invention has been described by way of examples of preferred embodiments, it is to be understood that various other adaptations and modifications may be made within the spirit and scope of the invention. Therefore, it is the object of the appended claims to cover all such variations and modifications as come within the true spirit and scope of the invention.
Symmetric Matrix Decomposition for Vertex Shader Implementation
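We choose the following symmetric matrix decomposition method for our vertex shader implementation. $M_k$ is rewritten as follows:
$$M_k = \mathrm{Rot}(\theta) \cdot \Lambda \cdot \Lambda \cdot \mathrm{Rot}(\theta)^T = \begin{bmatrix} A & B/2 \\ B/2 & C \end{bmatrix}.$$
Then we define
$$\mathrm{Sgn}(x) = \begin{cases} -1 & x < 0 \\ 1 & x \ge 0. \end{cases}$$
The following variables are stored in the vertex shader temporary registers:
$p = A - C$
$q = A + C$
$t = \mathrm{Sgn}(p)\sqrt{p^2 + B^2}$.
With those temporary variables, the scaling matrix can be computed as
$$\Lambda = \begin{bmatrix} r_0 & 0 \\ 0 & r_1 \end{bmatrix} = \begin{bmatrix} \sqrt{(q + t)/2} & 0 \\ 0 & \sqrt{(q - t)/2} \end{bmatrix}.$$
$\mathrm{Rot}(\theta)$ can be computed, too. If $t = 0$,
$$\mathrm{Rot}(\theta) = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},$$
else if $t \ne 0$:
$$\mathrm{Rot}(\theta) = \begin{bmatrix} \sqrt{\frac{t + p}{2t}} & -\mathrm{Sgn}(B)\,\mathrm{Sgn}(p) \sqrt{\frac{t - p}{2t}} \\ \mathrm{Sgn}(B)\,\mathrm{Sgn}(p) \sqrt{\frac{t - p}{2t}} & \sqrt{\frac{t + p}{2t}} \end{bmatrix}.$$
Square root and division operations in the above formulas can be computed efficiently using the vertex shader instructions “RSQ” and “RCP”, respectively.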
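The decomposition can be checked by recomposing the matrix from its factors. The following sketch assumes that Mk is written as [[A, B/2], [B/2, C]], that Sgn(0) = 1, and that the sign of the sine term is taken as Sgn(B)·Sgn(p); it uses ordinary square roots in place of the RSQ and RCP shader instructions. All function names are hypothetical.

```python
import math

def sgn(x):
    """Sign function with the assumed convention sgn(0) = 1."""
    return -1.0 if x < 0.0 else 1.0

def decompose(a, b, c):
    """Split M = [[a, b/2], [b/2, c]] into radii (r0, r1) and (cos, sin) of the rotation."""
    p = a - c
    q = a + c
    t = sgn(p) * math.sqrt(p * p + b * b)
    r0 = math.sqrt((q + t) / 2.0)
    r1 = math.sqrt((q - t) / 2.0)
    if t == 0.0:
        return r0, r1, 1.0, 0.0  # M is a multiple of the identity
    cos_t = math.sqrt((t + p) / (2.0 * t))
    sin_t = sgn(b) * sgn(p) * math.sqrt((t - p) / (2.0 * t))
    return r0, r1, cos_t, sin_t

def recompose(r0, r1, cos_t, sin_t):
    """Entries (m00, m01, m11) of Rot * Lambda * Lambda * Rot^T."""
    l0, l1 = r0 * r0, r1 * r1
    return (l0 * cos_t * cos_t + l1 * sin_t * sin_t,
            (l0 - l1) * cos_t * sin_t,
            l0 * sin_t * sin_t + l1 * cos_t * cos_t)
```

Because t carries the sign of p, the quantities under the square roots stay non-negative for a positive semidefinite Mk, so no trigonometric functions or branches beyond the t = 0 case are needed.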
Number | Name | Date | Kind |
---|---|---|---|
6396496 | Pfister et al. | May 2002 | B1 |
6639597 | Zwicker et al. | Oct 2003 | B1 |
6674430 | Kaufman et al. | Jan 2004 | B1 |
6744435 | Zwicker et al. | Jun 2004 | B2 |
Number | Date | Country |
---|---|---|
20040012603 A1 | Jan 2004 | US |