1. Field of the Invention
The present invention generally relates to graphics processors, and more particularly, the present invention relates to a 3D graphics pipeline in which a depth test stage is placed early in the pipeline to minimize bandwidth and/or power consumption.
2. Description of the Related Art
Graphics engines have been utilized to display three-dimensional (3D) images on fixed display devices, such as computer and television screens. These engines are typically contained in desktop systems powered by conventional AC power outlets, and thus are not significantly constrained by power-consumption limitations. A recent trend, however, is to incorporate 3D graphics engines into battery-powered hand-held devices. Examples of such devices include mobile phones and personal data assistants (PDAs). Unfortunately, conventional graphics engines consume large quantities of power and are thus not well-suited to these low-power operating environments.
In 3D graphics systems, each object to be displayed is typically divided into surface triangles defined by vertex information, although other primitive shapes can be utilized. The graphics pipeline is also typically designed to process sequential batches of triangles of an object or image. The triangles of any given batch may visually overlap one another within a given scene.
Referring to
The pixel shading stage 102 uses the vertex information to compute which pixels are encompassed by each triangle among a processed batch of triangles. Since the triangles may overlap one another, multiple pixels of differing depths may be located at the same point on a screen display. In particular, the pixel shading stage 102 interpolates the shading (lighting value), color and depth values for each pixel using the vertex information. Any of a variety of shading techniques can be adopted for this purpose, and shading operations can take place on a per-triangle, per-vertex or per-pixel basis.
The texture mapping stage 103 and texture blending stage 104 function to add and blend texture into each pixel of the processed batch of triangles. Very generally, this is done by mapping pre-defined textures onto the pixels according to the vertex information. As with shading, a variety of techniques may be adopted to achieve texturing. A technique known as fog processing may be implemented as well.
The scissor test stage 105 functions to discard pixels contained in portions (fragments) of triangles which fall outside the field of view of the displayed scene. Generally, this is done by determining whether pixels lie within a so-called scissor rectangle.
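The scissor test described above can be illustrated with a minimal Python sketch. The names `ScissorRect` and `scissor_test` are hypothetical, and the half-open rectangle convention (left and top edges inclusive, right and bottom edges exclusive) is an assumption made for the example rather than something stated in this description.

```python
from typing import NamedTuple


class ScissorRect(NamedTuple):
    """Hypothetical scissor rectangle in screen coordinates."""
    x: int
    y: int
    width: int
    height: int


def scissor_test(px: int, py: int, rect: ScissorRect) -> bool:
    """Return True if pixel (px, py) lies inside the scissor rectangle.

    Assumption: the rectangle is half-open, so the pixel at
    (x + width, y + height) is outside.
    """
    return (rect.x <= px < rect.x + rect.width
            and rect.y <= py < rect.y + rect.height)


rect = ScissorRect(x=0, y=0, width=10, height=10)
pixels = [(0, 0), (5, 5), (10, 10), (9, 9)]
# Pixels failing the test are discarded before any later stage sees them.
kept = [p for p in pixels if scissor_test(p[0], p[1], rect)]
```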
The alpha test stage 106 conditionally discards a fragment of a triangle (more precisely, pixels contained in the fragment) based on a comparison between an alpha value (transparency value) associated with the fragment and a reference alpha value. Similarly, the stencil test conditionally discards fragments based on a comparison between each fragment and a stored stencil value.
The depth test stage 108 (also called Hidden Surface Removal (HSR)) discards pixels contained in triangle fragments based on a depth value of the pixels and a depth value of other pixels having the same display location. Generally, this is done by comparing a z-axis value (depth value) of a pixel undergoing the depth test with a z-axis value stored in a corresponding location of a so-called z-buffer or depth buffer. The tested pixel is discarded if the z-axis value thereof indicates that the pixel would be blocked from view by another pixel having the z-axis value stored in the z-buffer. On the other hand, the z-buffer value is overwritten with the z-axis value of the tested pixel in the case where the tested pixel would not be blocked from view. In this manner, underlying pixels which are blocked from view are discarded in favor of overlying pixels.
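The z-buffer comparison described above may be sketched as follows. This is an illustrative model only: the function name `depth_test`, the 4x4 buffer size, and the convention that a smaller z-axis value is nearer to the viewer are all assumptions made for the example.

```python
FAR = 0xFFFF  # assumed initial value: farthest representable 16-bit depth


def depth_test(zbuf, x, y, z):
    """Compare z against the stored depth at (x, y).

    Assumption: smaller z means nearer to the viewer. A visible pixel
    overwrites the stored value; a hidden pixel is discarded.
    """
    if z < zbuf[y][x]:
        zbuf[y][x] = z  # tested pixel is not blocked: update the z-buffer
        return True
    return False        # tested pixel is blocked from view: discard


zbuf = [[FAR] * 4 for _ in range(4)]
# Three pixels from overlapping triangles land on the same display location.
results = [depth_test(zbuf, 1, 2, z) for z in (500, 800, 300)]
```

In this run the second pixel (z = 800) lies behind the first and is discarded, while the third (z = 300) lies in front and replaces the stored value.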
The alpha blending stage 109 combines rendered pixels with pixels previously stored in a color buffer based on alpha values to achieve transparency of an object.
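As a hedged illustration of alpha blending, the sketch below uses the common "source over destination" weighting, in which the rendered color is weighted by alpha and the stored color by one minus alpha. This description does not specify a blending equation, so the formula and the name `alpha_blend` are assumptions.

```python
def alpha_blend(src, dst, alpha):
    """Blend a rendered color over a stored color, channel by channel.

    Assumed equation: out = alpha * src + (1 - alpha) * dst,
    with alpha in [0, 1] and colors given as (r, g, b) floats.
    """
    return tuple(alpha * s + (1.0 - alpha) * d for s, d in zip(src, dst))


# A 25%-opaque red pixel rendered over a blue pixel in the color buffer.
blended = alpha_blend((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), 0.25)
```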
The logical operations unit 110 generically denotes miscellaneous remaining processes of the pipeline for ultimately obtaining pixel display data.
In any graphics system, it is desired to conserve processor and memory bandwidth to the extent possible while maintaining satisfactory performance. This is especially true in the case of portable or hand-held devices where bandwidths may be limited. Also, as suggested previously, there is a particular demand in the industry to minimize power consumption when processing 3D graphics for display on portable or hand-held devices.
According to one aspect of the present invention, a graphics pipeline is provided for processing pixel data and includes a plurality of sequentially arranged processing stages which render display pixel data from input primitive object data, where the processing stages include at least a texturing stage and a depth test stage, and wherein the depth test stage is located earlier in the graphics pipeline than the texturing stage.
According to another aspect of the present invention, a graphics pipeline is provided for processing pixel data, and includes a plurality of sequentially arranged processing stages which render display data from input primitive object data. The processing stages include at least a texturing stage, an alpha test stage and a depth test stage. Further, the pipeline is dynamically reordered between at least first and second stage sequences according to an alpha test state of processed pixel data. In the first stage sequence, the depth test stage is functionally located earlier in the graphics pipeline than the texturing stage. In the second stage sequence, the depth test stage is functionally located after the texturing stage and the alpha test stage.
According to still another aspect of the present invention, a graphics pipeline is provided for processing pixel data, and includes a depth buffer which stores depth values, and a depth test stage which compares a current depth value of a processed pixel with a previous depth value stored in the depth buffer, and which issues a write command to overwrite the previous depth value with the current depth value based on a comparison result. The graphics pipeline further includes write defer circuitry which temporarily defers the write command issued by the depth test stage, a texturing stage which receives the processed pixel after processing by the depth test stage, and an alpha test stage which receives the processed pixel after processing by the texturing stage. The write defer circuitry is responsive to the alpha test stage to either ignore or execute the deferred write command issued by the depth test stage.
The above and other aspects of the invention will become readily apparent from the detailed description that follows, with reference to the accompanying drawings, in which:
The present invention is at least partially characterized by placing the depth test stage early in the graphics pipeline to minimize power and bandwidth consumption of later pipeline stages. The depth test functions to discard pixels which would not be visible because they are hidden from view by overlying pixels. Thus, by moving the depth test to an early position in the pipeline, hidden pixels are discarded in advance of processing by later high bandwidth and high power-consuming pipeline stages. As such, pipeline resources are not expended on the discarded pixels.
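The saving obtained by testing depth before texturing can be made concrete with a small counting model. The sketch below is illustrative only: `count_textured` is a hypothetical helper, fragments are reduced to (pixel index, depth) pairs, and a smaller depth value is assumed to be nearer to the viewer.

```python
def count_textured(fragments, early_depth):
    """Count texturing operations for a stream of (pixel, z) fragments.

    early_depth=True  models the depth test placed before texturing.
    early_depth=False models the depth test placed after texturing.
    Assumption: smaller z is nearer to the viewer.
    """
    zbuf = {}
    textured = 0
    for pixel, z in fragments:
        stored = zbuf.get(pixel, float("inf"))
        if early_depth:
            if z >= stored:
                continue       # hidden pixel discarded before texturing
            zbuf[pixel] = z
            textured += 1      # only visible pixels reach the texturing stages
        else:
            textured += 1      # every pixel is textured first
            if z < stored:
                zbuf[pixel] = z
    return textured


fragments = [(0, 10), (0, 20), (0, 5), (1, 7), (1, 9)]
late = count_textured(fragments, early_depth=False)
early = count_textured(fragments, early_depth=True)
```

With this input, the late arrangement textures all five fragments, whereas the early arrangement textures three. The size of the saving depends on the order in which overlapping triangles arrive, since a hidden pixel can only be rejected if the pixel covering it has already been processed.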
The present invention is also at least partially characterized by optionally accommodating alpha testing while positioning the depth test early in the 3D graphics pipeline. This may be done by dynamically reordering the pipeline depending on whether alpha testing has been enabled, or by deferring writing of the depth test results until the outcome of alpha testing can be established.
The present invention will now be described by way of several preferred but non-limiting embodiments. The 3D graphics pipelines described below are for rendering display pixel data from input primitive object data and may be incorporated in appropriately configured 3D graphics engines.
The operations of the respective pipeline stages shown in
According to the configuration of
The embodiment of the example of
The pixel operations of the respective pipeline stages shown in
The operational sequence of stages of the pipeline of
In the case where alpha testing is enabled for a pixel, the stages progress as shown by reference “b” of
Generally, a graphics pipeline includes control bits which are shifted down the pipeline together with the pixel data. One of those control bits is the alpha test bit. Also generally, when a batch of triangles is applied to the pipeline, each triangle of the batch will have the same alpha test setting.
Assume that alpha testing is initially disabled, and accordingly, the pipeline of
The pixel operations of the respective pipeline stages shown in
The condition “AT=0” of
When AT≠0, alpha testing is enabled for the pixels being processed. In that case, multiplexer 404 selects the output from the alpha test stage 409 and applies the same to the depth test stage 405; multiplexer 406 selects the output from the scissor test stage 403 and applies the same to the texture mapping stage 407; and multiplexer 410 selects the output from depth test stage 405 and applies the same to the alpha blending stage 411. Thus, when AT≠0, the pipeline sequence is as follows: triangle setup stage 401→pixel shading stage 402→scissor test stage 403→texture mapping stage 407→texture blending stage 408→alpha test stage 409→depth test stage 405→alpha blending stage 411→logical operations stage 412.
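The multiplexer-based reordering can be summarized in a short Python sketch. The AT≠0 order below is taken directly from the sequence listed above. The AT=0 order is an assumption inferred from the summary (depth test ahead of the texturing stages when alpha testing is disabled), and the function name `stage_sequence` is hypothetical.

```python
HEAD = ["triangle setup 401", "pixel shading 402", "scissor test 403"]
DEPTH = ["depth test 405"]
TEXTURING = ["texture mapping 407", "texture blending 408", "alpha test 409"]
TAIL = ["alpha blending 411", "logical operations 412"]


def stage_sequence(at):
    """Return the functional stage order selected by the alpha test state AT.

    AT != 0: depth test follows texturing and alpha testing (as listed above).
    AT == 0: depth test precedes texturing (assumed order, inferred from the
             summary; the alpha test stage is assumed to pass pixels through).
    """
    middle = DEPTH + TEXTURING if at == 0 else TEXTURING + DEPTH
    return HEAD + middle + TAIL


alpha_enabled = stage_sequence(at=1)
alpha_disabled = stage_sequence(at=0)
```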
In each of the embodiments of
Illustrated in
In operation, a processed pixel arrives via the pipeline at the depth test stage 501, and the z-axis value (depth value) of the pixel is compared with a z-axis value stored in a corresponding location of the depth buffer 507. This is done by transmitting a read address [addr_r(14:0)] to the depth buffer 507 via the buffer interface 506, and receiving a depth buffer z-axis value [depth_r(15:0)] stored in the depth buffer 507. The z-axis value of the pixel and the z-axis value of the depth buffer are compared, and if the comparison result indicates that the pixel would not be visible, the pixel is effectively discarded. On the other hand, if the comparison result suggests that the pixel would be visible, a deferred buffer write process is executed as described below.
Assuming the FIFO circuit 505 is not full as indicated by the signal [fifo_full], the deferred buffer write process is carried out by issuing a FIFO write command [fifo_write], and then writing a buffer write address signal [addr_w(14:0)], a new pixel z-axis value [depth_w(15:0)], and an alpha test signal [alpha_test] to the FIFO circuit 505. The buffer write address signal [addr_w(14:0)] is indicative of the corresponding location of the depth buffer 507 of the processed pixel. The new pixel z-axis value [depth_w(15:0)] is the z-axis value of the processed pixel (which has passed the depth test). The alpha test signal [alpha_test] indicates whether alpha testing has been enabled for the processed pixel.
As the buffer write address signal [addr_w(14:0)], the new pixel z-axis value [depth_w(15:0)], and the alpha test signal [alpha_test] are shifted through the FIFO circuit 505, the processed pixel is simultaneously subjected to texture mapping and texturing by the texture mapping stage 502 and the texturing stage 503, respectively. The depth “n” of the FIFO circuit 505 is preferably equal to the sum of the pixel capacities of the pipeline stages interposed between the depth test stage 501 and the alpha test stage 504. In this embodiment, the depth of the FIFO circuit 505 is equal to the sum of the pixel capacities of the texture mapping stage 502 and the texturing stage 503. As a result, a pixel's entry will have worked its way to the end of the FIFO circuit 505 at a timing which is coincident with the processing of the pixel by the texture mapping stage 502 and the texturing stage 503.
After texture mapping and texturing, the processed pixel is then applied to the alpha test stage 504. At this time, the pixel is either alpha test enabled or not alpha test enabled.
If the pixel is not alpha test enabled, the pixel is transmitted down the pipeline for further processing, and a [valid_pixel] signal is transmitted to the depth buffer interface 506. Assuming the FIFO circuit 505 is not empty as indicated by the signal [fifo_empty], the depth buffer interface is responsive to the [valid_pixel] signal to issue a read command [fifo_read] to the FIFO circuit 505, and to read the buffer write address signal [addr_w(14:0)], the new pixel z-axis value [depth_w(15:0)], and the alpha test signal [alpha_test]. The depth buffer interface 506 then updates the depth buffer 507 by addressing the depth buffer 507 at the address [addr_w(14:0)], and overwriting the new pixel z-axis value [depth_w(15:0)] into the depth buffer 507.
If the pixel is alpha test enabled, the alpha test stage 504 compares the alpha value of the processed pixel with a reference value. If the pixel passes the alpha test, the [alpha_pass] signal is transmitted to the depth buffer interface 506, and the pixel is transmitted down the pipeline for further processing. If the pixel fails the alpha test, the [alpha_fail] signal is transmitted to the depth buffer interface 506, and the pixel is effectively discarded.
Again, assuming the FIFO circuit 505 is not empty as indicated by the signal [fifo_empty], the depth buffer interface 506 is responsive to the [alpha_pass] and [alpha_fail] signals to issue a read command [fifo_read] to the FIFO circuit 505, and to read the buffer write address signal [addr_w(14:0)], the new pixel z-axis value [depth_w(15:0)], and the alpha test signal [alpha_test]. If the [alpha_fail] signal is active, the depth buffer interface 506 does not update the depth buffer 507 with the new pixel z-axis value [depth_w(15:0)]. On the other hand, if the [alpha_pass] signal is active, the depth buffer interface 506 updates the depth buffer 507. That is, the depth buffer interface 506 addresses the depth buffer 507 at the address [addr_w(14:0)], and writes the new pixel z-axis value [depth_w(15:0)] into the depth buffer 507.
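The deferred buffer write process can be modeled in software as a behavioral sketch. The class `DeferredDepthWriter` and its methods are hypothetical names introduced for illustration; the sketch is not cycle-accurate, assumes that a smaller z-axis value is nearer to the viewer, and does not model pixels at the same address being in flight at the same time.

```python
from collections import deque


class DeferredDepthWriter:
    """Behavioral model (hypothetical) of depth test 501, FIFO 505,
    depth buffer interface 506 and depth buffer 507."""

    def __init__(self, size, fifo_depth):
        self.depth_buffer = [0xFFFF] * size  # 16-bit depth values, assumed far
        self.fifo = deque()
        self.fifo_depth = fifo_depth         # "n": stages between 501 and 504

    def depth_test(self, addr, z, alpha_test):
        """Test the pixel; on a pass, queue the write instead of performing it."""
        if z >= self.depth_buffer[addr]:
            return False                     # not visible: pixel discarded
        if len(self.fifo) >= self.fifo_depth:
            raise RuntimeError("fifo_full: the pipeline would have to wait")
        self.fifo.append((addr, z, alpha_test))  # fifo_write
        return True

    def resolve(self, alpha_pass=True):
        """Called when the pixel reaches the alpha test stage 504.

        The deferred write is executed for a pixel that is not alpha test
        enabled (valid_pixel) or that passes (alpha_pass), and is ignored
        for a pixel that fails (alpha_fail).
        """
        if not self.fifo:
            raise RuntimeError("fifo_empty: no deferred write to resolve")
        addr, z, alpha_test = self.fifo.popleft()  # fifo_read
        if not alpha_test or alpha_pass:
            self.depth_buffer[addr] = z
            return True
        return False


writer = DeferredDepthWriter(size=8, fifo_depth=2)
writer.depth_test(addr=3, z=100, alpha_test=True)
after_fail = (writer.resolve(alpha_pass=False), writer.depth_buffer[3])
writer.depth_test(addr=3, z=200, alpha_test=True)
after_pass = (writer.resolve(alpha_pass=True), writer.depth_buffer[3])
writer.depth_test(addr=3, z=150, alpha_test=False)
after_valid = (writer.resolve(), writer.depth_buffer[3])
```

In this run, the first pixel passes the depth test but fails the alpha test, so the depth buffer keeps its initial value; the following two pixels have their deferred writes executed.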
The implementation described above in connection with
By deferring the updating of the depth buffer pending the outcome of alpha testing as in the embodiment of
In the drawings and specification, there have been disclosed typical preferred embodiments of this invention and, although specific examples are set forth, they are used in a generic and descriptive sense only and not for purposes of limitation. It should therefore be understood that the scope of the present invention is to be construed by the appended claims, and not by the exemplary embodiments.
A claim of priority is made to U.S. provisional application Ser. No. 60/550,018, filed Mar. 3, 2004, and to U.S. provisional application Ser. No. 60/550,024, filed Mar. 3, 2004, the entire contents of which are incorporated herein by reference.
Number | Date | Country
---|---|---
60550018 | Mar 2004 | US
60550024 | Mar 2004 | US