The present invention relates to graphics processing and more specifically to the removal of non-visible render objects prior to rendering.
In a typical graphics processing system, inefficiencies arise from rendering graphic elements, such as pixels, which are not visible to an end user. As the resolution of a graphical display increases, the amount of graphics rendering increases as well. Therefore, to reduce processing overhead, techniques exist for eliminating rendering elements before they are processed by a graphics processing pipeline.
For example, one technique is hierarchical Z buffering, in which a rendering element is compared in a depth test against other rendering elements within a display screen. Another technique determines whether a rendering element falls within a view frustum such that it would be visible within the boundaries of the graphical output.
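By way of illustration only, a view frustum determination of the kind mentioned above may be sketched as follows. The sketch assumes a bounding sphere tested against inward-facing unit-normal planes; the names `Plane`, `sphere_outside_frustum`, and `BOX_PLANES` are hypothetical and are not taken from any described system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plane:
    # Hypothetical plane: points with nx*x + ny*y + nz*z + d >= 0 are inside.
    # The normal (nx, ny, nz) is assumed to have unit length.
    nx: float
    ny: float
    nz: float
    d: float


def sphere_outside_frustum(center, radius, planes):
    """Return True only when the sphere lies entirely outside some plane."""
    x, y, z = center
    for p in planes:
        signed_distance = p.nx * x + p.ny * y + p.nz * z + p.d
        if signed_distance < -radius:
            return True  # fully behind this plane, so it can be culled
    return False  # inside or straddling: keep as potentially visible


# Box-shaped stand-in for a frustum: -1 <= x, y <= 1 and 0 <= z <= 10.
BOX_PLANES = [
    Plane(1, 0, 0, 1), Plane(-1, 0, 0, 1),
    Plane(0, 1, 0, 1), Plane(0, -1, 0, 1),
    Plane(0, 0, 1, 0), Plane(0, 0, -1, 10),
]
```

The test is conservative: a sphere that straddles a plane is kept rather than culled.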
A typical graphics processing system provides for culling decisions to be made by graphics hardware and a central processing unit (CPU). Prior art systems utilize a CPU-based bounding system which defines areas, such as the view frustum, in the CPU. These systems then perform a test to determine whether a draw packet, such as a plurality of pixels, is rendered as a function of a depth test or other visibility determination. Prior solutions, however, require the rasterization of pixels to determine a Z occlusion of pixels for the depth determination. For example, where a wall bears a feature that may be visible through a doorway in a graphic output, prior systems require portals to determine visibility into the other room. Typically, the CPU is unable to detect a divider with an object behind it relative to the defined viewing portals.
Therefore, in prior graphics rendering systems, culling decisions are difficult to make because of the synchronization required between the central processing unit and the associated hardware to determine pre-computed factors for making further visibility determinations. For example, the central processing unit would require feedback from the hardware regarding the defined parameters for a viewing portal to determine whether draw packets having a depth beyond the portal are visible and worth rendering or should be culled from the rendering pipeline.
Therefore, there exists a need for a graphics processing system which allows for object-based visibility culling.
Generally, the present invention includes a method and apparatus for object-based visibility culling, including receiving a plurality of draw packets. As discussed above, a draw packet may be a plurality of rendering elements, such as pixels, vertices, or any other suitable rendering element as recognized by one having ordinary skill in the art. The method and apparatus further include comparing each of the plurality of draw packets to a bounding volume object, wherein the bounding volume object may be a low resolution geometric representation of a specific object, such as a window, doorway, or any other suitable portal through which viewing definitions may be defined. For each of the plurality of draw packets, if the draw packet is deemed potentially visible, a visibility query identifier is set, and the draw packets having the set visibility query identifier are rendered. In one embodiment, the visibility query identifier may be a single-bit or multi-bit indicator which indicates that the draw packet has been deemed potentially visible and therefore warrants further rendering within a processing pipeline.
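By way of illustration only, the general method may be sketched as follows. The sketch assumes, purely for the example, that each draw packet and the bounding volume object are reduced to screen-space rectangles and that the comparison is a rectangle overlap test; `DrawPacket`, `rects_overlap`, and `cull_and_render` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class DrawPacket:
    name: str
    bounds: tuple  # hypothetical screen-space rectangle (x0, y0, x1, y1)
    visibility_query_identifier: bool = False  # single-bit indicator


def rects_overlap(a, b):
    """Stand-in comparison: True when two rectangles share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def cull_and_render(packets, bounding_volume, render):
    """Set the identifier on potentially visible packets, then render them."""
    for packet in packets:
        packet.visibility_query_identifier = rects_overlap(
            packet.bounds, bounding_volume
        )
    rendered = []
    for packet in packets:
        if packet.visibility_query_identifier:
            render(packet)
            rendered.append(packet.name)
    return rendered
```

For instance, with a doorway rectangle of `(0, 0, 10, 10)`, a packet at `(2, 2, 4, 4)` is rendered while a packet at `(20, 20, 30, 30)` is culled.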
More specifically,
In the embodiment of
It is also noted,
In one embodiment, the graphics processing unit 100 determines, based on the results of, among other things, back-face culling, view frustum determination, user-clip plane discard, and hierarchical Z discard, whether any pixels are potentially modified by the geometry between the begin/end of the visibility query. The determination resulting from step 154 is a not-visible/potentially visible determination. Step 154 does not indicate whether a draw packet will in fact be rendered visible, but rather only whether a draw packet is definitely not visible due to some occlusion.
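By way of illustration only, the not-visible/potentially visible determination may be sketched as a conservative combination of discard tests. The tile dictionary keys, the depth convention (smaller values are nearer), and the names `potentially_visible` and `DISCARD_TESTS` are assumptions made for this example.

```python
def potentially_visible(tiles, discard_tests):
    """Return True if any tile survives every discard test.

    A False result means the query geometry is definitely not visible.
    A True result only means it might be visible.
    """
    return any(
        not any(test(tile) for test in discard_tests)
        for tile in tiles
    )


# Hypothetical per-tile discard tests mirroring the stages named above.
DISCARD_TESTS = [
    lambda t: t["back_facing"],                      # back-face culling
    lambda t: not t["in_frustum"],                   # view frustum determination
    lambda t: t["user_clipped"],                     # user-clip plane discard
    lambda t: t["nearest_z"] > t["hiz_farthest_z"],  # hierarchical Z discard
]
```

Because a single surviving tile is enough, the result errs toward rendering rather than toward culling.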
Therefore, the next step, step 156, of the method is, for each of the plurality of draw packets, if the draw packet is deemed potentially visible, setting a visibility query identifier. The next step, step 158, is then rendering the draw packet having the set visibility query identifier. As discussed with regards to
Whereupon, for each of the plurality of draw packets, if the draw packet is deemed potentially visible, the method includes setting a visibility query identifier, step 178, similar to step 156 of
In one embodiment, there may be up to 64 independent visible query status values to allow multiple visible query geometries to be drawn. The noted 64 independent visible query status values are for exemplary purposes only, and any suitable number of independent visible queries may be utilized. In the present invention, there exists a potential internal latency of a pre-determined number of core clock cycles, to allow the visibility query geometry to finish past the hierarchical Z discard before the not-visible status can be determined. Therefore, if a conditional rendering packet, such as a draw packet, is received before the corresponding visible query geometry, the CP will wait until the visibility query results have been returned to continue processing. Providing multiple independent visible query status values may therefore help to hide the internal latency. In one embodiment, the graphics processing unit 100 of
In one embodiment, a driver, which may be implemented in software operating on a processor, hardware, or any combination thereof, sets the VIZ_QUERY_ENABLE bit and the VIZ_QUERY_ID field using a set_state command and/or incremental updates to these states. The driver may send a VIZ_QUERY_BEGIN_PKT which contains the VIZ_QUERY_ID upon processing a begin visibility query. Moreover, the driver may send a VIZ_QUERY_END_PKT which contains the VIZ_QUERY_ID upon processing the end visibility query. Furthermore, the driver may set up a modified DRAW_INDX packet, which will include a USER_QUERY_RESULT with the VIZ_QUERY_ID.
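By way of illustration only, the packet sequence a driver might emit for one visibility query may be sketched as follows. Packets are modeled as plain dictionaries; this layout, the `SET_STATE` type label, the `geometry` field, and the function name `build_query_stream` are assumptions for the example and do not describe an actual packet format.

```python
def build_query_stream(query_id, bounding_draws, conditional_draws):
    """Return the packets a driver might emit for one visibility query."""
    stream = [
        # State update enabling the query and selecting its ID.
        {"type": "SET_STATE", "VIZ_QUERY_ENABLE": 1, "VIZ_QUERY_ID": query_id},
        {"type": "VIZ_QUERY_BEGIN_PKT", "VIZ_QUERY_ID": query_id},
    ]
    # Query geometry drawn between the begin and end packets.
    stream += [{"type": "DRAW_INDX", "geometry": g} for g in bounding_draws]
    stream.append({"type": "VIZ_QUERY_END_PKT", "VIZ_QUERY_ID": query_id})
    # Modified draw packets that reference the query result.
    stream += [
        {
            "type": "DRAW_INDX",
            "geometry": g,
            "USER_QUERY_RESULT": 1,
            "VIZ_QUERY_ID": query_id,
        }
        for g in conditional_draws
    ]
    return stream
```

A call such as `build_query_stream(7, ["doorway_box"], ["table", "lamp"])` yields a state update, a begin packet, one query draw, an end packet, and two conditional draws carrying query ID 7.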
As there are multiple query results and the query results may span multiple draw commands, the driver manages the VIZ_QUERY_IDs across multiple driver contexts. In one embodiment, shared resources are provided which can be called by the individual driver contexts to allocate and de-allocate IDs from a common pool of QUERY_IDs. If the pool is empty, then a null QUERY_ID will be returned, indicating that the VIZ_QUERY is not currently available. Furthermore, as the VIZ_QUERY begin/end may span multiple draw packets, it may further span driver context switches. Therefore, the driver includes the VIZ_QUERY_ENABLE in a command preamble. If the VIZ_QUERY_ENABLE is set, then the VIZ_QUERY_ID must also be included in the preamble.
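By way of illustration only, the shared pool of QUERY_IDs and the command preamble rule may be sketched as follows. The class `QueryIdPool`, the function `command_preamble`, the use of `None` as the null QUERY_ID, and the dictionary form of the preamble are assumptions for the example. A real shared resource would also need locking across driver contexts, which is omitted here.

```python
from collections import deque

NULL_QUERY_ID = None  # hypothetical representation of the null QUERY_ID


class QueryIdPool:
    """Common pool of QUERY_IDs shared by driver contexts."""

    def __init__(self, size=64):
        self._size = size
        self._free = deque(range(size))

    def allocate(self):
        """Return a free ID, or the null ID when the pool is empty."""
        if not self._free:
            return NULL_QUERY_ID
        return self._free.popleft()

    def release(self, query_id):
        """Return an ID to the pool so that another context may use it."""
        valid = isinstance(query_id, int) and 0 <= query_id < self._size
        if not valid or query_id in self._free:
            raise ValueError("invalid or already free QUERY_ID")
        self._free.append(query_id)


def command_preamble(viz_query_enable, viz_query_id=NULL_QUERY_ID):
    """Build a preamble; an enabled query must carry its ID."""
    preamble = {"VIZ_QUERY_ENABLE": int(bool(viz_query_enable))}
    if viz_query_enable:
        if viz_query_id is NULL_QUERY_ID:
            raise ValueError("VIZ_QUERY_ID is required when enabled")
        preamble["VIZ_QUERY_ID"] = viz_query_id
    return preamble
```

A driver context that receives the null ID can fall back to unconditional rendering, since the visibility query is simply unavailable at that moment.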
Referring back to the CP 108 of
In one embodiment, when the VIZQ_END flag is set, the CP 108 sets the corresponding END_RCVD bit, which will stall the next visibility query begin command until the status of the current visibility query command is received from the SC 110. Furthermore, the CP 108 creates a visibility query end event, including writing the VGT_EVENT_INITIATOR with the corresponding identifier to a processor, such as the VGT 112. Thereupon, the visibility results are sent back to the CP 108 through the dedicated interface 120 from the SC 110 such that the CP 108 clears the corresponding END_RCVD bit for the visibility query and sets the DISCARD bit to the value provided by the SC 110.
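By way of illustration only, the handling of the END_RCVD and DISCARD bits by the CP 108 may be sketched as a small state holder. The class and method names are hypothetical, event generation toward the VGT 112 is not modeled, and clearing DISCARD on a begin command is an assumption drawn from the begin handling described elsewhere in this description.

```python
class CommandProcessorState:
    """Per-query END_RCVD and DISCARD bits kept by a command processor."""

    def __init__(self, num_queries=64):
        self.end_rcvd = [False] * num_queries
        self.discard = [False] * num_queries

    def on_begin(self, query_id):
        """Accept a begin command, or return False if it must stall."""
        if self.end_rcvd[query_id]:
            return False  # results of the previous query are still pending
        self.discard[query_id] = False
        return True

    def on_end(self, query_id):
        """Record that the end command has been received."""
        self.end_rcvd[query_id] = True

    def on_result(self, query_id, discard_value):
        """Apply the status returned by the scan converter."""
        self.end_rcvd[query_id] = False
        self.discard[query_id] = bool(discard_value)
```

In this sketch a stalled begin command is simply reported to the caller, which would retry once `on_result` has cleared the END_RCVD bit.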
In the event the draw packet is determined to be potentially visible, the DISCARD bit is cleared and the CP 108 issues a visible query begin event, in one embodiment, by writing a VGT_EVENT_INITIATOR register with an EVENT_ID. Furthermore, the SC 110 resets its visibility results for the associated visible query draw packet. For a VIZ_QUERY_END packet, the CP 108, in one embodiment, sets a corresponding END_RCVD bit for that ID. This stalls the next visible query begin packet until the visibility status is returned from the SC 110. The visibility results are sent back to the CP 108 from the SC 110 via, in one embodiment, a dedicated interface, such as connection 120 of
Furthermore, in one embodiment, the SC 110 uses the VIZ_QUERY_ENABLE and VIZ_QUERY_ID that are within a state sub-block. The SC 110 maintains an internal set of visible bits, one bit for each of the 64 VIZ_QUERIES in this embodiment. Moreover, the visible bits may be read/write accessible via a memory map register, not illustrated in
Step 208, the CP 108 sends a VIZ_QUERY_BEGIN command to clear the SC_VISIBLE_X bit. Driver B 106 sets a VIZ_QUERY_ENABLE and VIZ_QUERY_ID bit equal to a value Y, step 210. Step 212, driver B 106 submits a VIZ_QUERY_BEGIN to the command processor 108. Thereupon, step 214, the command processor sets DISCARD_Y bit to a zero value and END_RCVD_Y bit value to a zero.
The command processor 108 sends the VIZ_QUERY_BEGIN command to clear the SC_VISIBLE_Y bit within the scan converter 110, step 216. At that point, step 218, driver B 106 submits a plurality of draw packets 102. Step 220, the scan converter 110 performs visibility testing and updates SC_VISIBLE_X if any tiles, draw packets, relative to the visibility query for draw packets X, are deemed visible.
Driver A 104 thereupon sets a VIZ_QUERY_ENABLE and a VIZ_QUERY_ID bit to be equivalent to the value X, step 222. The command processor 108 sets an END_RCVD_X bit and creates a VIZ_QUERY_END event, step 224. Step 226, the scan converter 110 receives the VIZ_QUERY_END packet and sends results to the command processor 108.
The command processor discards only non-visible draw packets, step 228. Driver B thereupon sets a VIZ_QUERY_ENABLE and a VIZ_QUERY_ID value equal to the value Y, step 230. Driver B submits a plurality of draw packets relative to the associated ID Y, step 232. The scan converter 110 performs visibility testing and updates the SC_VISIBLE_Y value to determine if any tiles, draw packets, are visible relative to the bounding volume object, step 234.
The command processor 108 thereupon sets an END_RCVD_Y bit and creates a VIZ_QUERY_END event, step 236. Step 238, the scan converter 110 receives the VIZ_QUERY_END packet across dedicated connection 120 and sends the results to the command processor 108. Thereupon, the command processor 108 discards only non-visible draw packets, step 240. As such, the method is complete, step 242.
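By way of illustration only, the interleaved handling of two visibility queries, X and Y, may be sketched as follows. The sketch collapses the drivers, the command processor 108, and the scan converter 110 into plain function calls, keeps one visible bit per query in an integer bitmask, and takes the per-tile test outcomes as inputs. `ScanConverter`, `run_queries`, and the input layout are hypothetical.

```python
class ScanConverter:
    """Keeps one visible bit per query ID in a bitmask."""

    def __init__(self):
        self.visible_bits = 0

    def begin(self, query_id):
        self.visible_bits &= ~(1 << query_id)  # clear SC_VISIBLE for this ID

    def test(self, query_id, tile_results):
        if any(tile_results):                  # any tile deemed visible
            self.visible_bits |= 1 << query_id

    def end(self, query_id):
        return bool((self.visible_bits >> query_id) & 1)


def run_queries(queries):
    """queries maps a query ID to (tile_results, conditional_packets)."""
    sc = ScanConverter()
    discard = {}
    for query_id in queries:                   # begin commands for X and Y
        sc.begin(query_id)
        discard[query_id] = False
    rendered = []
    for query_id, (tile_results, packets) in queries.items():
        sc.test(query_id, tile_results)        # visibility testing
        discard[query_id] = not sc.end(query_id)
        if not discard[query_id]:              # discard only non-visible packets
            rendered.extend(packets)
    return rendered
```

For instance, `run_queries({5: ([False, True], ["x1", "x2"]), 9: ([False, False], ["y1"])})` keeps the packets tied to query 5 and discards the packet tied to query 9.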
As further noted, the command processor 108 may further provide for the rendering of the draw packets which have been deemed potentially visible, having an SC_VISIBLE bit set based on the operations of the scan converter relative to the bounding volume object.
As such, the present invention provides for graphics processing by the effective utilization of object-based visibility culling, determining which draw packets are definitely not visible relative to a bounding volume object. Through the utilization of the command processor 108 and the scan converter 110 relative to at least one driver, such as drivers 104 and/or 106, operations may be performed to provide for an early determination and effective culling of draw packets which are deemed not visible. Moreover, the command processor 108 performs a further comparison step for rendering only those draw packets which have been determined through a visibility query to be potentially visible.
It should be understood that the implementation of other variations and modifications of the invention in its various aspects will be apparent to those of ordinary skill in the art, and that the invention is not limited by the specific embodiments described herein. For example, the graphics processing unit, the command processor 108, the scan converter 110 and the drivers may be disposed on one or more processors executing executable instructions. Moreover, the scan converter 110 may be coupled to memory devices for storing additional culling-based information to provide for a greater degree of determination of non-visible draw packets. It is therefore contemplated to cover by the present invention any and all modifications, variations, or equivalents that fall within the spirit and scope of the basic underlying principles disclosed and claimed herein.