Method and system for implementing fragment operation processing across a graphics bus interconnect

Information

  • Patent Grant
  • Patent Number
    9,092,170
  • Date Filed
    Tuesday, October 18, 2005
  • Date Issued
    Tuesday, July 28, 2015
  • Field of Search
    • US
    • 345/422
    • 345/501
    • 345/502
    • 345/503
    • 345/506
    • 345/520
    • 345/522
    • 345/531
    • 345/532
    • 345/568
    • 345/582
    • 710/3
    • 710/107
    • 710/126
    • 710/128
    • 710/129
    • 710/131
    • 710/132
    • 710/305
    • 710/306
    • 710/310
    • 710/316
    • 710/317
    • 711/2
    • 711/154
    • 711/206
  • International Classifications
    • G06F13/14
    • G06F15/16
    • G06F3/12
    • Term Extension
      920 days
Abstract
A method and system for cooperative graphics processing across a graphics bus in a computer system. The system includes a bridge coupled to a system memory via a system memory bus and coupled to a graphics processor via the graphics bus. The bridge includes a fragment processor for implementing cooperative graphics processing with the graphics processor coupled to the graphics bus. The fragment processor is configured to implement a plurality of raster operations on graphics data stored in the system memory.
Description
FIELD OF THE INVENTION

The present invention is generally related to graphics computer systems.


BACKGROUND OF THE INVENTION

Generally, a computer system suited to handle 3D image data includes a specialized graphics processor unit, or GPU, in addition to a traditional CPU (central processing unit). The GPU includes specialized hardware configured to handle 3D computer-generated objects. The GPU is configured to operate on a set of data models and their constituent “primitives” (usually mathematically described polygons) that define the shapes, positions, and attributes of the objects. The hardware of the GPU processes the objects, implementing the calculations required to produce realistic 3D images on a display of the computer system.


The performance of a typical graphics rendering process is highly dependent upon the performance of the system's underlying hardware. High performance real-time graphics rendering requires high data transfer bandwidth to the memory storing the 3D object data and the constituent primitives. Thus, more expensive prior art GPU subsystems (e.g., GPU equipped graphics cards) typically include larger (e.g., 128 MB or larger) specialized, expensive, high bandwidth local graphics memories for feeding the required data to the GPU. Less expensive prior art GPU subsystems include smaller (e.g., 64 MB or less) such local graphics memories, and some of the least expensive GPU subsystems have no local graphics memory.


A problem with prior art low-cost GPU subsystems (e.g., those having smaller amounts of local graphics memory) is that the data transfer bandwidth to the system memory, or main memory, of a computer system is much lower than the data transfer bandwidth to the local graphics memory. Typical GPUs with any amount of local graphics memory need to read command streams and scene descriptions from system memory. A GPU subsystem with a small or absent local graphics memory also needs to communicate with system memory in order to access and update pixel data, including pixels representing images which the GPU is constructing. This communication occurs across a graphics bus, or the bus that connects the graphics subsystem to the CPU and system memory.


In one example, per-pixel Z-depth data is read across the system bus and compared with a computed value for each pixel to be rendered. For all pixels which have a computed Z value less than the Z value read from system memory, the computed Z value and the computed pixel color value are written to system memory. In another example, pixel colors are read from system memory and blended with computed pixel colors to produce translucency effects before being written to system memory. Higher resolution images (images with a greater number of pixels) require more system memory bandwidth to render. Images representing larger numbers of 3D objects require more system memory bandwidth to render. The low data transfer bandwidth of the graphics bus acts as a bottleneck on overall graphics rendering performance.
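
To make this traffic concrete, the following is a minimal C sketch (not taken from the patent; the frame buffer layout and names are illustrative assumptions) of the per-pixel Z test just described, where every candidate pixel costs a read across the bus and every passing pixel costs a write back:

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative frame buffer entry; layout and types are assumptions. */
typedef struct {
    float    z;      /* stored Z-depth               */
    uint32_t color;  /* stored color, 8 bits/channel */
} Pixel;

/* Per-pixel Z test: each iteration reads fb[i].z across the bus, and
 * each passing pixel writes a Z value and a color value back. */
void z_test_and_write(Pixel *fb, size_t n,
                      const float *computed_z,
                      const uint32_t *computed_color)
{
    for (size_t i = 0; i < n; i++) {
        if (computed_z[i] < fb[i].z) {    /* read from system memory */
            fb[i].z     = computed_z[i];  /* write back              */
            fb[i].color = computed_color[i];
        }
    }
}
```

Translucency blending behaves the same way: a read of the existing color, a blend, and a write, so bus traffic scales with both image resolution and scene complexity.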


Thus, what is required is a solution capable of reducing the limitations imposed by the limited data transfer bandwidth of a graphics bus of a computer system. What is required is a solution that ameliorates the bottleneck imposed by the much smaller data transfer bandwidth of the graphics bus in comparison to the data transfer bandwidth of the GPU to local graphics memory. The present invention provides a novel solution to the above requirement.


SUMMARY OF THE INVENTION

Embodiments of the present invention ameliorate the bottleneck imposed by the much smaller data transfer bandwidth of the graphics bus in comparison to the data transfer bandwidth of the GPU to local graphics memory.


In one embodiment, the present invention is implemented as a system for cooperative graphics processing across a graphics bus in a computer system. The system includes a bridge coupled to a system memory via a system memory bus. The bridge is also coupled to a GPU (graphics processor unit) via the graphics bus. The bridge includes a fragment processor for implementing cooperative graphics processing with the GPU coupled to the graphics bus. The fragment processor is configured to implement a plurality of raster operations on graphics data stored in the system memory. The graphics bus interconnect coupling the bridge to the GPU can be an AGP-based interconnect or a PCI Express-based interconnect. The GPU can be an add-in card-based GPU or can be a discrete integrated circuit device mounted (e.g., surface mounted, etc.) on the same printed circuit board (e.g., motherboard) as the bridge.


In one embodiment, the graphics data stored in the system memory comprises a frame buffer used by both the fragment processor and the GPU. One mode of cooperative graphics processing involves the fragment processor implementing frame buffer blending on the graphics data in the system memory (e.g., the frame buffer). In one embodiment, the fragment processor is configured to implement multi-sample expansion on graphics data received from the GPU and store the resulting expanded data in the system memory frame buffer. In one embodiment, the fragment processor is configured to evaluate a Z plane equation coverage value for a plurality of pixels (e.g., per polygon) stored in the system memory, wherein the Z plane equation coverage value is received from the GPU via the graphics bus.


In this manner, embodiments of the present invention implement a much more efficient use of the limited data transfer bandwidth of the graphics bus interconnect, and thus dramatically improve overall graphics rendering performance in comparison to the prior art. Furthermore, the benefits provided by the embodiments of the present invention are even more evident in those architectures which primarily utilize system memory for frame buffer graphics data storage.





BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements.



FIG. 1 shows a computer system in accordance with one embodiment of the present invention.



FIG. 2 shows a diagram depicting fragment operation processing as implemented by a computer system in accordance with one embodiment of the present invention.



FIG. 3 shows a diagram depicting fragment processing operations executed by the fragment processor and the rendering operations executed by the GPU within a cooperative graphics rendering process in accordance with one embodiment of the present invention.



FIG. 4 shows a diagram depicting information that is transferred from the GPU to the fragment processor and to the frame buffer in accordance with one embodiment of the present invention.



FIG. 5 shows a flowchart of the steps of a cooperative graphics rendering process in accordance with one embodiment of the present invention.





DETAILED DESCRIPTION OF THE INVENTION

Reference will now be made in detail to the preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings. While the invention will be described in conjunction with the preferred embodiments, it will be understood that they are not intended to limit the invention to these embodiments. On the contrary, the invention is intended to cover alternatives, modifications and equivalents, which may be included within the spirit and scope of the invention as defined by the appended claims. Furthermore, in the following detailed description of embodiments of the present invention, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be recognized by one of ordinary skill in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the embodiments of the present invention.


Notation and Nomenclature:


Some portions of the detailed descriptions, which follow, are presented in terms of procedures, steps, logic blocks, processing, and other symbolic representations of operations on data bits within a computer memory. These descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. A procedure, computer executed step, logic block, process, etc., is here, and generally, conceived to be a self-consistent sequence of steps or instructions leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated in a computer system. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.


It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussions, it is appreciated that throughout the present invention, discussions utilizing terms such as “processing” or “accessing” or “executing” or “storing” or “rendering” or the like, refer to the action and processes of a computer system (e.g., computer system 100 of FIG. 1), or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.


Computer System Platform:



FIG. 1 shows a computer system 100 in accordance with one embodiment of the present invention. Computer system 100 depicts the components of a basic computer system in accordance with embodiments of the present invention providing the execution platform for certain hardware-based and software-based functionality. In general, computer system 100 comprises at least one CPU 101, a system memory 115, and at least one graphics processor unit (GPU) 110. The CPU 101 can be coupled to the system memory 115 via the bridge component 105 or can be directly coupled to the system memory 115 via a memory controller internal to the CPU 101. The GPU 110 is coupled to a display 112. System 100 can be implemented as, for example, a desktop computer system or server computer system, having a powerful general-purpose CPU 101 coupled to a dedicated graphics rendering GPU 110. In such an embodiment, components would be included that are designed to add peripheral buses, specialized graphics memory, I/O devices (e.g., disk drive 112), and the like. The bridge component 105 also supports expansion buses coupling the disk drive 112.


It should be appreciated that although the GPU 110 is depicted in FIG. 1 as a discrete component, the GPU 110 can be implemented as a discrete graphics card designed to couple to the computer system via a graphics bus connection (e.g., AGP, PCI Express, etc.), as a discrete integrated circuit die (e.g., mounted directly on the motherboard), or as an integrated GPU included within the integrated circuit die of a computer system chipset component (e.g., integrated within the bridge chip 105). Additionally, a local graphics memory 111 can optionally be included for the GPU 110 for high bandwidth graphics data storage. It also should be noted that although the bridge component 105 is depicted as a discrete component, the bridge component 105 can be implemented as an integrated controller within a different component (e.g., within the CPU 101, GPU 110, etc.) of the computer system 100. Similarly, system 100 can be implemented as a set-top video game console device such as, for example, the Xbox®, available from Microsoft Corporation of Redmond, Wash.


EMBODIMENTS OF THE PRESENT INVENTION

Referring still to FIG. 1, embodiments of the present invention reduce constraints imposed by the limited data transfer bandwidth of a graphics bus (e.g., graphics bus 120) of a computer system. Embodiments of the present invention ameliorate the bottleneck imposed by the much smaller data transfer bandwidth of the graphics bus 120 in comparison to the data transfer bandwidth of the system memory bus 121 to system memory 115. This is accomplished in part by the bridge 105 implementing cooperative graphics processing in conjunction with the GPU 110 to reduce the amount of data that must be transferred across the graphics bus 120 during graphics rendering operations. As shown in FIG. 1, the bridge component 105 is a core logic chipset component that provides core logic functions for the computer system 100.


The cooperative graphics processing reduces the total amount of data that must be transferred across the bandwidth constrained graphics bus 120. By performing certain graphics rendering operations within the bridge component 105, the comparatively high bandwidth system memory bus 121 can be used to access the graphics data 116, reducing the amount of data access latency experienced by these rendering operations. The resulting reduction in access latency, and increase in transfer bandwidth, allows the overall graphics rendering operations to proceed more efficiently, thereby increasing the performance of bandwidth-demanding 3D rendering applications. This cooperative graphics rendering process is described in further detail in FIG. 2 below.



FIG. 2 shows a diagram depicting fragment operation processing in accordance with one embodiment of the present invention. As depicted in FIG. 2, the GPU 110 is coupled to the bridge 105 via the low bandwidth graphics bus 120. The bridge 105 is further coupled to the system memory 115 via the high bandwidth system memory bus 121. FIG. 2 depicts a configuration whereby the system memory 115 is used as frame buffer memory (e.g., graphics data 116) for the computer system (as opposed to a local graphics memory). The bridge 105 is configured to access the graphics data 116 via the high bandwidth system memory bus 121. The bridge 105 is also coupled to the GPU 110 via the graphics bus 120.


The bridge 105 includes a fragment processor 201 for implementing cooperative graphics processing with the GPU 110. The fragment processor 201 is configured to implement a plurality of raster operations on graphics data stored in the system memory. These raster operations executed by the fragment processor 201 suffer a much lower degree of latency in comparison to raster operations performed by the GPU 110. This is due to both the higher data transfer bandwidth of the system memory bus 121 and the shorter communications path (e.g., lower data access latency) between the fragment processor 201 and the graphics data 116 within the system memory 115.


Performing a portion of the raster operations, or all of the raster operations, required for graphics rendering in the fragment processor 201 reduces the amount of graphics data accesses (e.g., both reads and writes) that must be performed by the GPU 110. For example, by implementing fragment operations within the fragment processor 201, accesses to fragment data (e.g., graphics data 116) required for iterating fragment colors across multiple pixels can be performed across the high bandwidth system memory bus 121. For example, fragment data can be accessed, iterated across multiple pixels, and the resulting pixel color values can be stored back into the system memory 115 all across the high bandwidth system memory bus 121. The interpolation and iteration functions can be executed by the fragment processor 201 in conjunction with an internal RAM 215. Such fragment processing operations comprise a significant portion of the rendering accesses to the graphics data 116. Implementing them using a fragment processor within the bridge will effectively remove such traffic from the low bandwidth graphics bus 120.
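
As a rough sketch of the iteration work that moves onto the system memory bus, a fragment color might be interpolated and iterated across a span of pixels as follows (the span representation and names are illustrative assumptions, not the fragment processor's actual datapath):

```c
#include <stddef.h>
#include <stdint.h>

/* Interpolate one 8-bit channel between two endpoint values. */
static uint8_t lerp_u8(uint8_t a, uint8_t b, float t)
{
    return (uint8_t)((float)a + t * ((float)b - (float)a));
}

/* Iterate a fragment color across a horizontal span of pixels,
 * interpolating from color c0 at x0 to color c1 at x1. Every
 * read-modify-write here stays on the high bandwidth memory bus. */
void iterate_span(uint32_t *fb_row, size_t x0, size_t x1,
                  uint32_t c0, uint32_t c1)
{
    for (size_t x = x0; x < x1; x++) {
        float t = (float)(x - x0) / (float)(x1 - x0);
        uint8_t r = lerp_u8((uint8_t)(c0 >> 16), (uint8_t)(c1 >> 16), t);
        uint8_t g = lerp_u8((uint8_t)(c0 >> 8),  (uint8_t)(c1 >> 8),  t);
        uint8_t b = lerp_u8((uint8_t)c0,         (uint8_t)c1,         t);
        fb_row[x] = ((uint32_t)r << 16) | ((uint32_t)g << 8) | b;
    }
}
```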


In one embodiment, the fragment processor 201 substantially replaces the functionality of the fragment processor 205 within the GPU 110. In this manner, the incorporation of the fragment processor 201 renders the fragment processor 205 within the GPU 110 optional. For example, in one embodiment, the GPU 110 is an off-the-shelf card-based GPU that is detachably connected to the computer system 100 via a graphics bus interconnect slot (e.g., AGP slot, PCI Express slot, etc.). Such an off-the-shelf card-based GPU would typically incorporate its own one or more fragment processors for use in those systems having a conventional prior art type bridge. The graphics bus interconnect can be an AGP-based interconnect or a PCI Express-based interconnect. The GPU 110 can be an add-in card-based GPU or can be a discrete integrated circuit device mounted (e.g., surface mounted, etc.) on the same printed circuit board (e.g., motherboard) as the bridge 105. When connected to the bridge 105 of the computer system 100 embodiment, the included fragment processor(s) can be disabled (e.g., by the graphics driver).


Alternatively, the GPU 110 can be configured specifically for use with a bridge component having an internal fragment processor (e.g., fragment processor 201). Such a configuration provides advantages in that the GPU integrated circuit die area that would otherwise be dedicated to an internal fragment processor can be saved (e.g., thereby reducing GPU costs) or used for other purposes. In this manner, the inclusion of a fragment processor 205 within the GPU 110 is optional.


In one embodiment, the internal fragment processor 205 within the GPU 110 can be used by the graphics driver in conjunction with the fragment processor 201 within the bridge 105 to implement concurrent raster operations within both components. In such an embodiment, the graphics driver would allocate some portion of fragment processing to the fragment processor 201 and the remainder to the fragment processor 205. The graphics driver would balance the processing workloads between the bridge component 105 and the GPU 110 to best utilize the high bandwidth low latency connection of the bridge component 105 to the system memory 115. For example, to best utilize the high bandwidth system memory bus 121, it would be preferable to implement as large a share as possible of the fragment processing workloads within the bridge component 105 (e.g., fragment processor 201). This would ensure as large a percentage of the fragment operations as is practical are implemented using the low latency high bandwidth system memory bus 121. The remaining fragment processing workloads would be allocated to the fragment processor 205 of the GPU 110.
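
A driver-side split of the kind described might look like the following sketch; the capacity model and names are illustrative assumptions rather than an actual driver interface:

```c
#include <stddef.h>

/* Hypothetical policy: allocate as large a share of the fragment
 * workload as possible to the bridge's fragment processor, and send
 * only the remainder to the GPU's own fragment processor. */
typedef struct {
    size_t bridge_capacity;  /* fragments per frame the bridge can absorb */
} DriverPolicy;

void split_fragment_work(const DriverPolicy *policy, size_t total,
                         size_t *to_bridge, size_t *to_gpu)
{
    *to_bridge = (total < policy->bridge_capacity)
                     ? total
                     : policy->bridge_capacity;  /* fill the bridge first */
    *to_gpu = total - *to_bridge;                /* remainder to the GPU  */
}
```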


Implementing fragment processing operations within the bridge component 105 provides an additional benefit in that the amount of integrated circuit die area within the GPU 110 that must be dedicated to “bookkeeping” can be reduced. Bookkeeping logic is used by conventional GPUs to keep track of accesses to the graphics data 116 that are “in-flight”. Such in-flight data accesses are used to hide the latency the GPU 110 experiences when reading or writing to the graphics data 116 across the low bandwidth graphics bus 120. In general, in-flight data accesses refer to a queued number of data reads or data writes that are issued to the graphics data 116 that have been initiated, but whose results have yet to be received.


Bookkeeping logic is used to keep track of such in-flight accesses and, for example, to make sure storage is on hand when read results from the graphics data 116 arrive and to ensure the graphics data 116 is not corrupted when multiple writes have been issued. The more complex the bookkeeping logic, the more in-flight data accesses the GPU can maintain, and thus, the more the effects of the high latency can be hidden. By offloading fragment processing operations to the bridge 105 (e.g., fragment processor 201), the demands placed on any bookkeeping logic within the GPU 110 are reduced.
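
The bookkeeping can be pictured as a small table of outstanding requests, as in the simplified sketch below (queue depth and names are assumed for illustration; real GPUs track considerably more state):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_IN_FLIGHT 16  /* assumed depth; deeper tables hide more latency */

/* One issued-but-not-completed frame buffer read. */
typedef struct {
    uint64_t address;
    bool     valid;
} InFlightRead;

typedef struct {
    InFlightRead slots[MAX_IN_FLIGHT];
    size_t       count;
} Bookkeeping;

/* Issue a read only if a slot is free; a full table means no further
 * latency can be hidden and the pipeline must stall. */
bool try_issue_read(Bookkeeping *bk, uint64_t address)
{
    if (bk->count == MAX_IN_FLIGHT)
        return false;
    for (size_t i = 0; i < MAX_IN_FLIGHT; i++) {
        if (!bk->slots[i].valid) {
            bk->slots[i].address = address;
            bk->slots[i].valid   = true;
            bk->count++;
            return true;
        }
    }
    return false;
}

/* Retire a read when its result arrives; the slot guarantees storage
 * is on hand to receive the returned data. */
void complete_read(Bookkeeping *bk, uint64_t address)
{
    for (size_t i = 0; i < MAX_IN_FLIGHT; i++) {
        if (bk->slots[i].valid && bk->slots[i].address == address) {
            bk->slots[i].valid = false;
            bk->count--;
            return;
        }
    }
}
```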


In this manner, embodiments of the present invention implement a much more efficient use of the limited data transfer bandwidth of the graphics bus interconnect, and thus greatly improve overall graphics rendering performance in comparison to prior art architectures. Furthermore, the benefits provided by the embodiments of the present invention are even more evident in those architectures which primarily utilize system memory for frame buffer graphics data storage.



FIG. 3 shows a diagram depicting fragment processing operations executed by the fragment processor 201 and the rendering operations executed by the GPU 110 within a cooperative graphics rendering process in accordance with one embodiment of the present invention. As depicted in FIG. 3, the fragment processor 201 implements its accesses to the frame buffer 310 via the high bandwidth system memory bus 121. The GPU 110 implements its accesses to the frame buffer 310 via the low bandwidth graphics bus 120.


The FIG. 3 embodiment shows the functions included within the fragment processor 201. As shown in FIG. 3, the fragment processor 201 includes raster operations such as frame buffer blending 321 and Z buffer blending 322, as well as other operations such as compression 323 and multi-sample expansion 324. For example, the frame buffer blending module 321 blends fragment colors into the pixel data of the frame buffer 310. Generally, this involves interpolating existing pixel colors with fragment colors and iterating those resulting values across the multiple pixels of a polygon. Such frame buffer blending involves a large number of reads and writes to the frame buffer 310 and thus directly benefits from the high bandwidth and low latency afforded by the system memory bus 121.
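
A conventional source-over alpha blend conveys the read-modify-write character of this module; the blend mode and 8-bit ARGB format below are assumptions for illustration, not the module's actual operation set:

```c
#include <stdint.h>

/* Blend one 8-bit channel: src weighted by alpha, dst by (255 - alpha). */
static uint8_t blend_channel(uint8_t src, uint8_t dst, uint8_t alpha)
{
    return (uint8_t)(((int)src * alpha + (int)dst * (255 - alpha)) / 255);
}

/* Blend a fragment color over an existing frame buffer pixel (ARGB).
 * Each blended pixel costs a frame buffer read and a write, which is
 * why performing this over the system memory bus pays off. */
uint32_t blend_pixel(uint32_t src, uint32_t dst)
{
    uint8_t a = (uint8_t)(src >> 24);
    uint8_t r = blend_channel((uint8_t)(src >> 16), (uint8_t)(dst >> 16), a);
    uint8_t g = blend_channel((uint8_t)(src >> 8),  (uint8_t)(dst >> 8),  a);
    uint8_t b = blend_channel((uint8_t)src,         (uint8_t)dst,         a);
    return ((uint32_t)a << 24) | ((uint32_t)r << 16) |
           ((uint32_t)g << 8)  | (uint32_t)b;
}
```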


The Z buffer blending module 322 evaluates depth information per fragment of a polygon and iterates the resulting depth information across the multiple pixels. As with color blending, Z buffer blending involves a large number of reads and writes to the frame buffer 310 and similarly benefits from the high bandwidth and low latency of the system memory bus 121.


In one embodiment, the fragment processor 201 is configured to use a Z plane equation coverage value to iterate depth information across multiple pixels of a polygon. In such an embodiment, the depth and orientation of a polygon in 3-D space is defined using a Z plane equation. The Z plane equation is used by the fragment processor 201 to determine depth information for each constituent pixel covered by the polygon, and is a much more compact method of describing depth information for a polygon than by using a list of Z values for each fragment of the polygon. Additional description of Z plane raster operations can be found in commonly assigned U.S. Patent Application “Z PLANE ROP” by Steve Molnar, filed on Jun. 28, 2004, Ser. No. 10/878,460, which is incorporated herein by reference in its entirety.
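
The compactness is easy to see with a plane of the form z(x, y) = A·x + B·y + C: three coefficients stand in for a per-fragment list of depths. Below is a minimal sketch of the iteration, under assumed coverage-mask and buffer layouts:

```c
/* Illustrative Z plane iteration: z(x, y) = a*x + b*y + c. The
 * coverage mask and buffer layout are assumptions for illustration. */
typedef struct {
    float a, b, c;  /* plane coefficients received from the GPU */
} ZPlane;

void iterate_z_plane(const ZPlane *p,
                     int x0, int y0, int x1, int y1, /* bounding box */
                     const unsigned char *coverage,  /* 1 = covered  */
                     float *z_out, int pitch)
{
    for (int y = y0; y < y1; y++) {
        for (int x = x0; x < x1; x++) {
            if (coverage[y * pitch + x]) {
                z_out[y * pitch + x] =
                    p->a * (float)x + p->b * (float)y + p->c;
            }
        }
    }
}
```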


The compression module 323 compresses and decompresses per-pixel data for storage in and retrieval from the frame buffer 310. In some rendering operations a given pixel can have multiple value samples, with a number of bits per sample. For example, in a case where each pixel of a display includes 8 sample points, the compression module 323 would compress/decompress the data describing those sample points for easier access to and from the frame buffer 310.
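
One simple scheme of this general kind (an assumption for illustration, not the patent's actual method) stores a single value for a pixel whenever all of its samples agree, which is common away from polygon edges:

```c
#include <stdbool.h>
#include <stdint.h>

#define SAMPLES_PER_PIXEL 8  /* matches the 8-sample example above */

/* If every sample of a pixel carries the same value, one stored
 * 32-bit value can stand in for all eight. */
bool compress_pixel(const uint32_t samples[SAMPLES_PER_PIXEL],
                    uint32_t *stored_value)
{
    for (int i = 1; i < SAMPLES_PER_PIXEL; i++) {
        if (samples[i] != samples[0])
            return false;        /* samples differ: store all eight */
    }
    *stored_value = samples[0];  /* 8:1 reduction for this pixel    */
    return true;
}
```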


The multi sample expansion module 324 performs multi sample expansion operations on the fragment data. For example, depending upon the application (e.g., anti-aliasing), the multi sample expansion module 324 can expand sample information from one sample point per pixel into eight sample points per pixel. It is thus desirable to perform the sample expansion in the fragment processor 201 for storage into the frame buffer 310, as opposed to in the GPU 110.
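
Expansion itself amounts to replicating the single incoming sample into the per-pixel sample slots before the frame buffer write; the fixed 8× factor and names below are illustrative assumptions:

```c
#include <stdint.h>

/* Illustrative 8x multi-sample expansion: one 32-bit sample arriving
 * over the graphics bus becomes eight 32-bit samples written to the
 * frame buffer over the system memory bus. */
void expand_samples_8x(uint32_t sample, uint32_t out[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = sample;
}
```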


Referring still to FIG. 3, in the present embodiment, texture operations and lighting operations are still performed by the GPU 110 (e.g., module 331 and module 332). The texture operations and lighting operations can proceed much more quickly since the relatively limited bandwidth of the graphics bus 120 is free from the traffic that has been moved to the fragment processor 201.



FIG. 4 shows a diagram depicting information that is transferred from the GPU 110 to the fragment processor 201 and to the frame buffer 310 in accordance with one embodiment of the present invention. FIG. 4 shows a case illustrating the benefits of using the fragment processor 201 to perform Z value iteration and multi sample color expansion in accordance with one embodiment of the present invention. As shown in FIG. 4, the GPU 110 can be configured to transfer pre-expanded color values and Z plane equation coverage values to the fragment processor 201 for iteration and expansion. This results in a very compact transfer of data across the low bandwidth graphics bus 120 to the fragment processor 201.


For example, as opposed to sending individual pixels and their values, the GPU 110 sends fragments to the fragment processor 201. These fragments are pre-expanded. The fragments undergo multi sample expansion within the fragment processor 201. Multi sample expansion is used in applications involving anti-aliasing and the like. A typical multi sample expansion would take one sample of one fragment and expand it into four samples (e.g., 4× anti-aliasing) or eight samples (e.g., 8× anti-aliasing). This much larger quantity of data is then transferred to the frame buffer 310 across the high bandwidth system memory bus 121 as opposed to the low bandwidth graphics bus 120. For example, in a typical anti-aliasing application, a given pixel can be expanded from one sample comprising 32 bits into eight samples comprising 32 bits each.
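
The arithmetic behind the saving is straightforward; the short program below simply restates the 8× example from the text, using the per-sample sizes given above:

```c
#include <stdio.h>

int main(void)
{
    const int bits_per_sample   = 32;  /* per the example above */
    const int samples_per_pixel = 8;   /* 8x anti-aliasing      */

    int graphics_bus_bits = bits_per_sample;                     /* pre-expanded */
    int memory_bus_bits   = bits_per_sample * samples_per_pixel; /* expanded     */

    printf("graphics bus: %d bits/pixel\n", graphics_bus_bits);
    printf("memory bus:   %d bits/pixel\n", memory_bus_bits);
    printf("graphics bus traffic reduced %dx per pixel\n",
           memory_bus_bits / graphics_bus_bits);
    return 0;
}
```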


Similarly, the Z plane equation can be expanded into 4×, 8×, etc. samples per pixel by the fragment processor 201 from the plane equation for the original polygon. The resulting expanded Z data is then transferred from the fragment processor 201 across the high bandwidth system memory bus 121 to the frame buffer 310.



FIG. 5 shows a flowchart of the steps of a cooperative graphics rendering process 500 in accordance with one embodiment of the present invention. As depicted in FIG. 5, process 500 shows the basic steps involved in a cooperative graphics rendering process as implemented by a fragment processor (e.g., fragment processor 201) of a bridge (e.g., bridge 105) and the GPU (e.g., GPU 110) of a computer system (e.g., computer system 100 of FIG. 2).


Process 500 begins in step 501, where the fragment processor 201 receives fragment pre-expanded color values from the GPU 110 via the graphics bus 120. In step 502, the fragment processor 201 performs a multi sample color value expansion for a plurality of pixels. In step 503, the fragment processor 201 receives a Z plane equation coverage value from the GPU 110. In step 504, the fragment processor 201 performs a Z plane iteration process to generate iterated Z values for a plurality of pixels. In step 505, as described above, the fragment processor 201 stores the resulting expanded color values and the resulting expanded Z values into the frame buffer 310 via the high bandwidth system memory bus 121. Subsequently, in step 506, the GPU 110 accesses the expanded color values and expanded Z values to render the image.
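
Read end to end, process 500 can be sketched as follows; every type and function here is a hypothetical stand-in for the hardware steps described, not an actual API:

```c
#include <stdio.h>

typedef struct { unsigned color; } Fragment;  /* pre-expanded color (step 501) */
typedef struct { float a, b, c; } ZPlaneEq;   /* Z plane equation   (step 503) */
typedef struct { unsigned samples[8]; float z[8]; } FbPixel;

/* Step 502: one incoming sample becomes eight per-pixel samples. */
static void expand_color_samples(const Fragment *f, FbPixel *px)
{
    for (int i = 0; i < 8; i++)
        px->samples[i] = f->color;
}

/* Step 504: evaluate the plane at assumed per-sample offsets
 * (simplified to a 1-D walk for illustration). */
static void iterate_z_values(const ZPlaneEq *zp, FbPixel *px)
{
    for (int i = 0; i < 8; i++)
        px->z[i] = zp->a * (float)i + zp->c;
}

int main(void)
{
    Fragment f  = { 0x00FF00u };         /* received over the graphics bus */
    ZPlaneEq zp = { 0.5f, 0.0f, 1.0f };  /* received over the graphics bus */
    FbPixel  px;

    expand_color_samples(&f, &px);  /* step 502 */
    iterate_z_values(&zp, &px);     /* step 504 */

    /* Step 505: px would now be stored over the system memory bus;
     * step 506: the GPU reads the expanded values back to render. */
    printf("sample0: color=%06X z=%.2f\n", px.samples[0], px.z[0]);
    return 0;
}
```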


The foregoing descriptions of specific embodiments of the present invention have been presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the invention to the precise forms disclosed, and many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, to thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims appended hereto and their equivalents.

Claims
  • 1. A system for cooperative graphics processing across a graphics bus comprising: a computer system comprising: a system memory; a graphics bus; a system memory bus, a graphics processor coupled to the graphics bus; and a bridge comprising a fragment processor, the bridge being coupled to the system memory via the system memory bus, and to the graphics processor via the graphics bus, wherein, the graphics processor and fragment processor are configured to perform a plurality of fragment processing operations cooperatively, wherein a graphics driver executing on the computer system balances the plurality of fragment processing operations between the fragment processor and the graphics processor by allocating at least a portion of the plurality of fragment processing to the fragment processor to be performed and allocating a remaining portion of the plurality of fragment processing operations to the graphics processor to be performed, further wherein, the system memory bus has a greater bandwidth than the graphics bus.
  • 2. The system of claim 1, wherein the plurality of fragment processing operations comprises a plurality of raster operations on graphics data stored in the system memory.
  • 3. The system of claim 1, wherein the graphics data stored in the system memory comprises a frame buffer used by the fragment processor and the graphics processor.
  • 4. The system of claim 1, wherein the plurality of fragment processing operations comprises frame buffer blending on the graphics data in the system memory.
  • 5. The system of claim 1, wherein the plurality of fragment processing operations comprises multi-sample expansion on graphics data received from the graphics processor and store resulting expanded data in the system memory.
  • 6. The system of claim 1, wherein the plurality of fragment processing operations comprises evaluating a Z-plane equation coverage value for a plurality of pixels stored in the system memory, wherein the Z-plane equation coverage value is received from the graphics processor.
  • 7. The system of claim 1, wherein the bridge is a North bridge chipset component of the computer system.
  • 8. The system of claim 1, wherein the graphics processor is configured to use a portion of the system memory for frame buffer memory.
  • 9. The system of claim 1, wherein the graphics processor is detachably coupled to the graphics bus by a connector.
  • 10. The system of claim 1, wherein the graphics bus is an AGP graphics bus.
  • 11. The system of claim 1, wherein the graphics bus is a PCI Express graphics bus.
  • 12. The system of claim 1, wherein the graphics driver balances the plurality of fragment processing operations between the fragment processor and the graphics processor by allocating as large a share as possible of the plurality of fragment processing operations to the fragment processor.
  • 13. The system of claim 1, wherein the system memory is used as frame buffer memory for the computer system.
  • 14. The system of claim 1, wherein an amount of data access latency experienced by performing fragment processing operations in the fragment processor is reduced relative to an amount of data access latency experienced by performing the fragment processing operations in the graphics processor.
  • 15. A bridge for implementing cooperative graphics processing with a graphics processor coupled to the bridge across a graphics bus comprising: a computer system comprising: a system memory bus interface comprising a system memory bus; a graphics bus interface comprising a graphics bus; and a fragment processor disposed in the bridge coupled to the system memory bus, the fragment processor being configured to perform a plurality of fragment processing operations cooperatively with a graphics processor coupled to the graphics bus, wherein, a graphics driver executing on the computer system balances the plurality of fragment processing operations between the fragment processor and the graphics processor by allocating at least a portion of the plurality of fragment processing operations to the fragment processor to be performed and allocating a remaining portion of the plurality of fragment processing operations to the graphics processor to be performed, further wherein, the system memory bus has a greater bandwidth than the graphics bus.
  • 16. The bridge of claim 15, wherein the plurality of fragment processing operations comprises a plurality of raster operations on graphics data stored in the system memory.
  • 17. The system of claim 15, wherein the bridge is configured to use a frame buffer in the system memory for the processing of graphics data.
  • 18. The system of claim 15, wherein the plurality of fragment processing operations comprises frame buffer blending on the graphics data in the system memory.
  • 19. The system of claim 15, wherein the plurality of fragment processing operations comprises multi-sample expansion on graphics data received from the graphics processor and store resulting expanded data in the system memory.
  • 20. The system of claim 15 wherein the plurality of fragment processing operations comprises evaluating a Z plane equation coverage value for a plurality of pixels stored in the system memory, wherein the Z plane equation coverage value is received from the graphics processor.
  • 21. The system of claim 15, wherein the graphics processor is detachably coupled to the graphics bus by a connector.
  • 22. The system of claim 15, wherein the graphics driver balances the plurality of fragment processing operations between the fragment processor and the graphics processor by allocating as large a share as possible of the plurality of fragment processing operations to the fragment processor.
  • 23. The system of claim 15, wherein the system memory bus is coupled to a system memory, and wherein the system memory is used as frame buffer memory for the computer system.
  • 24. In a bridge of a computer system, a method for cooperatively implementing fragment processing operations with a graphics processor across a graphics bus in a computer system, comprising: in a computer system, receiving at a fragment processor pre-expanded color values from the graphics processor via the graphics bus, the fragment processor, graphics processor and graphics bus being disposed in the computer system; performing a multi-sample expansion on the color values resulting in expanded color value graphics data, the multi-sample expansion comprising at least a portion of the fragment processing to be performed cooperatively by the fragment processor and the graphics processor in the computer system across the graphics bus; storing the expanded color value graphics data into a frame buffer in a system memory through a system memory bus; and rendering an image to a display, the rendering performed by the graphics processor accessing the expanded color value graphics data in the frame buffer, wherein, the multi-sample expansion is balanced by a graphics driver executing on the computer system by allocating the portion of the fragment processing to the fragment processor to be performed and allocating a remaining portion of the fragment processing to be performed cooperatively to the graphics processor to be performed, further wherein, the system memory bus has a greater bandwidth than the graphics bus.
  • 25. The method of claim 24, further comprising: receiving Z plane equation coverage values from the graphics processor via the graphics bus; performing a Z plane iteration process to generate iterated Z values for a plurality of pixels; storing the iterated Z values into the frame buffer; and rendering the image to the display, the rendering performed by the graphics processor accessing the iterated Z values in the frame buffer.
  • 26. The method of claim 24, wherein the bridge is a North bridge chipset component of the computer system.
  • 27. The method of claim 26, wherein the graphics processor is detachably coupled to the graphics bus by a connector.
  • 28. The method of claim 27, wherein the graphics bus is an AGP graphics bus.
  • 29. The method of claim 27, wherein the graphics bus is a PCI Express graphics bus.
  • 30. The method of claim 24, wherein the graphics driver balances the plurality of fragment processing operations between the fragment processor and the graphics processor by allocating as large a share as possible of the plurality of fragment processing operations to the fragment processor.
  • 31. The method of claim 24, wherein the system memory is used as frame buffer memory for the computer system.
US Referenced Citations (268)
Number Name Date Kind
3091657 Stuessel May 1963 A
3614740 Delagi et al. Oct 1971 A
3940740 Coontz Feb 1976 A
3987291 Gooding et al. Oct 1976 A
4101960 Stokes et al. Jul 1978 A
4541046 Nagashima et al. Sep 1985 A
4566005 Apperley et al. Jan 1986 A
4748585 Chiarulli et al. May 1988 A
4885703 Deering Dec 1989 A
4897717 Hamilton et al. Jan 1990 A
4951220 Ramacher et al. Aug 1990 A
4958303 Assarpour et al. Sep 1990 A
4965716 Sweeney Oct 1990 A
4965751 Thayer et al. Oct 1990 A
4985848 Pfeiffer et al. Jan 1991 A
4985988 Littlebury Jan 1991 A
5036473 Butts et al. Jul 1991 A
5040109 Bowhill et al. Aug 1991 A
5047975 Patti et al. Sep 1991 A
5175828 Hall et al. Dec 1992 A
5179530 Genusov et al. Jan 1993 A
5197130 Chen et al. Mar 1993 A
5210834 Zurawski et al. May 1993 A
5263136 DeAguiar et al. Nov 1993 A
5276893 Savaria Jan 1994 A
5327369 Ashkenazi Jul 1994 A
5357623 Megory-Cohen Oct 1994 A
5375223 Meyers et al. Dec 1994 A
5388206 Poulton et al. Feb 1995 A
5388245 Wong Feb 1995 A
5392437 Matter et al. Feb 1995 A
5408606 Eckart Apr 1995 A
5418973 Ellis et al. May 1995 A
5430841 Tannenbaum et al. Jul 1995 A
5430884 Beard et al. Jul 1995 A
5432905 Hsieh et al. Jul 1995 A
5448496 Butts et al. Sep 1995 A
5498975 Cliff et al. Mar 1996 A
5513144 O'Toole Apr 1996 A
5513354 Dwork et al. Apr 1996 A
5517666 Ohtani et al. May 1996 A
5522080 Harney May 1996 A
5530457 Helgeson Jun 1996 A
5560030 Guttag et al. Sep 1996 A
5561808 Kuma et al. Oct 1996 A
5574847 Eckart et al. Nov 1996 A
5574944 Stager Nov 1996 A
5578976 Yao Nov 1996 A
5627988 Oldfield May 1997 A
5634107 Yumoto et al. May 1997 A
5638946 Zavracky Jun 1997 A
5644753 Ebrahim et al. Jul 1997 A
5649173 Lentz Jul 1997 A
5666169 Ohki et al. Sep 1997 A
5682552 Kuboki et al. Oct 1997 A
5682554 Harrell Oct 1997 A
5706478 Dye Jan 1998 A
5754191 Mills et al. May 1998 A
5761476 Martell Jun 1998 A
5764243 Baldwin Jun 1998 A
5766979 Budnaitis Jun 1998 A
5784590 Cohen et al. Jul 1998 A
5784640 Asghar et al. Jul 1998 A
5796974 Goddard et al. Aug 1998 A
5802574 Atallah et al. Sep 1998 A
5809524 Singh et al. Sep 1998 A
5812147 Van Hook et al. Sep 1998 A
5835788 Blumer et al. Nov 1998 A
5848254 Hagersten Dec 1998 A
5909595 Rosenthal et al. Jun 1999 A
5913218 Carney et al. Jun 1999 A
5920352 Inoue Jul 1999 A
5925124 Hilgendorf et al. Jul 1999 A
5940090 Wilde Aug 1999 A
5940858 Green Aug 1999 A
5949410 Fung Sep 1999 A
5950012 Shiell et al. Sep 1999 A
5956252 Lau et al. Sep 1999 A
5978838 Mohamed et al. Nov 1999 A
5996996 Brunelle Dec 1999 A
5999199 Larson Dec 1999 A
5999990 Sharrit et al. Dec 1999 A
6009454 Dummermuth Dec 1999 A
6016474 Kim et al. Jan 2000 A
6041399 Terada et al. Mar 2000 A
6049672 Shiell et al. Apr 2000 A
6049870 Greaves Apr 2000 A
6065131 Andrews et al. May 2000 A
6067262 Irrinki et al. May 2000 A
6069540 Berenz et al. May 2000 A
6072686 Yarbrough Jun 2000 A
6073158 Nally et al. Jun 2000 A
6092094 Ireton Jul 2000 A
6094116 Tai et al. Jul 2000 A
6108766 Hahn et al. Aug 2000 A
6112019 Chamdani et al. Aug 2000 A
6131152 Ang et al. Oct 2000 A
6141740 Mahalingaiah et al. Oct 2000 A
6144392 Rogers Nov 2000 A
6150610 Sutton Nov 2000 A
6189068 Witt et al. Feb 2001 B1
6192073 Reader et al. Feb 2001 B1
6192458 Arimilli et al. Feb 2001 B1
6208361 Gossett Mar 2001 B1
6209078 Chiang et al. Mar 2001 B1
6219628 Kodosky et al. Apr 2001 B1
6222552 Haas et al. Apr 2001 B1
6230254 Senter et al. May 2001 B1
6239810 Van Hook et al. May 2001 B1
6247094 Kumar et al. Jun 2001 B1
6249288 Campbell Jun 2001 B1
6252610 Hussain Jun 2001 B1
6255849 Mohan Jul 2001 B1
6292886 Makineni et al. Sep 2001 B1
6301600 Petro et al. Oct 2001 B1
6307169 Sun et al. Oct 2001 B1
6314493 Luick Nov 2001 B1
6317819 Morton Nov 2001 B1
6351808 Joy et al. Feb 2002 B1
6363285 Wey Mar 2002 B1
6363295 Akram et al. Mar 2002 B1
6370617 Lu et al. Apr 2002 B1
6437789 Tidwell et al. Aug 2002 B1
6438664 McGrath et al. Aug 2002 B1
6476808 Kuo et al. Nov 2002 B1
6480927 Bauman Nov 2002 B1
6490654 Wickeraad et al. Dec 2002 B2
6496193 Surti et al. Dec 2002 B1
6496902 Faanes et al. Dec 2002 B1
6499090 Hill et al. Dec 2002 B1
6525737 Duluk, Jr. et al. Feb 2003 B1
6529201 Ault et al. Mar 2003 B1
6545683 Williams Apr 2003 B1
6597357 Thomas Jul 2003 B1
6603481 Kawai et al. Aug 2003 B1
6624818 Mantor et al. Sep 2003 B1
6631423 Brown et al. Oct 2003 B1
6631463 Floyd et al. Oct 2003 B1
6657635 Hutchins et al. Dec 2003 B1
6658447 Cota-Robles Dec 2003 B2
6674841 Johns et al. Jan 2004 B1
6690381 Hussain et al. Feb 2004 B1
6700588 MacInnis et al. Mar 2004 B1
6715035 Colglazier et al. Mar 2004 B1
6732242 Hill et al. May 2004 B2
6750870 Olarig Jun 2004 B2
6809732 Zatz et al. Oct 2004 B2
6812929 Lavelle et al. Nov 2004 B2
6825848 Fu et al. Nov 2004 B1
6839062 Aronson et al. Jan 2005 B2
6862027 Andrews et al. Mar 2005 B2
6891543 Wyatt May 2005 B2
6915385 Leasure et al. Jul 2005 B1
6944744 Ahmed et al. Sep 2005 B2
6952214 Naegle et al. Oct 2005 B2
6965982 Nemawarkar Nov 2005 B2
6975324 Valmiki et al. Dec 2005 B1
6976126 Clegg et al. Dec 2005 B2
6978149 Morelli et al. Dec 2005 B1
6978457 Johl et al. Dec 2005 B1
6981106 Bauman et al. Dec 2005 B1
6985151 Bastos et al. Jan 2006 B1
7015909 Morgan, III et al. Mar 2006 B1
7031330 Bianchini, Jr. Apr 2006 B1
7032097 Alexander et al. Apr 2006 B2
7035979 Azevedo et al. Apr 2006 B2
7148888 Huang Dec 2006 B2
7151544 Emberling Dec 2006 B2
7154500 Heng et al. Dec 2006 B2
7159212 Schenk et al. Jan 2007 B2
7185178 Barreh et al. Feb 2007 B1
7202872 Paltashev et al. Apr 2007 B2
7260677 Vartti et al. Aug 2007 B1
7305540 Trivedi et al. Dec 2007 B1
7321787 Kim Jan 2008 B2
7334110 Faanes et al. Feb 2008 B1
7369815 Kang et al. May 2008 B2
7373478 Yamazaki May 2008 B2
7406698 Richardson Jul 2008 B2
7412570 Moll et al. Aug 2008 B2
7486290 Kilgariff et al. Feb 2009 B1
7487305 Hill et al. Feb 2009 B2
7493452 Eichenberger et al. Feb 2009 B2
7545381 Huang et al. Jun 2009 B2
7564460 Boland et al. Jul 2009 B2
7750913 Parenteau et al. Jul 2010 B1
7777748 Bakalash et al. Aug 2010 B2
7852341 Rouet et al. Dec 2010 B1
7869835 Zu Jan 2011 B1
8020169 Yamasaki Sep 2011 B2
8416251 Gadre et al. Apr 2013 B2
8424012 Karandikar et al. Apr 2013 B1
8493396 Karandikar et al. Jul 2013 B2
8493397 Su et al. Jul 2013 B1
8683184 Lew et al. Mar 2014 B1
8687008 Karandikar et al. Apr 2014 B2
8698817 Gadre et al. Apr 2014 B2
8711161 Scotzniovsky et al. Apr 2014 B1
8725990 Karandikar et al. May 2014 B1
8736623 Lew et al. May 2014 B1
8738891 Karandikar et al. May 2014 B1
20010026647 Morita Oct 2001 A1
20020005729 Leedy Jan 2002 A1
20020026623 Morooka Feb 2002 A1
20020031025 Shimano et al. Mar 2002 A1
20020085000 Sullivan et al. Jul 2002 A1
20020087833 Burns et al. Jul 2002 A1
20020116595 Morton Aug 2002 A1
20020130874 Baldwin Sep 2002 A1
20020144061 Faanes et al. Oct 2002 A1
20020158869 Ohba et al. Oct 2002 A1
20020194430 Cho Dec 2002 A1
20030001847 Doyle et al. Jan 2003 A1
20030001857 Doyle Jan 2003 A1
20030003943 Bajikar Jan 2003 A1
20030014457 Desai et al. Jan 2003 A1
20030016217 Vlachos et al. Jan 2003 A1
20030016844 Numaoka Jan 2003 A1
20030020173 Huff et al. Jan 2003 A1
20030031258 Wang et al. Feb 2003 A1
20030051091 Leung et al. Mar 2003 A1
20030061409 RuDusky Mar 2003 A1
20030067473 Taylor et al. Apr 2003 A1
20030080963 Van Hook et al. May 2003 A1
20030093506 Oliver et al. May 2003 A1
20030115500 Akrout et al. Jun 2003 A1
20030169269 Sasaki et al. Sep 2003 A1
20030172326 Coffin, III et al. Sep 2003 A1
20030188118 Jackson Oct 2003 A1
20030204673 Venkumahanti et al. Oct 2003 A1
20030204680 Hardage, Jr. Oct 2003 A1
20030227461 Hux et al. Dec 2003 A1
20040012597 Zatz et al. Jan 2004 A1
20040073771 Chen et al. Apr 2004 A1
20040073773 Demjanenko Apr 2004 A1
20040103253 Kamei et al. May 2004 A1
20040193837 Devaney et al. Sep 2004 A1
20040205281 Lin et al. Oct 2004 A1
20040205326 Sindagi et al. Oct 2004 A1
20040212730 MacInnis et al. Oct 2004 A1
20040215887 Starke Oct 2004 A1
20040221117 Shelor Nov 2004 A1
20040263519 Andrews et al. Dec 2004 A1
20050012749 Gonzalez et al. Jan 2005 A1
20050012759 Valmiki et al. Jan 2005 A1
20050024369 Xie Feb 2005 A1
20050060601 Gomm Mar 2005 A1
20050071722 Biles Mar 2005 A1
20050088448 Hussain et al. Apr 2005 A1
20050140682 Sumanaweera et al. Jun 2005 A1
20050239518 D'Agostino et al. Oct 2005 A1
20050262332 Rappoport et al. Nov 2005 A1
20050280652 Hutchins et al. Dec 2005 A1
20060020843 Frodsham et al. Jan 2006 A1
20060064517 Oliver Mar 2006 A1
20060064547 Kottapalli et al. Mar 2006 A1
20060103659 Karandikar et al. May 2006 A1
20060152519 Hutchins et al. Jul 2006 A1
20060152520 Gadre et al. Jul 2006 A1
20060176308 Karandikar et al. Aug 2006 A1
20060176309 Gadre et al. Aug 2006 A1
20070076010 Swamy et al. Apr 2007 A1
20070130444 Mitu et al. Jun 2007 A1
20070285427 Morein et al. Dec 2007 A1
20080016327 Menon et al. Jan 2008 A1
20080278509 Washizu et al. Nov 2008 A1
20090235051 Codrescu et al. Sep 2009 A1
20120023149 Kinsman et al. Jan 2012 A1
Foreign Referenced Citations (18)
Number Date Country
07-101885 Apr 1995 JP
H08-077347 Mar 1996 JP
H08-153032 Jun 1996 JP
08-297605 Dec 1996 JP
09-287217 Oct 1997 JP
09-287217 Nov 1997 JP
H09-325759 Dec 1997 JP
10-222476 Aug 1998 JP
11-190447 Jul 1999 JP
2000-148695 May 2000 JP
2001-022638 Jan 2001 JP
2003-178294 Jun 2003 JP
2004-252990 Sep 2004 JP
1998-018215 Aug 2000 KR
413766 Dec 2000 TW
436710 May 2001 TW
442734 Jun 2001 TW
093127712 Jul 2005 TW
Non-Patent Literature Citations (79)
Entry
Intel, Intel Architecture Software Developer's Manual, vol. 1: Basic Architecture, 1997, p. 8-1.
Intel, Intel Architecture Software Developer's Manual, vol. 1: Basic Architecture, 1999, pp. 8-1, 9-1.
Intel, Pentium Processor Family Developer's Manual, 1997, pp. 2-13.
Fisher, Joseph A., Very Long Instruction Word Architecture and the ELI-512, ACM, 1993, pp. 140-150.
Hamacher, V. Carl et al., Computer Organization, Second Edition, McGraw Hill, 1984, pp. 1-9.
Kozyrakis, “A Media enhanced vector architecture for embedded memory systems,” Jul. 1999, http://digitalassets.lib.berkeley.edu/techreports/ucb/text/CSD-99-1059.pdf.
Brown, Brian; “Data Structure And Number Systems”; 2000; http://www.ibilce.unesp.br/courseware/datas/data3.htm.
“Alpha Testing State”; http://msdn.microsoft.com/library/en-us/directx9—c/directx/graphics/programmingguide/GettingStarted/Direct3Kdevices/States/renderstates/alphatestingstate.asp.
“Anti-aliasing”; http://en.wikipedia.org/wiki/Anti-aliasing.
“Vertex Fog”; http://msdn.microsoft.com/library/en-us/directx9—c/Vertex—fog.asp?frame=true.
NVIDIA Corporation, Technical Brief: Transform and Lighting; dated 1999; month unknown.
Graham, Susan L. et al., Getting Up to Speed: The future of Supercomputing, the National Academies Press, 2005, glossary.
Rosenberg, Jerry M., Dictionary of Computers, Information Processing & Telecommunications, 2nd Edition, John Wiley & Sons, 1987, pp. 102 and 338 (NVID-P001502).
Rosenberg, Jerry M., Dictionary of Computers, Information Processing & Telecommunications, 2nd Edition, John Wiley & Sons, 1987, pp. 305.
Graf, Rudolf F., Modern Dictionary of Electronics, Howard W. Sams & Company, 1988, pp. 273.
Graf, Rudolf F., Modern Dictionary of Electronics, Howard W. Sams & Company, 1984, pp. 566.
Graston et al. (Software Pipelining Irregular Loops On the TMS320C6000 VLIW DSP Architecture); Proceedings of the ACM SIGPLAN workshop on Languages, compilers and tools for embedded systems; pp. 138-144; Year of Publication: 2001.
Duca et al., A Relational Debugging Engine for Graphics Pipeline, International Conference on Computer Graphics and Interactive Techniques, ACM SIGGRAPH 2005, pp. 453-463, ISSN:0730-0301.
Gadre, S., Patent Application Entitled “Video Processor Having Scalar and Vector Components with Command FIFO for Passing Function Calls from Scalar to Vector”, U.S. Appl. No. 11/267,700, filed Nov. 4, 2005.
Gadre, S., Patent Application Entitled “Stream Processing in a Video Processor”, U.S. Appl. No. 11/267,599, filed Nov. 4, 2005.
Karandikar et al., Patent Application Entitled: “Multidimensional Datapath Processing in a Video Processor”, U.S. Appl. No. 11/267,638, filed Nov. 4, 2005.
Karandikar et al., Patent Application Entitled: “A Latency Tolerant System for Executing Video Processing Operations”, U.S. Appl. No. 11/267,875, filed Nov. 4, 2005.
Gadre, S., Patent Application Entitled “Separately Schedulable Condition Codes For a Video Processor”, U.S. Appl. No. 11/267,793, filed Nov. 4, 2005.
Lew, et al., Patent Application Entitled “A Programmable DMA Engine for Implementing Memory Transfers for a Video Processor”, U.S. Appl. No. 11/267,777, filed Nov. 4, 2005.
Karandikar et al., Patent Application Entitled: “A Pipelined L2 Cache for Memory Transfers for a Video Processor”, U.S. Appl. No. 11/267,606, filed Nov. 4, 2005.
Karandikar, et al., Patent Application Entitled: “Command Acceleration in a Video Processor”, U.S. Appl. No. 11/267,640, filed Nov. 4, 2005.
Karandikar, et al., Patent Application Entitled “A Configurable SIMD Engine in a Video Processor”, U.S. Appl. No. 11/267,393, filed Nov. 4, 2005.
Karandikar, et al., Patent Application Entitled “Context Switching on a Video Processor Having a Scalar Execution Unit and a Vector Execution Unit”, U.S. Appl. No. 11/267,778, filed Nov. 4, 2005.
Lew, et al., Patent Application Entitled “Multi Context Execution on a Video Processor”, U.S. Appl. No. 11/267,780, filed Nov. 4, 2005.
Su, Z, et al., Patent Application Entitled: “State Machine Control for a Pipelined L2 Cache to Implement Memory Transfers for a Video Processor”, U.S. Appl. No. 11/267,119, filed Nov. 4, 2005.
Free On-Line Dictionary of Computing (FOLDOC), definition of “video”, from foldoc.org/index.cgi?query=video&action=Search, May 23, 2008.
FOLDOC, definition of “frame buffer”, from foldoc.org/index.cgi?query=frame+buffer&action=Search, Oct. 3, 1997.
FOLDOC, definition of “motherboard”, from foldoc.org/index.cgi?query=motherboard&action=Search, Aug. 10, 2000.
FOLDOC, definition of “separate compilation”, from foldoc.org/index.cgi?query=separate+compilation&action=Search, Feb. 19, 2005.
FOLDOC, definition of “vector processor”, http://foldoc.org/, Sep. 11, 2003.
FOLDOC (Free On-Line Dictionary of Computing), definition of X86, Feb. 27, 2004.
FOLDOC, definition of “superscalar,” http://foldoc.org/, Jun. 22, 2009.
FOLDOC, definition of Pentium, Sep. 30, 2003.
Wikipedia, definition of “scalar processor,” Apr. 4, 2009.
Wikipedia, entry page defining term “SIMD”, last modified Mar. 17, 2007.
FOLDOC, Free Online Dictionary of Computing, definition of SIMD, foldoc.org/index.cgi?query=simd&action=Search, Nov. 4, 1994.
Definition of “queue” from Free on-Line Dictionary of Computing (FOLDOC), http://folddoc.org/index.cgi?query=queue&action=Search, May 15, 2007.
Definition of “first-in first-out” from FOLDOC, http://foldoc.org/index.cgi?query=fifo&action=Search, Dec. 6, 1999.
Definition of “block” from FOLDOC, http://foldoc.org/index.cgi?block, Sep. 23, 2004.
Wikipedia, definition of Multiplication, accessed from en.wikipedia.org/w/index.php?title=Multiplication&oldid=1890974, published Oct. 13, 2003.
Graham, Susan L. et al., Getting Up to Speed: The future of Supercomputing, the National Academies Press, 2005, glossary, Feb. 2005.
Rosenberg, Jerry M., Dictionary of Computers, Information Processing & Telecommunications, 2nd Edition, John Wiley & Sons, 1987, pp. 102 and 338 (NVID-P001502), Dec. 1987.
Rosenberg, Jerry M., Dictionary of Computers, Information Processing & Telecommunications, 2nd Edition, John Wiley & Sons, 1987, pp. 305, Dec. 1987.
Graf, Rudolf F., Modern Dictionary of Electronics, Howard W. Sams & Company, 1988, pp. 273, Dec. 1988.
Graf, Rudolf F., Modern Dictionary of Electronics, Howard W. Sams & Company, 1984, pp. 566, Dec. 1988.
Wikipedia, definition of “subroutine”, published Nov. 29, 2003, four pages.
Graston et al. (Software Pipelining Irregular Loops On the TMS320C6000 VLIW DSP Architecture); Proceedings of the ACM SIGPLAN workshop on Languages, compilers and tools for embedded systems; pp. 138-144; Year of Publication: 2001, Oct. 2001.
SearchStorage.com Definitions, “Pipeline Burst Cache,” Jul. 31, 2001, url: http://searchstorage.techtarget.com/sDefinition/0,,sid5—gci214414,00.html.
Parhami, Behrooz, Computer Arithmetic: Algorithms and Hardware Designs, Oxford University Press, Jun. 2000, pp. 413-418.
gDEBugger, graphicRemedy, http://www.gremedy.com, Aug. 8, 2006.
Duca et al., A Relational Debugging Engine for Graphics Pipeline, International Conference on Computer Graphics and Interactive Techniques, ACM SIGGRAPH 2005, pp. 453-463, ISSN:0730-0301, Jul. 2005.
Merriam-Webster Dictionary Online; Definition for “program”; retrieved Dec. 14, 2010.
Intel, Intel Architecture Software Developer's Manual, vol. 1: Basic Architecture, 1997, p. 8-1, Jan. 1997.
Intel, Intel Architecture Software Developer's Manual, vol. 1: Basic Architecture, 1999, pp. 8-1, 9-1, May 1999.
Intel, Intel Pentium III Xeon Processor at 500 and 550Mhz, Feb. 1999.
Intel, Intel MMX Technology at a Glance, Jun. 1997.
Intel, Pentium Processor Family Developer's Manual, 1997, pp. 2-13, Oct. 1997.
Intel, Pentium processor with MMX Technology at 233Mhz Performance Brief, Jan. 1998, pp. 3 and 8.
PCreview, article entitled “What is a Motherboard”, from www.pcreview.co.uk/articles/Hardware/What—is—a—Motherboard., Nov. 22, 2005.
Wikipedia, definition of “vector processor”, http://en.wikipedia.org/, May 14, 2007.
Fisher, Joseph A., Very Long Instruction Word Architecture and the ELI-512, ACM, 1993, pp. 140-150, Jun. 1993.
Quinnell, Richard A. “New DSP Architectures Go “Post-Harvard” for Higher Performance and Flexibility” Techonline; posted May 1, 2002.
IBM TDB, Device Queue Management, vol. 31 Iss. 10, pp. 45-50, Mar. 1, 1989.
Hamacher, V. Carl et al., Computer Organization, Second Edition, McGraw Hill, 1984, pp. 1-9, May 1984.
Kozyrakis, “A Media enhanced vector architecture for embedded memory systems,” Jul. 1999, http://digitalassets.lib.berkeley.edu/techreports/ucb/text/CSD-99/1059.pdf.
HPL-PD A Parameterized Research Approach—May 31, 2004 http://web.archive.org/web/*/www.trimaran.org/docs/5—hpl-pd.pdf.
Hutchins E., SC10: A Video Processor And Pixel-Shading GPU for Handheld Devices; presented at the Hot Chips conference on Aug. 23, 2004.
Brown, Brian; “Data Structure And Number Systems”; 2000; http://www.ibilce.unesp.br/courseware/datas/data3.htm, Mar. 2000.
“Alpha Testing State”; http://msdn.microsoft.com/library/en-us/directx9—c/directx/graphics/programmingguide/GettingStarted/Direct3Kdevices/States/renderstates/alphatestingstate.asp, Sep. 2004.
“Anti-aliasing”; http://en.wikipedia.org/wiki/Anti-aliasing, Mar. 2004.
“Vertex Fog”; http://msdn.microsoft.com/library/en-us/directx9—c/Vertex—fog.asp?frame=true, Apr. 2008.
Wilson D., NVIDIA's Tiny 90nm G71 and G73: GeForce 7900 and 7600 Debut; at http://www.anandtech.com/show/1967/2; dated Sep. 3, 2006, retrieved Jun. 16, 2011.
Woods J., Nvidia GeForce FX Preview, at http://www.tweak3d.net/reviews/nvidia/nv30preview/1.shtml; dated Nov. 18, 2002; retrieved Jun. 16, 2011.
NVIDIA Corporation, Technical Brief: Transform and Lighting; dated 1999; month unknown, Apr. 1999.