3D rendering engines often improve rendering performance by using texture samplers that can concurrently process more than one pixel. For example, some texture samplers can sample textures concurrently for blocks of two or more adjacent pixels. When processing adjacent pixels, a typical texture sampler will undertake a separate texture address calculation for each pixel, fetch from memory the texture data (i.e., texels) for vertices bounding each texture address, and then filter or blend the fetched texture data to obtain blended pixel values. However, using software to calculate separate texture addresses for adjacent pixels increases computation times, while implementing the texture address calculations in hardware consumes more die space.
In some circumstances, 3D rendering engines may be used to rasterize primitives where the texture is aligned to the pixels being rendered. Aligned textures may arise where no rotation or perspective operations are being applied to the textured primitive and hence the texture coordinates (e.g., u, v) are aligned to the pixel coordinates (e.g., x, y). The standard approach of undertaking separate texture address calculations for adjacent pixels fails to take advantage of the rectilinear relationship between texture coordinates and pixel coordinates for aligned textures.
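The rectilinear relationship noted above may be written out explicitly. The following is an illustrative sketch only: it assumes a first pixel at pixel coordinates (x1, y1) with texture coordinates (u1, v1), and constant per-pixel texture steps Δu and Δv. The symbols x1 and y1 are introduced here purely for illustration.

```latex
\begin{aligned}
u(x, y) &= u_1 + (x - x_1)\,\Delta u, \\
v(x, y) &= v_1 + (y - y_1)\,\Delta v, \\
u(x + 1, y) &= u(x, y) + \Delta u, \\
v(x, y + 1) &= v(x, y) + \Delta v.
\end{aligned}
```

Under these assumptions, u depends only on x and v depends only on y, so the texture address of a pixel adjacent to a given pixel differs from that pixel's address by a single addition.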
The accompanying drawings, incorporated in and constituting a part of this specification, illustrate one or more implementations consistent with the principles of the invention and, together with the description of the invention, explain such implementations. The drawings are not necessarily to scale, the emphasis instead being placed upon illustrating the principles of the invention. In the drawings,
The following description refers to the accompanying drawings. Among the various drawings, the same reference numbers may be used to identify the same or similar elements. While the following description provides a thorough understanding of the various aspects of the claimed invention by setting forth specific details such as particular structures, architectures, interfaces, techniques, etc., such details are provided for purposes of explanation and should not be viewed as limiting. Moreover, those of skill in the art will, in light of the present disclosure, appreciate that various aspects of the invention claimed may be practiced in other examples or implementations that depart from these specific details. At certain junctures in the following disclosure, descriptions of well known devices, circuits, and methods have been omitted to avoid clouding the description of the present invention with unnecessary detail.
In addition, those skilled in the art may recognize that a 3D rendering engine, such as engine 100, may be tasked with rendering pixels in a compositing context in which the engine may undertake 3D operations that may include rendering objects having textures that exhibit rotation or perspective relative to pixel coordinate space and other rendering operations, such as “blit”-type operations, in which textures may be aligned to pixel coordinate space. For example, those skilled in the art will recognize that such a compositing context may be encountered when using a 3D rendering engine to render High Definition Digital Video Disc (HD-DVD) data that includes both 3D data streams and 2D data streams where the 2D data streams may include primitives exhibiting aligned textures. However, the invention is not limited to compositing contexts, HD-DVD or otherwise.
Texture sampler 102 may, in accordance with some implementations of the invention, comprise any graphics processing logic and/or hardware, software, and/or firmware, capable of using address generator logic 106 in conjunction with a texture address for one pixel and associated texture offsets provided by shader 110 to determine texture addresses for one or more pixels adjacent to that pixel. Thus, address generator logic 106 may receive texture coordinates and associated texture offsets and in response may generate corresponding texture map addresses. Sampler 102 may then use those addresses to access the corresponding raw textures stored and/or held in memory 108, may use that texture data to generate filtered pixel values and may provide those filtered pixels to shader 110. This capability of texture sampler 102 and associated address generator logic 106 will be described in greater detail below.
Rasterizer 104 may be capable of processing graphics primitives, such as triangle primitives, provided by a setup module (not shown) to generate pixels associated with a pixel or “screen” coordinate system. In doing so, rasterizer 104 may generate attributes for each pixel encountered in traversing, for example, a given triangle by interpolating triangle vertex coordinates (e.g., (u, v) vertex texture coordinates). Rasterizer 104 may provide pixels and associated attributes to shader 110.
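For example, the interpolation of vertex texture coordinates may be sketched as follows. This is a minimal illustration only: it assumes affine (non-perspective) interpolation using barycentric weights, and the function names `barycentric_weights` and `interpolate_uv` are hypothetical and do not describe how rasterizer 104 is actually implemented.

```python
def barycentric_weights(p, a, b, c):
    """Return the barycentric weights of screen point p for triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    denom = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if denom == 0:
        raise ValueError("degenerate triangle")
    w_a = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / denom
    w_b = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / denom
    return w_a, w_b, 1.0 - w_a - w_b


def interpolate_uv(p, verts, uvs):
    """Interpolate per-vertex (u, v) texture coordinates at screen point p."""
    weights = barycentric_weights(p, *verts)
    u = sum(w * uv[0] for w, uv in zip(weights, uvs))
    v = sum(w * uv[1] for w, uv in zip(weights, uvs))
    return u, v


# Assumed example: a triangle whose texture is aligned to the pixel grid.
verts = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
uvs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
left = interpolate_uv((1.0, 2.0), verts, uvs)
right = interpolate_uv((2.0, 2.0), verts, uvs)
```

In this assumed aligned case, stepping one pixel in x changes u by a constant amount and leaves v unchanged, which is the property the offsets Δu and Δv capture.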
Those skilled in the art may recognize that elements of engine 100, such as rasterizer 104, may generate pixel fragments where such pixel fragments may comprise integer x and y grid coordinates, a color value, depth values, etc., in addition to texture coordinates for a given pixel. However, for the most part, such details are beyond the scope of the invention and, so as not to obscure the description of implementations of the invention, the term “pixel” or “pixel data” will be used throughout this disclosure even though those skilled in the art may recognize that rasterizer 104 may provide shader 110 with pixel fragments (e.g., including pixel texture addresses). Hence, while sampler 102 may be described as providing filtered pixel fragments (i.e., filtered pixel color values) to shader 110, in the interests of clarity this disclosure will describe sampler 102 as providing filtered pixels to shader 110.
Texture memory 108 may comprise any memory device or mechanism suitable for storing and/or holding one or more texture maps specifying texel data (e.g., in the form of a texture map associated with a texture coordinate system). While memory 108 may comprise any volatile or non-volatile memory technology such as Random Access Memory (RAM) or Flash memory, the invention is in no way limited by the type of memory employed for use as memory 108.
Pixel shader 110 may comprise any graphics processing logic and/or hardware, software, and/or firmware, capable of receiving filtered pixel data from texture sampler 102. For example, shader 110 may comprise a programmable execution unit. While those skilled in the art will recognize that pixel shaders such as shader 110 often undertake processes such as implementing various per pixel shading routines, such functionality is outside the scope of the invention and will not be discussed further. While
In accordance with some implementations of the invention, shader 110 may further be capable of providing pixel texture coordinate values for use by texture sampler 102 to access or fetch texture data (e.g., texel values) from texture memory 108. Further, in accordance with some implementations of the invention, shader 110 may provide sampler 102 with a texture address for a first or initial pixel of a pixel block including two or more contiguous pixels along with associated texture offsets as will be explained in greater detail below.
As
Process 300 may begin with receiving a first pixel's texture address [act 302] and receiving associated texture offsets [act 304]. In some implementations of the invention, acts 302 and 304 may involve sampler 102 receiving message 200 from shader 110 where message 200 includes a first pixel's address in the form of (u, v) texture coordinates 204 and associated texture coordinate offsets 206 (Δu) and 208 (Δv). Thus, referring also to
Process 300 may continue with the determination of the texture address of a second pixel by applying the offsets to the first pixel's address [act 306]. In some implementations of the invention, address generator logic 106 may use the texture address of the first pixel and the associated texture offsets supplied in respective acts 302 and 304 to determine or generate the second pixel's texture address. For example, logic 106 may add the offset value Δu to the texture coordinates (u1, v1) of pixel 412 to generate the texture address or coordinates (u1+Δu, v1) of pixel 413 (p2) that is adjacent to pixel 412 in the x dimension. Similarly, logic 106 may, for example, implement act 306 by adding the offset value Δv to the texture coordinates (u1, v1) of pixel 412 to generate the texture address or coordinates (u1, v1+Δv) of pixel 415 (p3) that is adjacent to pixel 412 in the y dimension.
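Act 306 as described above may be sketched in a few lines. The sketch is illustrative only: the `SamplerMessage` class and the `offset_address` function are hypothetical names, and the numeric values are assumed rather than taken from the figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplerMessage:
    """Hypothetical stand-in for a message carrying (u, v) and the offsets."""
    u: float   # first pixel's u texture coordinate
    v: float   # first pixel's v texture coordinate
    du: float  # texture offset applied per pixel step in x
    dv: float  # texture offset applied per pixel step in y


def offset_address(u, v, du=0.0, dv=0.0):
    """Derive an adjacent pixel's texture address by adding offsets."""
    return (u + du, v + dv)


# Assumed values for the first pixel (u1, v1) and the offsets.
msg = SamplerMessage(u=1.25, v=0.5, du=1.0, dv=1.0)

# Neighbor in the x dimension: (u1 + du, v1).
x_neighbor = offset_address(msg.u, msg.v, du=msg.du)

# Neighbor in the y dimension: (u1, v1 + dv).
y_neighbor = offset_address(msg.u, msg.v, dv=msg.dv)
```

Each neighboring address costs one addition per coordinate, rather than a separate full address calculation.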
Process 300 may then continue with the fetching of the second pixel's texture data [act 308]. One way to implement act 308 is to have sampler 102 use the second pixel's texture address determined in act 306 to access or fetch associated texture data from texture memory 108. Thus, continuing one example from above, if address generator logic 106 determines the texture address (u1+Δu, v1) for pixel 413 (p2) in act 306 then sampler 102 may use that address to fetch, from memory 108, texture data associated with those texels 402, 403, 405 and 406 of map 410 that bound the texture address of pixel 413.
Process 300 may then continue with the filtering of the second pixel's texture data [act 310]. In some implementations of the invention, act 310 may be undertaken by sampler 102. For example, sampler 102 may use the texture coordinate (u1+Δu, v1) determined for pixel 413 in act 306 to undertake an appropriate weighting of the texture data for texels 402, 403, 405 and 406 fetched in act 308 to filter that texture data in act 310 using, for example, well known bilinear filtering methods. The invention is not, however, limited to any particular filtering method in act 310 nor is the invention limited to having sampler 102 perform act 310.
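As one illustration of the weighting described above, bilinear filtering of the four texels bounding a texture address may be sketched as follows. This is a sketch under stated assumptions, not a description of sampler 102: texels are assumed to sit at integer texture coordinates, addresses are assumed to be clamped to the edges of the map, and the function name `bilinear_sample` is hypothetical.

```python
import math


def bilinear_sample(texture, u, v):
    """Blend the four texels bounding (u, v); texture is a list of rows."""
    height, width = len(texture), len(texture[0])
    # Indices of the bounding texels, clamped to the map (an assumption).
    i0 = min(max(math.floor(u), 0), width - 1)
    j0 = min(max(math.floor(v), 0), height - 1)
    i1 = min(i0 + 1, width - 1)
    j1 = min(j0 + 1, height - 1)
    # Fractional position inside the texel cell gives the blend weights.
    fu = min(max(u - i0, 0.0), 1.0)
    fv = min(max(v - j0, 0.0), 1.0)
    top = texture[j0][i0] * (1.0 - fu) + texture[j0][i1] * fu
    bottom = texture[j1][i0] * (1.0 - fu) + texture[j1][i1] * fu
    return top * (1.0 - fv) + bottom * fv


# Assumed 2x2 single-channel texture map.
texture = [[0.0, 10.0],
           [20.0, 30.0]]
center = bilinear_sample(texture, 0.5, 0.5)
```

A texture address that falls midway between four texels weights each texel equally, while an address that lands exactly on a texel returns that texel's value unchanged.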
Process 300 may continue with the determination of the texture address of a third pixel by applying the offsets to the second pixel's address [act 312]. In some implementations of the invention, address generator logic 106 may use the texture coordinates of the second pixel determined in act 306 and the texture offsets supplied in act 304 to determine or generate the third pixel's texture address. For example, logic 106 may add the offset value Δu to the texture address (u1+Δu, v1) of pixel 413 to generate the texture address or coordinates (u1+2Δu, v1) of pixel 414 (p3) that is adjacent to pixel 413 in the x dimension.
Process 300 may then continue with the fetching of the third pixel's texture data [act 314]. One way to implement act 314 is to have sampler 102 use the third pixel's texture address determined in act 312 to access or fetch texture data from texture memory 108. Thus, continuing an example from above, if address generator logic 106 determines the texture address (u1+2Δu, v1) for pixel 414 in act 312 then sampler 102 may use that address to fetch, from memory 108, texture data associated with those texels 403, 404, 406 and 407 of map 410 that bound the texture address of pixel 414.
Process 300 may then continue with the filtering of the third pixel's texture data [act 316]. In some implementations of the invention, act 316 may be undertaken by sampler 102. For example, sampler 102 may use the texture coordinate (u1+2Δu, v1) determined for pixel 414 in act 312 to undertake an appropriate weighting of the texture data for texels 403, 404, 406 and 407 fetched in act 314 to filter that texture data in act 316 using, for example, well known bilinear filtering methods. The invention is not, however, limited to any particular filtering method in act 316 nor is the invention limited to having sampler 102 perform act 316.
The acts shown in
In other implementations in accordance with the invention, a texture sampler may, for example, undertake multiple concurrent sequences of acts 302-310. In other words, a sampler may, concurrently, apply texture offsets to the addresses of two separate “first” pixels (e.g., pixels 412 and 413) to determine the texture addresses for two separate “second” pixels (e.g., pixels 416 and 417). Moreover, process 300 may be extended beyond a three pixel scheme. For example, a first or initial pixel of act 302 may comprise a first pixel (e.g., upper, left-hand pixel) of a 32-pixel block of pixels (e.g., four rows of pixels having eight pixels each) and process 300 may be extended to use that first pixel's texture address to determine all texture addresses of pixels in the block by applying appropriate combinations of texture offsets.
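The extension to a 32-pixel block may be sketched as follows. The sketch is illustrative only: the function name `block_addresses`, the row-by-row traversal order, and the numeric values are assumptions and are not prescribed by the description above.

```python
def block_addresses(u0, v0, du, dv, rows=4, cols=8):
    """Generate texture addresses for a rows x cols pixel block.

    Only the first pixel's address (u0, v0) and the offsets (du, dv) are
    needed; every other address is reached by repeated addition.
    """
    addresses = []
    row_v = v0
    for _ in range(rows):
        u = u0
        row = []
        for _ in range(cols):
            row.append((u, row_v))
            u += du        # step one pixel in x
        addresses.append(row)
        row_v += dv        # step one pixel in y
    return addresses


# Assumed first-pixel address and offsets.
block = block_addresses(u0=2.0, v0=3.0, du=0.5, dv=0.25)
```

Here all 32 addresses follow from a single supplied address and two offsets, using one addition per generated address.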
In addition, the invention is not limited to any particular pixel processing scheme. In other words, for example, the invention contemplates addition or subtraction of texture offsets and applies equally well to processing pixels along or across pixel row or pixel column positions. For example, referring to
Alternatively, for example, acts 306-310 may comprise a texture sampler subtracting offset Δu from the texture coordinates (u1+2Δu, v1+Δv) of pixel 417 to determine the texture address (u1+Δu, v1+Δv) of adjacent pixel 416, obtaining or fetching the texture data for texels bounding pixel 416's address, and then filtering that texture data. Then acts 312-316 might comprise that sampler adding offset Δv to the texture address of pixel 416 to determine the texture address (u1+Δu, v1+2Δv) of adjacent pixel 419, obtaining or fetching the texture data for texels bounding pixel 419's address, and then filtering that texture data.
Alternatively, for another example, acts 306-310 may comprise a texture sampler adding offsets Δu and Δv to the texture coordinates (u1+Δu, v1+Δv) of pixel 416 to determine the texture address (u1+2Δu, v1+2Δv) of adjacent pixel 420, obtaining or fetching the texture data for texels bounding pixel 420's address, and then filtering that texture data. Then acts 312-316 might comprise that sampler subtracting offset Δv from the texture address of pixel 420 to determine the texture address (u1+2Δu, v1+Δv) of adjacent pixel 417, obtaining or fetching the texture data for texels bounding pixel 417's address, and then filtering that texture data. Clearly, many other such pixel processing schemes are contemplated by the claimed invention.
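The alternative schemes above differ only in which signed combination of offsets is applied at each step. The following sketch illustrates this; the function name `walk_addresses`, the encoding of steps as (sx, sy) pairs, and the values u1 = v1 = 0 and Δu = Δv = 1 are assumptions made for illustration.

```python
def walk_addresses(start, steps, du, dv):
    """Follow a path of pixel steps, adding or subtracting offsets.

    Each step is a pair (sx, sy) with entries in {-1, 0, 1}: +1 adds the
    offset for that dimension, -1 subtracts it, and 0 leaves it unchanged.
    """
    u, v = start
    visited = []
    for sx, sy in steps:
        u += sx * du
        v += sy * dv
        visited.append((u, v))
    return visited


du, dv = 1.0, 1.0  # assumed offsets, with (u1, v1) assumed to be (0, 0)

# Subtract du to move from (u1+2du, v1+dv) to (u1+du, v1+dv),
# then add dv to reach (u1+du, v1+2dv).
path_a = walk_addresses((2.0, 1.0), [(-1, 0), (0, 1)], du, dv)

# Add du and dv together to move from (u1+du, v1+dv) to (u1+2du, v1+2dv),
# then subtract dv to reach (u1+2du, v1+dv).
path_b = walk_addresses((1.0, 1.0), [(1, 1), (0, -1)], du, dv)
```

Because every step is a signed addition, the traversal order may be chosen freely without changing the cost of generating each address.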
System 500 may assume a variety of physical implementations. For example, system 500 may be implemented in a personal computer (PC), a networked PC, a server computing system, a handheld computing platform (e.g., a personal digital assistant (PDA)), a gaming system (portable or otherwise), a 3D capable cellular telephone handset, etc. Moreover, while all components of system 500 may be implemented within a single device, such as a system-on-a-chip (SOC) integrated circuit (IC), components of system 500 may also be distributed across multiple ICs or devices. For example, host processor 502 along with components 506, 512, and 514 may be implemented as multiple ICs contained within a single PC while graphics processor 504 and components 508 and 516 may be implemented in a separate device such as a television or other display coupled to host processor 502 and components 506, 512, and 514 through communications pathway 510.
Host processor 502 may comprise a special purpose or a general purpose processor including any control and/or processing logic, hardware, software and/or firmware, capable of providing graphics processor 504 with 3D graphics data and/or instructions. Processor 502 may perform a variety of 3D graphics calculations such as 3D coordinate transformations, etc., the results of which may be provided to graphics processor 504 over bus 510 and/or may be stored in memories 506 and/or 508 for eventual use by processor 504.
In one implementation, host processor 502 may be capable of performing any of a number of tasks that support the simplification of 3D texture address computations based on aligned, non-perspective objects. These tasks may include, for example, although the invention is not limited in this regard, providing 3D graphics data to graphics processor 504, placing one or more texture maps in memory 508, downloading microcode (via antenna 515 and interfaces 514) to processor 504, initializing and/or configuring registers within processor 504, interrupt servicing, and providing a bus interface for uploading and/or downloading 3D graphics data. In alternate implementations, some or all of these functions may be performed by graphics processor 504. While
Graphics processor 504 may comprise any processing logic, hardware, software, and/or firmware, capable of processing graphics data. In one implementation, graphics processor 504 may implement a 3D graphics architecture capable of processing graphics data in accordance with one or more standardized rendering application programming interfaces (APIs) such as OpenGL 2.0™ (“The OpenGL Graphics System: A Specification” (Version 2.0; Oct. 22, 2004)) and DirectX 9.0™ (Version 9.0c; Aug. 8, 2004) to name a few examples, although the invention is not limited in this regard. Graphics processor 504 may process 3D graphics data provided by host processor 502, held or stored in memories 506 and/or 508, and/or provided by sources external to system 500 and obtained over bus 510 from interfaces 512 and/or 514.
Graphics processor 504 may receive 3D graphics data in the form of 3D scene data and process that data to provide image data in a format suitable for conversion by display processor 516 into display-specific data. In addition, graphics processor 504 may implement a variety of 3D graphics processing components and/or stages (not shown) such as an applications stage, a geometry stage and/or a rasterizer stage in addition to one or more texture samplers. Texture samplers implemented by graphics processor 504 may fetch or access texture data stored or held in either or both of memories 506 and 508. Further, in accordance with some implementations of the invention, graphics processor 504 may implement one or more texture samplers capable of using a starting or initial texture address for a pixel and associated texture address offsets to determine or generate texture addresses for pixels adjacent to that pixel so that processor 504 may enable simplification of 3D texture address computations based on aligned, non-perspective objects.
Bus or communications pathway(s) 510 may comprise any mechanism for conveying information (e.g., graphics data, instructions, etc.) between or amongst any of the elements of system 500. For example, although the invention is not limited in this regard, communications pathway(s) 510 may comprise a multipurpose bus capable of conveying, for example, instructions (e.g., macrocode) between processor 502 and processor 504. Alternatively, pathway(s) 510 may comprise a wireless communications pathway.
Display processor 516 may comprise any processing logic, hardware, software, and/or firmware, capable of converting rasterized image data supplied by graphics processor 504 into a format suitable for driving a display (i.e., display-specific data). For example, while the invention is not limited in this regard, processor 504 may provide image data to processor 516 in a specific color data format, for example in a compressed red-green-blue (RGB) format, and processor 516 may process such RGB data by generating, for example, corresponding LCD drive data levels etc. Although
Thus, by taking advantage of the circumstances when a texture is aligned to the pixels being rendered (i.e., where no rotation or perspective occurs relative to the pixel's coordinate system) the address computation for a given pixel may, in accordance with some implementations of the invention, be simplified to the addition of offsets to the address of an adjacent pixel. In this manner, a texture address compute engine may use fewer computations to generate texture addresses or fewer parallel computation units may be employed thereby reducing die size.
While the foregoing description of one or more instantiations consistent with the claimed invention provides illustration and description of the invention, it is not intended to be exhaustive or to limit the scope of the invention to the particular implementations disclosed. Clearly, modifications and variations are possible in light of the above teachings or may be acquired from practice of various implementations of the invention. For example, while
No device, element, act, data type, instruction etc. set forth in the description of the present application should be construed as critical or essential to the invention unless explicitly described as such. Also, as used herein, the article “a” is intended to include one or more items. Moreover, when terms or phrases such as “coupled” or “responsive” or “in communication with” are used herein or in the claims that follow, these terms are meant to be interpreted broadly. For example, the phrase “coupled to” may refer to being communicatively, electrically and/or operatively coupled as appropriate for the context in which the phrase is used. Variations and modifications may be made to the above-described implementation(s) of the claimed invention without departing substantially from the spirit and principles of the invention. All such modifications and variations are intended to be included herein within the scope of this disclosure and protected by the following claims.
Number | Date | Country |
---|---|---|
20080001964 A1 | Jan 2008 | US |