One or more aspects of the invention generally relate to graphics processing, and more particularly to optimizing power usage and performance in a multi-processor graphics processing system.
Conventional low power systems including a graphics processor, such as system 100 shown in
High performance graphics processors offer greater graphics processing throughput which contributes to increased power usage compared with a low power graphics processor, such as graphics processor 140. The increased graphics processing throughput may be achieved by operating at a higher clock rate, including two or more graphics processing pipelines, and using wider and/or faster internal and external interfaces. The higher performance graphics processor is implemented in a larger die size than graphics processor 140 in order to include more transistors. Even when a high performance graphics processor is not processing graphics data it contributes to overall system power consumption due to the static power resulting from transistor leakage. Therefore the static power of a high performance graphics processor is greater than the static power of a low power graphics processor. Consequently, high performance graphics processors are not used in conventional portable systems which are battery powered.
Accordingly, it is desirable to minimize overall power consumption while improving graphics processing performance.
The current invention involves new systems and methods for optimizing power usage and performance during graphics data processing. A multi-processor graphics processing system includes a low power graphics processor and a high performance graphics processor. When a low power condition exists only the low power graphics processor is used to process graphics data and the high performance graphics processor is turned off. When turned off, the high performance graphics processor does not consume either static or dynamic power. When the low power condition does not exist, the high performance graphics processor is turned on and the low power graphics processor and the high performance graphics processor are used to process the graphics data.
Various embodiments of the invention include a system for processing data. The system includes a first processing unit, a second processing unit, and a switch coupling the first processing unit to the second processing unit. The first processing unit is configured to process data at a first performance level and to consume a first level of power. The second processing unit is configured to process data at a second performance level and to consume a second level of power, wherein the second level of power is greater than the first level of power.
Various embodiments of a method of the invention for processing graphics data in a multi-processor graphics processing system include determining whether a low power condition exists, processing the graphics data to produce processed graphics data using a low power graphics processor if the low power condition exists, and processing the graphics data to produce the processed graphics data using the low power graphics processor and a high performance graphics processor if the low power condition does not exist.
Various embodiments of a method of the invention for optimizing power usage and performance of a multi-processor data processing system include determining whether a low power condition exists, disabling a high performance processor within the multi-processor data processing system if the low power condition exists, and enabling the high performance processor within the multi-processor data processing system if the low power condition does not exist.
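The two methods summarized above may be illustrated with a minimal sketch. The class name, method name, and processor labels below are hypothetical and are introduced only for illustration; the initial disabled state is likewise an assumption.

```python
class MultiProcessorSystem:
    """Hypothetical model of a system with one low power processor and
    one high performance processor. Names are illustrative only."""

    def __init__(self):
        # Assumption: the high performance processor starts disabled.
        self.high_perf_enabled = False

    def update(self, low_power_condition):
        """Disable the high performance processor when the low power
        condition exists, enable it otherwise, and return the
        processors that would be used to process the data."""
        self.high_perf_enabled = not low_power_condition
        if self.high_perf_enabled:
            return ("low_power", "high_performance")
        return ("low_power",)
```

In this sketch the low power processor is always in the returned set, which mirrors the summary in which the low power graphics processor processes data in both cases.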
Accompanying drawing(s) show exemplary embodiment(s) in accordance with one or more aspects of the present invention; however, the accompanying drawing(s) should not be taken to limit the present invention to the embodiment(s) shown, but are for explanation and understanding only.
In the following description, numerous specific details are set forth to provide a more thorough understanding of the present invention. However, it will be apparent to one of skill in the art that the present invention may be practiced without one or more of these specific details. In other instances, well-known features have not been described in order to avoid obscuring the present invention.
When multiple processing units are included within a portable system, such as a laptop computer, palm-sized computer, tablet computer, game console, cellular telephone, hand-held device, or the like, one or more of the multiple processing units may be enabled or disabled as needed to deliver a particular data processing performance or to adapt to a particular power environment. Therefore, the data processing performance and power consumption may be optimized to deliver the highest possible performance for the lowest possible power consumption.
In some embodiments of system 200, chipset 230 may include a system memory bridge and an input/output (I/O) bridge that may include several interfaces, such as Advanced Technology Attachment (ATA) bus, Universal Serial Bus (USB), Peripheral Component Interconnect (PCI), or the like. Switch 260 provides an interface between chipset 230 and each of graphics processor 250 and primary graphics processor 240 via a connection 251 and a connection 241, respectively. In some embodiments of switch 260, switch 260 provides an indirect interface between graphics processor 250 and primary graphics processor 240 through the combination of connections 251 and 241. Switch 260 may also include interfaces to other devices.
In some embodiments of the present invention, transfers over connections 241 and 251 are performed using an industry standard protocol such as PCI-Express™, and switch 260, graphics processor 250, and primary graphics processor 240 each include an interface unit corresponding to the industry standard protocol. Primary graphics processor 240 outputs image data to a display 270. Display 270 may include one or more display devices, such as a cathode ray tube (CRT), flat panel display, or the like. In addition to display 270, primary graphics processor 240 is also coupled to a primary frame buffer 245 which may be used to store graphics data, image data, and program instructions. Graphics processor 250 is coupled to a frame buffer 255 which may also be used to store graphics data, image data, and program instructions.
Primary graphics processor 240 is a low power device, particularly well-suited for portable devices which may rely on battery power. Graphics processor 250 is a high performance graphics device which consumes more power than primary graphics processor 240 and offers enhanced graphics performance including image quality features and/or higher graphics processing throughput, e.g., frame rate, fill rate, or the like. Although system 200 as shown is a multi-processor graphics processing system, alternate embodiments of system 200 may process other types of data, such as audio data, multi-media data, or the like. In those alternate embodiments graphics processor 250 is replaced with a high performance data processing device and primary graphics processor 240 is a low power data processing device. Likewise, graphics driver 205 is replaced with one or more corresponding device drivers.
In some embodiments of system 200 graphics driver 205 enables or disables graphics processor 250 responsive to a change in a low power condition, as described in conjunction with
Graphics driver 205 may load balance graphics processing between graphics processor 250 and primary graphics processor 240. For example, graphics processor 250 may process a larger portion of an image than primary graphics processor 240. In some embodiments of the present invention, graphics processor 250 may process the entire image and primary graphics processor 240 may receive the image data from graphics processor 250 via switch 260. In other embodiments of the present invention, host processor 220 controls the transfer of the image data from graphics processor 250 to primary graphics processor 240. In those embodiments, the image data must pass through interface 251, switch 260, chipset 230, main memory 210, and back through chipset 230, switch 260, and interface 241 to reach primary graphics processor 240.
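One way the load balancing described above might be realized is to divide the scanlines of an image in proportion to a relative weight assigned to each processor. The description does not specify how the image is partitioned, so the function name, the scanline-based split, and the weights below are all assumptions made for illustration.

```python
def split_scanlines(height, weights):
    """Split `height` scanlines into contiguous (start, end) ranges,
    one per processor, sized in proportion to `weights`.

    Hypothetical helper: the proportional split is an assumption,
    not a partitioning scheme stated in the description.
    """
    total = sum(weights)
    if height < 0 or total <= 0 or any(w < 0 for w in weights):
        raise ValueError("height must be >= 0 and weights non-negative with a positive sum")
    bounds = [0]
    running = 0
    for w in weights:
        running += w
        bounds.append(round(height * running / total))
    bounds[-1] = height  # guarantee that every scanline is assigned
    return list(zip(bounds[:-1], bounds[1:]))
```

For example, with weights listed as (primary graphics processor 240, graphics processor 250), `split_scanlines(1080, [1, 3])` assigns the larger portion of the image to graphics processor 250, and a weight of zero for primary graphics processor 240 corresponds to the case in which graphics processor 250 processes the entire image.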
When graphics interface 248 is used to transfer graphics data, the amount of bandwidth needed to transfer graphics data over interfaces 241 and 251 is reduced. Therefore, the bus width and/or speed of interfaces 241 and 251 may be decreased, reducing the power consumed by interfaces 241 and 251. Furthermore, transferring image data from graphics processor 250 to primary graphics processor 240 does not require passing the image data through switch 260, chipset 230, or main memory 210. Therefore, dynamic power consumed by switch 260, chipset 230, and main memory 210 may be reduced. However, the power savings are offset by the power consumption of graphics interface 248. In some embodiments of the present invention, graphics interface 248 is 4 or 8 bits wide. In those embodiments, the power consumed by graphics interface 248 is less than the power consumed by comparatively wider interfaces between main memory 210, chipset 230, and switch 260.
In some embodiments of the present invention, interface 251 and interface 241 are based on the PCI-Express™ standard and may each support 16 lanes. In other embodiments of the present invention, interface 251 and interface 241 may support fewer than or more than 16 lanes. Graphics driver 205 measures the amount of bandwidth used during graphics processing for interface 251 and interface 241 and dynamically resizes the number of lanes allocated for interface 251 and the number of lanes allocated for interface 241. The power consumed by interfaces 241 and 251 is reduced as the number of lanes is reduced for each of interfaces 241 and 251, thereby optimizing the power consumption dependent on the bandwidth needed for the graphics processing performed by graphics processor 250 and/or primary graphics processor 240.
For example, when graphics processor 250 is disabled 16 lanes may be allocated for primary graphics processor 240 to satisfy a particular graphics processing performance level. The graphics processing performance level may be quantified as a specific frame rate, primitives rendered per second, texture rendering speed, image resolution, or the like. The graphics processing performance level may also include an image quality component, such as trilinear filtered texture mapping, antialiasing, multiple light sources, or the like. The graphics processing performance level may be fixed, specified by the application, or specified by a user. When graphics processor 250 is enabled, the number of lanes allocated for primary graphics processor 240 may be resized to fewer than 16 lanes and 16 lanes may be allocated for graphics processor 250.
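The dynamic lane resizing described above may be sketched as choosing, for each interface, the smallest lane count whose capacity covers the measured bandwidth. The function names, the set of selectable lane counts, and the per-lane bandwidth figure below are assumptions for illustration and are not taken from the description.

```python
# Hypothetical sketch: names and constants are illustrative assumptions.
ALLOWED_WIDTHS = (1, 2, 4, 8, 16)  # assumed selectable lane counts
PER_LANE_MB_S = 250.0              # assumed usable bandwidth per lane, in MB/s


def lanes_for_bandwidth(measured_mb_s, per_lane_mb_s=PER_LANE_MB_S,
                        widths=ALLOWED_WIDTHS):
    """Return the smallest allowed lane count that covers the measured
    bandwidth, or the widest allowed count if none is sufficient."""
    if measured_mb_s < 0:
        raise ValueError("measured bandwidth must be non-negative")
    for width in sorted(widths):
        if width * per_lane_mb_s >= measured_mb_s:
            return width
    return max(widths)


def resize_links(measured_by_interface):
    """Map each interface name to a lane count for its measured bandwidth."""
    return {name: lanes_for_bandwidth(bw)
            for name, bw in measured_by_interface.items()}
```

Under these assumed constants, `resize_links({"interface_241": 600.0, "interface_251": 3500.0})` would allocate 4 lanes for interface 241 and 16 lanes for interface 251, so the lightly used interface is narrowed while the heavily used one keeps its full width.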
In some embodiments of the present invention, data, such as texture maps, written to frame buffer 255 and primary frame buffer 245 by host processor 220 are broadcast to graphics processor 250 and primary graphics processor 240, respectively, rather than being separately written to frame buffer 255 and primary frame buffer 245. When the broadcast feature is used, the bandwidth consumed to transfer data to frame buffer 255 and primary frame buffer 245 is effectively halved. Therefore, the dynamic power consumption is reduced when the broadcast feature is used. Reducing the bandwidth between host processor 220 and each of graphics processor 250 and primary graphics processor 240 may also improve system performance as well as graphics processing performance. Furthermore, when additional graphics processors, also connected to primary graphics processor 240 via graphics interface 248, are included in system 200, the broadcast feature further reduces the dynamic power consumption compared with separately transferring data to each of the additional graphics processors.
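The bandwidth effect of the broadcast feature may be illustrated with simple arithmetic. The sketch below assumes that a broadcast write crosses the host-side path once regardless of the number of receiving graphics processors, while separate writes cross it once per processor; the function names are hypothetical.

```python
def host_bytes_separate(size_bytes, num_processors):
    """Bytes sent by the host when the data is written separately
    to each graphics processor's frame buffer."""
    return size_bytes * num_processors


def host_bytes_broadcast(size_bytes, num_processors):
    """Bytes sent by the host when a single write is broadcast.
    Assumption: the data crosses the host-side path only once."""
    return size_bytes if num_processors > 0 else 0


texture = 4 * 1024 * 1024  # an illustrative 4 MiB texture map
ratio_two = host_bytes_broadcast(texture, 2) / host_bytes_separate(texture, 2)
ratio_four = host_bytes_broadcast(texture, 4) / host_bytes_separate(texture, 4)
```

Under this assumption `ratio_two` is 0.5, matching the halving described for two graphics processors, and `ratio_four` is 0.25, which is consistent with the further reduction described when additional graphics processors are included.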
Integrating primary graphics processor 340 within integrated switch 360 may result in a reduction in power consumption due to the elimination of an external interface including the I/O drivers between integrated switch 360 and primary graphics processor 340. In some embodiments of the present invention, graphics interface 348 directly coupling graphics processor 350 to primary graphics processor 340 may be omitted and graphics data may be transferred between graphics processor 350 and primary graphics processor 340 via interface 351 and connections within integrated switch 360. In those embodiments power consumption by graphics interface 348 is eliminated.
System 300 may also use the broadcast feature and dynamic lane resizing, as previously described, to further reduce power consumption. Graphics driver 305 enables or disables graphics processor 350 responsive to a change in a low power condition, as described in conjunction with
Interfaces 451 and 441 correspond to interfaces 251 and 241, respectively. In some embodiments of the present invention, graphics interface 448 which directly couples graphics processor 450 to primary graphics processor 440 may be omitted and graphics data may be transferred between graphics processor 450 and primary graphics processor 440 via interface 451, interface 441, and switch 460. In those embodiments there would be no power consumption due to graphics interface 448.
System 400 may also use the broadcast feature and dynamic lane resizing to further reduce power consumption. Graphics driver 405 enables or disables graphics processor 450 responsive to a change in a low power condition, as described in conjunction with
In alternate embodiments of the present invention, the graphics processors may be replaced with other types of processors, such as audio processors, multi-media processors, or the like. Likewise, the graphics drivers may be replaced with other drivers corresponding to the other types of processors. Just as the graphics processing performance and power consumption for a computing system may be optimized to deliver the highest possible graphics performance for the lowest possible power consumption, processing performance for other types of data and power consumption for a computing system may be optimized.
After completing step 510 or step 520, the graphics driver returns to step 500. In an alternate embodiment of the present invention that includes multiple high performance graphics processors such as graphics processor 250, 350, 365, or 450, the graphics driver disables or enables a number of the graphics processors dependent on low power threshold values. The low power threshold values may be fixed or programmable and each one controls enabling or disabling of a specific number of the multiple high performance graphics processors.
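The alternate embodiment with multiple low power threshold values may be sketched as follows. The mapping used here, in which each threshold that the power level is not below permits one more high performance graphics processor, is an assumption made for illustration; the description states only that each threshold controls enabling or disabling of a specific number of processors. The function name and the threshold values are hypothetical.

```python
from bisect import bisect_right


def enabled_high_perf_count(power_level, thresholds):
    """Return how many high performance processors to enable.

    Assumption: one additional processor is enabled for each
    threshold that `power_level` is not below.
    """
    return bisect_right(sorted(thresholds), power_level)


# Illustrative, programmable thresholds (for example, percent of battery charge).
THRESHOLDS = [20, 50, 80]
counts = [enabled_high_perf_count(level, THRESHOLDS)
          for level in (10, 20, 49, 50, 100)]
```

With these illustrative thresholds, `counts` evaluates to `[0, 1, 1, 2, 3]`: no high performance processors are enabled at the lowest level and all three are enabled at the highest.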
If, in step 550 the graphics driver determines that power is not detected, then it proceeds to step 570. If, in step 550 the graphics driver determines that power is detected, i.e. a supplemental power supply is provided, then, in step 555 the graphics driver enables one or more high performance graphics processors, such as graphics processor 250, 350, 365, or 450 within a computing system, such as system 200, 202, 300, or 400. The graphics driver then returns to step 530.
In step 540 the graphics driver determines if a primary power level, e.g. battery supplied power, is below a low power threshold, and, if so, then in step 570 the graphics driver disables the one or more high performance graphics processors within the computing system. When the primary power level is below the low power threshold a low power condition exists. If, in step 540 the graphics driver determines the primary power level is not below the low power threshold, then a low power condition does not exist and in step 555 the graphics driver enables one or more high performance graphics processors within the computing system. The graphics driver returns to step 530 after completing step 555 or step 570. In other embodiments of the invention other low power conditions may be defined and detected by the graphics driver.
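The decisions made in steps 540 and 550 may be sketched as two small functions that each select between the enabling action of step 555 and the disabling action of step 570. The enumeration, function names, and parameter names are hypothetical, and the sketch treats the two checks as independent decisions rather than asserting how they are sequenced.

```python
from enum import Enum


class Action(Enum):
    ENABLE_HIGH_PERF = "enable"    # corresponds to step 555 in the text
    DISABLE_HIGH_PERF = "disable"  # corresponds to step 570 in the text


def decide_from_primary_power(primary_level, low_power_threshold):
    """Step 540: a primary power level below the threshold means a
    low power condition exists, so the high performance graphics
    processors are disabled; otherwise they are enabled."""
    if primary_level < low_power_threshold:
        return Action.DISABLE_HIGH_PERF
    return Action.ENABLE_HIGH_PERF


def decide_from_supplemental_power(power_detected):
    """Step 550: enable when a supplemental power supply is detected,
    disable otherwise."""
    if power_detected:
        return Action.ENABLE_HIGH_PERF
    return Action.DISABLE_HIGH_PERF
```

A driver loop built on this sketch would evaluate the chosen check, apply the returned action, and then return to step 530 as described above.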
The graphics processing performance and power consumption for a computing system may be optimized to deliver the highest possible graphics performance for the lowest possible power consumption. When multiple graphics processing units are included within a computing system, particularly a portable system such as a laptop computer, palm-sized computer, tablet computer, game console, cellular telephone, hand-held device, or the like, one or more of the multiple graphics processing units may be enabled or disabled as needed to deliver a particular graphics processing performance or to adapt to a particular power environment.
The invention has been described above with reference to specific embodiments. Persons skilled in the art will recognize, however, that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims. Specifically, the methods and systems described may be used for processing data other than graphics data where the data is used by processors in a multi-processing data processing system. The foregoing description and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. The listing of steps in method claims does not imply performing the steps in any particular order, unless explicitly stated in the claim.
All trademarks are the property of their respective owners.