Embodiments of the invention relate to the field of power management, in particular, to scaling power consumption by a graphics controller based on events such as demand and load for example.
Over the last few years, advances in semiconductor technology have resulted in improved graphics controllers that operate at higher frequencies and support additional and/or enhanced features. While these advances have enabled hardware manufacturers to design and build faster and more sophisticated graphics cards and computers, they also pose a disadvantage to battery-powered laptop and handheld computers. In particular, these battery-powered computers consume more power and dissipate more heat as a by-product than computers of past generations.
Within a graphics memory controller hub, for example, the graphics core is one of the major functional blocks and has a large gate count. Hence, power consumption by the graphics memory controller hub is primarily correlated to the voltage and frequency applied to the graphics core. Namely, as the graphics core voltage increases, the power consumed by the graphics memory controller hub increases as well. Since utilization of the graphics core can vary significantly from application to application, computers waste power when supplying high voltages and high-frequency signaling to the graphics core to process applications having minimal graphics. This unnecessarily reduces the battery life of laptop and hand-held computers and causes these computers to operate at unnecessarily high temperatures.
The invention may best be understood by referring to the following description and accompanying drawings that are used to illustrate embodiments of the invention.
In general, various embodiments of the invention describe an apparatus and method for controlling power consumption by an electronic device through both voltage and frequency adjustment. As one embodiment, this voltage and frequency control is applied to a graphics memory controller hub (GMCH).
The following detailed description is presented largely in terms of block diagrams and flowcharts to collectively illustrate embodiments of the invention. Well-known circuits or process operations are not discussed in detail to avoid unnecessarily obscuring the understanding of this description.
Certain terminology is used to describe certain features of the invention. For example, a “computing device” may be any electronic product having a graphics memory controller hub such as a computer (e.g., desktop, laptop, hand-held, server, mainframe, etc.), or perhaps a set-top box, consumer electronic equipment (e.g., television), game console, or the like.
Normally, the computing device comprises internal logic, namely hardware, firmware, software module(s) or any combination thereof. A “software module” is a series of instructions that, when executed, performs a certain function. Examples of a software module include an operating system, an application, an applet, a program or even a routine. One or more software modules may be stored in a machine-readable medium, which includes but is not limited to an electronic circuit, a semiconductor memory device, a read only memory (ROM), a flash memory, a type of erasable programmable ROM (EPROM or EEPROM), a floppy diskette, a compact disk, an optical disk, a hard disk, or the like. The terms “logic High” and “asserted” (or any tense thereof) mean placement of a signal into a first state, perhaps above or below a certain voltage. The terms “logic Low” and “deasserted” (or any tense thereof) mean placement of a signal into a second state different from the first state.
I. General Architecture
Referring to
Herein, processor 110 may be a microprocessor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a microcontroller or the like. GMCH 140 is coupled to processor 110 via bus 150 (e.g., front-side bus) to receive information to be processed and subsequently stored in memory 130 or displayed on a display unit 180 associated with computing device 100. Display unit 180 may be an integral component of computing device 100 or a peripheral device separate from and external to computing device 100 as shown.
Clock generator 120 is situated internally within computing device 100. However, it is contemplated that clock generator 120 may be located external to computing device 100. Clock generator 120 provides a first clock (HOST_CLK) signal 190 to processor 110 and one or more clock signals to GMCH 140. For example, the HOST_CLK signal 190 and a secondary clock (CLK2) signal 195 may be supplied to GMCH 140, where the CLK2 signal 195 has a lower frequency than the HOST_CLK signal 190. In one embodiment, HOST_CLK and CLK2 signals 190 and 195 may have frequencies of approximately 66 megahertz (66 MHz) and 48 MHz, respectively.
II. Embodiments of the Graphics Memory Controller Hub
Referring now to
As shown in
Frequency divider circuit 220, when implemented, adjusts the frequency of HOST_CLK signal 190 and passes the adjusted clock signal 265 to Core PLL circuit 255. Core PLL circuit 255 generates a programmable, rendering clock (CRCLK) signal 270 based on the incoming adjusted clock signal 265 and provides the CRCLK signal 270 to graphics core 200 for clocking purposes.
Referring still to
It is contemplated that activity control circuit 235 may be configured to control other operational behaviors of the computing device besides clocking frequency or applied voltage to graphics core 200. Examples include clock speed ratios, clock throttling percentages, refresh rates, backlight brightness and the like. However, for illustrative purposes only, frequency and voltage adjustment is discussed.
If Core PLL circuit 255 supports only two different clock frequencies, state sequencer 240 provides an asserted control signal 241 to Core PLL circuit 255 to select the higher frequency clock signal (referred to as the “‘fast’ frequency signal”). Otherwise, a deasserted control signal is provided to Core PLL circuit 255 to select the lower frequency clock signal (referred to as the “‘slow’ frequency signal”). If Core PLL circuit 255 supports more than two different clock frequencies, state sequencer 240 may be adapted to provide multiple control signals that together correspond to one of a plurality of clock frequencies for graphics core 200. For example, two control signals (00, 01, 10, 11) may support four different clocks of varying frequencies.
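By way of illustration only, the control-signal encoding described above may be sketched as follows. The function name `select_clock` and the four frequency values are hypothetical and are not taken from this description; the sketch merely shows that n control signals can address up to 2^n clock frequencies.

```python
def select_clock(control_bits, frequencies_mhz):
    """Map control-signal bits (most significant first) to one clock frequency.

    Illustrative only: n control signals can address up to 2**n clocks.
    """
    if len(frequencies_mhz) > 2 ** len(control_bits):
        raise ValueError("not enough control signals for the available clocks")
    index = 0
    for bit in control_bits:
        index = (index << 1) | (1 if bit else 0)
    if index >= len(frequencies_mhz):
        raise ValueError("control value does not select a supported clock")
    return frequencies_mhz[index]


# Hypothetical frequency table in MHz, ordered slowest to fastest.
CLOCKS_MHZ = [100, 133, 200, 266]

# One control signal: deasserted selects "slow", asserted selects "fast".
slow = select_clock((0,), CLOCKS_MHZ[:2])
fast = select_clock((1,), CLOCKS_MHZ[:2])

# Two control signals (00, 01, 10, 11) select among four clocks.
four_way = [select_clock((a, b), CLOCKS_MHZ) for a in (0, 1) for b in (0, 1)]
```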
Upon altering the frequency of graphics core 200, state sequencer 240 also provides a control signal 242 to voltage regulator control circuit 250, which signals the voltage regulator 170 of
Referring to
III. An Embodiment of an Activity Control Circuit
A. Exemplary Logic of the Activity Control Circuit
Referring now to
In one embodiment of the invention, in order to maintain a balance between power consumption and performance, activity control circuit 235 supports frequency switching of the CRCLK signal 270 used by the GMCH 140 of
In particular, for one embodiment of the invention, the CRCLK signal 270 is configured to switch from a “fast” frequency (Ff) to a “slow” frequency (Fs, where Ff>Fs) when a specific threshold (Tf2s) of idleness is met. This threshold, referred to as a “fast-to-slow (F2S) state threshold” may be static or programmably set during a power-up condition by the Basic Input Output System (BIOS) of the computing device or perhaps by accessing contents of a particular memory location or register. This threshold may be represented as a bit value (e.g., 32-bit value) as presented in
The CRCLK signal is also configured to switch from slow to fast frequencies when the level of activity increases so as to exceed a specific activity threshold, referred to as a “slow-to-fast (S2F) state threshold” (Ts2f). Similarly, the S2F state threshold may be preset and represented by a bit value (e.g., 32-bit value). Only the CRCLK signal frequency can change dynamically on demand; all other clocks remain unchanged after boot.
Besides measured idleness, other frequency switching events may be triggered through software control. For example, the discontinuation of AC power (e.g., disconnection of a connector line through removal of its AC plug from a power socket) can be detected and cause a software routine to switch the CRCLK signal to a slower frequency setting for longer battery life.
To reduce unnecessary switching back and forth under the same load, hysteresis can be provided. One way that this can be accomplished is by ensuring that the level of busyness required to initiate a high-to-low frequency transition is substantially lower than the level of busyness it takes to initiate a low-to-high frequency transition. The relationship between frequency, system power, and hysteresis is given by equation (1):
As shown in
More specifically, “DSTATE” signifies the desired frequency state level for the electronic device. The value of DSTATE may be stored as a bit of an Idle Status Page (ISP) register 390 (e.g., ISP[1]), perhaps located in memory (not shown) of the GMCH as shown in
When determined by hardware, in response to setting of a Hardware Enable Transition bit of the Idle Control and Status (ICS) register 395 of
In one embodiment of the invention, “CSTATE” signifies the current state of the CRCLK signal. The value of CSTATE may be stored as a bit of the ISP register 390 (e.g., ISP[0]). CSTATE is asserted to select the CRCLK signal at the “fast” frequency. CSTATE is deasserted to select the “slow” frequency. After all domain activity is stalled, the frequency of CRCLK will be switched (from fast to slow for example) by assigning the DSTATE value as the CSTATE value.
When asserted, “HW_CH_FREQ” signifies a hardware determined change frequency event (“fast” to “slow” or “slow” to “fast”). HW_CH_FREQ is based on idle monitor operations. For instance, when the CRCLK signal is operating at the fast frequency (CSTATE=logic “High”) and the sampled idle count is greater than the F2S state threshold (Tf2s), HW_CH_FREQ is asserted (HW_CH_FREQ=logic “High”). When the CRCLK signal is operating at the slow frequency (CSTATE=logic “Low”) and the sampled idle count is less than the S2F state threshold (Ts2f), HW_CH_FREQ is asserted.
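The HW_CH_FREQ conditions above, together with the hysteresis they provide, may be sketched as follows. This is a minimal illustration and not the circuit itself: the threshold values, the sample sequence, and the helper name `track_state` are hypothetical, and the sketch assumes that idle counts sampled in the fast and slow states are expressed on a common scale.

```python
def hw_ch_freq(cstate, idle_count, t_f2s, t_s2f):
    """Return True when a hardware-determined frequency change is indicated."""
    if cstate:                      # CRCLK at the "fast" frequency
        return idle_count > t_f2s   # idle enough to drop to "slow"
    return idle_count < t_s2f       # busy enough to return to "fast"


def track_state(idle_samples, t_f2s, t_s2f, cstate=True):
    """Apply each sampled idle count in turn and record the resulting state."""
    states = []
    for idle_count in idle_samples:
        if hw_ch_freq(cstate, idle_count, t_f2s, t_s2f):
            cstate = not cstate
        states.append(cstate)
    return states


# Hypothetical thresholds: the gap between them is the hysteresis band.
T_F2S = 800   # idle count above which fast -> slow is requested
T_S2F = 300   # idle count below which slow -> fast is requested

# Idle counts of 400 to 500 fall inside the band and never cause a switch.
samples = [100, 500, 900, 500, 400, 200, 500]
states = track_state(samples, T_F2S, T_S2F)
```

Because the two thresholds differ, a steady load that produces idle counts between them leaves the current state unchanged, which avoids switching back and forth under the same load.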
“CH_FREQ_REQ” is generally based on either hardware or software prompted events and thermal readings. When CH_FREQ_REQ is asserted, it signifies a change frequency request (from fast to slow or from slow to fast) made to the frequency switching unit 380. The CH_FREQ_REQ value is determined in response to two criteria. The first criterion determines whether a hardware or software request will be served. Such determination is based on the EN_HW_TRAN value (EN_HW_TRAN is asserted when a hardware determined change frequency request will be served).
The second criterion determines whether a change frequency request is needed based on the sensed temperature (TRR[7:0]) of the GMCH, which may be stored in the ISP register 390 (e.g., ISP[9:2]). For this embodiment of the invention, a THERMALHOT parameter is set to logic “1” when the sensed temperature is above a given threshold. However, even if the sensed temperature is above the given threshold, the CRCLK signal is still permitted to change from a “fast” frequency to a “slow” frequency. Otherwise, where CSTATE is deasserted and both a SWITCH_TO_SLOW_IF_HOT parameter (e.g., IC[29]) and the THERMALHOT parameter are asserted, CH_FREQ_REQ is deasserted, signifying no change in frequency level. In summary, the second criterion is used to prevent a slow-to-fast frequency switch transition if the sensed temperature of the GMCH is above a set threshold.
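The two criteria may be summarized in a short sketch. This is one simplified reading of the description, under the assumption that a hot GMCH only blocks a slow-to-fast request and otherwise leaves the selected request untouched; the function and argument names are hypothetical.

```python
def ch_freq_req(en_hw_tran, hw_request, sw_request, cstate, thermal_hot,
                switch_to_slow_if_hot=True):
    """Combine the request-source and thermal criteria into CH_FREQ_REQ.

    Assumed reading: a change request made while CSTATE is deasserted
    ("slow") is a slow-to-fast request, and it is suppressed when hot.
    """
    # First criterion: serve the hardware or the software request.
    request = hw_request if en_hw_tran else sw_request
    # Second criterion: block slow-to-fast transitions while hot.
    blocked = thermal_hot and switch_to_slow_if_hot and not cstate
    return bool(request) and not blocked


# Hot and slow: a hardware request to speed up is suppressed.
hot_slow = ch_freq_req(True, True, False, cstate=False, thermal_hot=True)
# Hot and fast: a request to slow down is still honored.
hot_fast = ch_freq_req(True, True, False, cstate=True, thermal_hot=True)
# Cool and slow: the same hardware request is honored.
cool_slow = ch_freq_req(True, True, False, cstate=False, thermal_hot=False)
```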
As shown in
For example, as shown in
Referring back to
Idle monitor 310 comprises an idle counter 320 and an interval counter 340. Idle counter 320 includes a plurality of inputs 321–323. For this embodiment of the invention, a Clear (CLR) input 321 causes idle counter 320 to be reset upon receipt of a reset signal (IMrst) at power-up. The Clock (CLK) input 323 allows the idle counter 320 to be clocked by the CRCLK signal 270. The Enable (EN) input 322, upon receipt of an asserted signal, causes the idle counter 320 to begin counting. As shown, Enable input 322 is coupled to a first logic gate 370 (e.g., AND gate), which begins the counting process when the GIDLE signal 309 is asserted and an output from combinatorial logic unit 330 is asserted.
As shown, combinatorial logic unit 330 includes a second logic gate 371 (e.g., AND gate functionality) having a first input to receive a value of an idle monitor enable bit of the ICS register (ICS[31]) from software and a second input coupled to a collection of logic gates 372 and 373 (e.g., AND gate 372 and inverter 373). Combinatorial logic unit 330 outputs an active signal in response to (i) interval counter 340 not having passed zero during a count-down sequence (the most significant bit of interval counter 340, “IntMSB”, is deasserted, i.e., “0”) and (ii) the Start Count (START_CNT) signal having been asserted to begin the count sequence.
Idle counter 320 further includes an output 324 to transfer an idle count into ISP register 390. For this embodiment, the output is a binary value stored within multiple bits of the ISP register 390 (e.g., ISP[30:10]).
In addition, interval counter 340 includes a plurality of inputs 341–344. For this embodiment, interval counter 340 is clocked by CRCLK signal, which is provided to a Clock (CLK) input 341. A Load (LOAD) input 342 causes interval counter 340, upon reset, to be loaded with one of two values via Data-In (DIN) input 343. One value, namely either a fast state interval (If) 345 or a slow state interval (Is) 346, is output by a select element 360 based on the value of CSTATE 347, the current state at which GMCH's CRCLK signal is running. The Enable (EN) input 344, upon receipt of an asserted signal, causes interval counter 340 to begin counting. As shown, EN input 344 is coupled to combinatorial logic unit 330.
Once interval counter 340 counts past zero, an output (e.g., an integer value of the most significant bit “IntMSB”) 348 is asserted. This causes both idle counter 320 and interval counter 340 to be halted since the output from logic gate 371 is deasserted for this embodiment. The idle count produced by idle counter 320 is then sampled along with the values of CSTATE 347. The idle count is compared with either the S2F state threshold (Ts2f) 351 or the F2S state threshold (Tf2s) 352 output by a select element 361 controlled by CSTATE 347.
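The interaction of the two counters over one sampling window may be sketched as follows. The sketch is behavioral only: the per-cycle GIDLE trace and the interval values are hypothetical, and the interval counter is modeled as a signed integer whose negative value stands in for an asserted IntMSB.

```python
def sample_idle_count(gidle_per_cycle, cstate, interval_fast, interval_slow):
    """Count idle cycles until the interval counter counts past zero.

    Returns (idle_count, window_done). In this sketch an interval value of
    N spans N + 1 clock cycles, because the counter must pass zero.
    """
    # Select element 360: load the interval that matches the current state.
    interval_counter = interval_fast if cstate else interval_slow
    idle_count = 0
    for gidle in gidle_per_cycle:
        if interval_counter < 0:    # IntMSB asserted: both counters halt
            break
        if gidle:                   # idle counter enabled only while idle
            idle_count += 1
        interval_counter -= 1
    window_done = interval_counter < 0
    return idle_count, window_done


# Hypothetical GIDLE trace, one entry per CRCLK cycle (1 = idle).
trace = [1, 0, 1, 1, 0, 1, 1, 1]
fast_window = sample_idle_count(trace, True, interval_fast=3, interval_slow=5)
slow_window = sample_idle_count(trace, False, interval_fast=3, interval_slow=5)
```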
When sampled idle count exceeds or perhaps is equal to the selected state threshold value and CSTATE 347 is asserted, the output signal (MSB) 353 is deasserted and routed to a logic gate 374 (e.g., Exclusive OR “XOR” gate). The output of logic gate 374 is equivalent to the HW_CH_FREQ signal 365, namely, as shown in equation (2):
HW_CH_FREQ = CSTATE ⊕ MSB. (2)
If the frequency state is determined by the hardware of the GMCH, EN_HW_TRAN (ICS[30]) is asserted so that MSB 353 from subtractor 350 is output from select element 362, and thus, is equal to the value of DSTATE 391. The HW_CH_FREQ signal 365 is routed via select element 363 into select element 364.
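The hardware path may be illustrated with a small sketch in which the subtractor output is modeled as the sign bit of the difference between the sampled idle count and the selected threshold. The function names and numeric values are hypothetical; only the relationships MSB = DSTATE and HW_CH_FREQ = CSTATE ⊕ MSB are taken from the description.

```python
def subtractor_msb(idle_count, threshold):
    """Sign bit of (idle_count - threshold): 1 when the count is below it."""
    return 1 if idle_count - threshold < 0 else 0


def hardware_decision(cstate, idle_count, t_f2s, t_s2f):
    """Return (DSTATE, HW_CH_FREQ) for one sampled idle count."""
    threshold = t_f2s if cstate else t_s2f   # threshold selected by CSTATE
    msb = subtractor_msb(idle_count, threshold)
    dstate = msb                             # EN_HW_TRAN asserted: DSTATE = MSB
    hw_ch_freq = cstate ^ msb                # XOR of CSTATE and MSB
    return dstate, hw_ch_freq


# Hypothetical thresholds and idle counts.
fast_and_idle = hardware_decision(1, 900, t_f2s=800, t_s2f=300)
fast_and_busy = hardware_decision(1, 500, t_f2s=800, t_s2f=300)
slow_and_busy = hardware_decision(0, 200, t_f2s=800, t_s2f=300)
slow_and_idle = hardware_decision(0, 500, t_f2s=800, t_s2f=300)
```

In each case HW_CH_FREQ is asserted exactly when the desired state differs from the current state.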
If the sensed temperature of the GMCH, stored as a thermal value in bits of the ISP register (e.g., ISP[9:2]), does not exceed a particular threshold, the CH_FREQ_REQ signal 381 is asserted and applied to frequency switching unit 380 if the CRCLK signal is currently operating at a “fast” frequency. Alternatively, the CH_FREQ_REQ signal 381 is deasserted if the CRCLK signal is currently operating at a “slow” frequency.
If the sensed temperature exceeds a given threshold, the THERMALHOT signal is asserted and the control signal of the select element 364 is asserted. Thus, if the current frequency of the CRCLK signal is at a “slow” frequency level (CSTATE=logic “Low”), the CH_FREQ_REQ signal is deasserted. However, if the current frequency of the CRCLK signal is at a “fast” frequency level (CSTATE=logic “High”), the CH_FREQ_REQ signal 381 is asserted to allow for a reduction in CRCLK frequency.
If the frequency state is determined by software, EN_HW_TRAN (e.g., ICS[30]) is deasserted so that an opposite state of SWRST2S sets the value of DSTATE 349, caused by logic gate 375. Frequency switching unit 380 is then effectively set by the XOR result of SWRST2S and CSTATE as provided by logic gate 376. The same temperature sensing override is provided by logic gate 377.
B. Exemplary Operations of Activity Control Circuit
Referring now to
When START_CNT signal is asserted, both the idle and interval counters will start counting (blocks 515 and 520). GIDLE is asserted for every cycle of CRCLK signal while interval counter is counting down from the first preset time interval. Once the interval counter counts past zero, a control signal is asserted, causing both counters to be stopped (blocks 525 and 530). The idle counter value, CSTATE, DSTATE and thermal value identifying the current temperature of the GMCH are sampled (block 535).
If DSTATE is determined by hardware (e.g., EN_HW_TRAN is asserted), the value routed over the HW_CH_FREQ signal is equivalent to CSTATE ⊕ DSTATE (blocks 540 and 545). Thus, if DSTATE differs from CSTATE and the sensed temperature of the GMCH is less than a particular threshold, the CRCLK signal of the GMCH will undergo a frequency switching operation that is transparent to the user (blocks 550 and 560). However, if the sensed temperature is greater than the particular threshold, the frequency switching operation may still occur if the desired transition is to a lower frequency (blocks 550 and 555). If the desired transition is to a higher frequency, no frequency switching operation will occur (blocks 560 and 565).
If DSTATE is determined by software (e.g., EN_HW_TRAN is deasserted), the value of the software change frequency signal (SWCHFREQ) is equivalent to equation (3) as shown in block 570:
SWCHFREQ = [CSTATE ⊕ SWRST2S]#, where “#” represents an inversion of the XOR result. (3)
Thus, if the sensed temperature of the GMCH is less than a particular threshold, the CRCLK signal of the GMCH will undergo a frequency switching operation that is transparent to the user (blocks 555 and 575). However, if the sensed temperature is greater than the particular threshold, the frequency switching operation may still occur if the desired transition is to a lower frequency and SWCHFREQ is asserted (block 580). If the desired transition is to a higher frequency, no frequency switching operation will occur (block 565).
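For the software-determined path, the sketch below tabulates equation (3) alongside the desired state, taking DSTATE as the opposite state of SWRST2S as described earlier. It is a truth-table illustration only, with a hypothetical function name, and it omits the thermal override.

```python
def software_path(cstate, swrst2s):
    """Return (DSTATE, SWCHFREQ) for single-bit CSTATE and SWRST2S values."""
    dstate = swrst2s ^ 1                 # DSTATE is the opposite of SWRST2S
    swchfreq = (cstate ^ swrst2s) ^ 1    # equation (3): inverted XOR
    return dstate, swchfreq


# Enumerate all four input combinations.
table = {(c, s): software_path(c, s) for c in (0, 1) for s in (0, 1)}

# The inverted XOR is asserted exactly when CSTATE and DSTATE differ.
consistent = all(swch == (c ^ d) for (c, _), (d, swch) in table.items())
```

The check shows that the inverted XOR of CSTATE and SWRST2S is asserted precisely when the current state differs from the desired state, which parallels the hardware path.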
IV. Exemplary Operations of Frequency Switching Unit
Referring now to
After the frequency switching unit has been enabled, the operations of certain hardware (e.g., command parser, etc.) can be halted to allow render hardware to be idle, provided the rendering hardware temporarily continues operations until processing of the pending commands has been completed (blocks 615 and 620). Thereafter, the CRCLK signal is gated (block 625).
After the CRCLK signal has been gated, the frequency switching unit updates the value of CSTATE by assigning the value of DSTATE to CSTATE (block 630). Thereafter the CRCLK signal is ungated and the certain hardware and render engine continue operations (blocks 635 and 640). This allows the frequency switching unit to service the next Change Frequency request when initiated. Of course, in lieu of gating the CRCLK signal, smooth transitioning from one clock frequency to another may be accomplished through other means (e.g., Wait states). If the CRCLK signal is not gated, rendering is still possible during the frequency switch because it is not necessary to wait for pending commands to be completed.
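The gated switching sequence described above may be sketched as an ordered series of steps. The class below is a behavioral illustration only; its name, attributes, and log strings are hypothetical, and the draining of pending commands is reduced to a simple counter.

```python
class FrequencySwitchingUnit:
    """Behavioral sketch of the gated CRCLK frequency switch."""

    def __init__(self, cstate, pending_commands=0):
        self.cstate = cstate
        self.pending_commands = pending_commands
        self.parser_halted = False
        self.clock_gated = False
        self.log = []

    def switch(self, dstate):
        # Halt the command parser so no new work reaches the render hardware.
        self.parser_halted = True
        self.log.append("halt command parser")
        # Let the render hardware finish the commands already pending.
        while self.pending_commands > 0:
            self.pending_commands -= 1
        self.log.append("render hardware idle")
        # Gate the clock, update the state, then ungate the clock.
        self.clock_gated = True
        self.log.append("gate CRCLK")
        self.cstate = dstate
        self.log.append("CSTATE <- DSTATE")
        self.clock_gated = False
        self.log.append("ungate CRCLK")
        # Resume normal operation; the unit can serve the next request.
        self.parser_halted = False
        self.log.append("resume operations")
        return self.cstate


unit = FrequencySwitchingUnit(cstate=1, pending_commands=3)
new_state = unit.switch(dstate=0)
```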
Referring to
The synchronizer logic 710 enables frequency switching on the fly without the need for PLL circuit re-locks and clock glitches on the clock line. Such frequency switching occurs from a clock signal (CHCLK) having a fast frequency (Ff) to a memory clock frequency (MCLK) and from the memory clock frequency (MCLK) to a clock signal (CLCLK) having a slow frequency (Fs). In general, it provides a continuous sampling scheme to allow deterministic transfer of data between cross clocked logic.
More specifically, as shown in
Each sampling circuit samples edges of two incoming signals to generate a transmit (XMIT) signal 840 and a receive (RCV) signal 841. This may be accomplished by a first sampling circuit 811 sampling rising edges of both CHCLK signal 830 and lagging MCLK signal 831 to compute an asserted portion of XMIT signal 840. The sampling of falling edges of CHCLK and MCLK signals 830, 831 may be used to compute deasserted portion of RCV signal 841. Similarly, second sampling circuit 812 performs edge sampling of both MCLK signal 831 and lagging CLCLK signal 832.
In response to a deasserted CSTATE value, a transition occurs from a slow frequency (Fs) to an intermediary frequency associated with MCLK (Fm, where Fm>Fs). For this embodiment of the invention, the transition may be accomplished by clocking graphics core 200 with CLCLK 832 and synchronizer control logic 800 controlling the latching of data by flip-flops 861 and 862 of synchronizer 850. A first flip-flop 861 is clocked with CLCLK 832 and a second flip-flop 862 is clocked by MCLK 831. Moreover, both XMIT and RCV signals 840 and 841 control the propagation of data through select elements 870–871 from graphics core 200 to memory controller 225.
For transition from a fast frequency (Ff) to the memory controller frequency (Fm, where Ff>Fm), the transition may be accomplished by clocking first flip-flop 861 with CHCLK and second flip-flop 862 with MCLK 831. Again, XMIT and RCV control signals 840 and 841 control the propagation of data through select elements 870 and 871 from graphics core 200 to memory controller 225. A Bypass signal 880 controls select element 872 to allow data to bypass first flip-flop 861.
For the GMCH, multiple clock domains are used. The clock domain frequencies vary from interface to interface, with no simple ratio between these frequencies. In order to allow deterministic transfer between logic that runs in different frequency domains, synchronizer logic 710 has been developed. A multiplexer is placed in front of a flip-flop. The synchronizer control logic samples the clock edges and generates the XMIT and RCV signals based on the timing margin between CLCLK and MCLK or CHCLK and MCLK (excluding set-up time). The synchronizer 810 then uses these control signals. When a transmit is permitted, the XMIT signal is asserted and the data passes through the multiplexer to the second flip-flop. The same applies to the receive side.
Referring now to
For instance, a first software module 910 may increase or decrease the frequency of the rendering clock (CRCLK) and the voltage applied to the graphics core based on activity (e.g., idleness of the render engine). A second software module 915 may alter frequency and voltage based on battery power levels and whether the computing device is coupled to an AC power outlet. The frequency of the rendering clock is lowered in response to reduced power levels measured for one or more batteries of the computing device. A third software module 920 may alter frequency and voltage based on temperatures measured within the casing surrounding logic of the computing device or measured at certain hardware components of the computing device. The frequency of the rendering clock is lowered in response to thermal readings above predetermined thermal constraints set by either the user or the manufacturer.
As shown in
Herein, for a certain embodiment of the invention, there are two general types of policies for controlling the operational behavior of the computing device: proactive and reactive. Proactive policy assumes one policy over another. For instance, if the user indicates a preference for maximum battery life over performance, the software may proactively reduce power (e.g., reduce frequency of the rendering clock) without receiving a signal from the GMCH. Reactive policy involves a response to an event such as the removal of an AC connector and balancing user preferences.
These policies can be set by the user through a graphics user interface 1000 generated by the computing device as shown in
In one embodiment of the invention, the adaptive policy is designed to conduct transitions in operation based on processed demand and trends in demand. The “demand” may be computed based on instantaneous measurements of an event (e.g., idleness, demand, temperature) as well as trends (e.g., the combination of a current data sample associated with an event along with one or more previous data samples) or historical averages. In addition, adaptive policy may include computations of the cost of making a transition (e.g., number of megabits per second gained for each watt of power).
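One way in which a trend might be computed from a current data sample and previous data samples is sketched below. The blending rule, the class name `DemandTracker`, the history depth, and the weighting are all assumptions made for illustration; the description does not prescribe a particular formula.

```python
from collections import deque


class DemandTracker:
    """Blend the current sample with the average of recent samples."""

    def __init__(self, history=4, current_weight=0.5):
        self.samples = deque(maxlen=history)   # previous data samples
        self.current_weight = current_weight   # weight of the newest sample

    def update(self, sample):
        """Record a sample (e.g., busy fraction) and return processed demand."""
        if self.samples:
            past = sum(self.samples) / len(self.samples)
            demand = (self.current_weight * sample
                      + (1.0 - self.current_weight) * past)
        else:
            demand = float(sample)             # no history yet
        self.samples.append(sample)
        return demand


tracker = DemandTracker(history=2, current_weight=0.5)
first = tracker.update(0.25)    # no history: demand equals the sample
second = tracker.update(0.75)   # blended with the previous sample
```

A policy built on such a value reacts less abruptly to a single spike in demand than one driven by instantaneous measurements alone.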
While this invention has been described in terms of several illustrative embodiments, this description is not intended to be construed in a limiting sense. Various modifications of the illustrative embodiments, as well as other embodiments of the invention, are deemed to lie within the spirit and scope of the appended claims.
Number | Date | Country |
---|---|---|
20030210247 A1 | Nov 2003 | US |