Processor Temperature Measurement Through Median Sampling

Information

  • Patent Application
  • 20080125915
  • Publication Number
    20080125915
  • Date Filed
    November 07, 2006
  • Date Published
    May 29, 2008
Abstract
Temperature readings obtained within a computer system from the location of monitored circuit elements may be oversampled at least three times, and a median average of the three parameter readings rather than the arithmetic mean may be used for controlling a device, e.g. a fan, configured to regulate the environmental parameter, e.g. temperature, at the location of the monitored circuit elements. For example, when a CPU temperature reading is requested by the system comprising the CPU, a thermal monitoring system may acquire at least three consecutive temperature readings of the CPU, discard the highest temperature reading and the lowest temperature reading, and return the median reading to be used in controlling a fan configured to regulate temperature at the location of the CPU, resulting in more accurate temperature readings and more accurate fan control. In various implementations, more than three readings may be considered at a time, and running averages based on median values may be computed in a variety of ways to obtain a temperature control value to control the fan.
Description
BACKGROUND OF THE INVENTION

1. Field of the Invention


This invention relates generally to the field of temperature measurement in electronics and computer systems, and, more particularly, to the design of temperature measurement devices to obtain accurate temperature readings for controlling the rotational speed of cooling fans.


2. Description of the Related Art


Many digital systems, especially those that include high-performance, high-speed circuits, are prone to operational variances due to temperature effects. Devices that monitor temperature and voltage are often included as part of such systems in order to maintain the integrity of the system components. Personal computers (PCs), signal processors and high-speed graphics adapters, among others, typically benefit from such temperature monitoring circuits. For example, a central processor unit (CPU) that typically “runs hot” as its operating temperature reaches high levels may require a temperature sensor in the PC to ensure that it doesn't malfunction or break due to thermal problems. Accurate temperature measurement of processors is therefore necessary when using high performance, high current processors in PCs and/or servers.


Often, integrated circuit (IC) solutions designed to measure temperature in a system will monitor the voltage across one or more PN-junctions, for example a diode or multiple diodes at different current densities to extract a temperature value. Temperature-to-digital conversion for IC-based temperature measuring solutions is often accomplished by measuring a difference in voltage across the terminals of a diode when different current densities are forced through the PN junctions of the diode. The resulting change (ΔVBE) in the base-emitter voltage (VBE) between the diodes is generally proportional to temperature.
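
For reference, the proportionality mentioned above is usually written with the ideal diode relation. This is the textbook form and is not stated in this application. The symbols are notation assumed here: \eta is the diode ideality factor, k is the Boltzmann constant, q is the electron charge, T is the absolute temperature in kelvin, and N is the ratio of the two forced current densities.

```latex
\Delta V_{BE} = V_{BE,2} - V_{BE,1} = \eta \, \frac{k T}{q} \, \ln N ,
\qquad
T = \frac{q \, \Delta V_{BE}}{\eta \, k \, \ln N}
```

Because \eta, k, q, and N are fixed for a given measurement setup, the measured \Delta V_{BE} scales linearly with T.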


Some IC manufacturers began incorporating temperature monitoring and the monitoring of other possible environmental variables within the ICs themselves, oftentimes providing measurement readings to other system components. Those other system components would then typically use these readings to control selected environment variables through a variety of means. One example of such built-in monitoring is Intel Corporation's PECI (Platform Environment Control Interface), which includes a digital bus designed, among other things, to carry Intel CPU temperature readings to an environmental monitor or fan controller.


Fans are often used to evacuate warm air from enclosures in which electronic systems are contained. For example, most computer systems include one or more cooling fans to aid in circulating the air inside the enclosures and for maintaining the temperature inside the enclosures within an acceptable range. The increased airflow provided by fans typically aids in eliminating waste heat that may otherwise build up and adversely affect system operation. Employing cooling fans is especially helpful in ensuring proper operation for CPUs with relatively high operating temperatures.


Control of fans in a system typically involves a fan control unit executing a fan control algorithm. A fan control algorithm may determine the method for controlling one or more fans that are configured to evacuate warm air from a system enclosure. For example, the fan control algorithm may specify that a fan's speed should be increased or decreased dependent upon a detected temperature, which in case of processor cooling could mean the processor temperature. Such control algorithms may also involve turning off a fan if the temperature is deemed cool enough to do so. For detecting the temperature of a processor, a temperature sensor may provide to the fan control unit a signal indicative of the current temperature of the processor.


Because a fan operating at high speed can typically be noisy, it is generally desirable to run the fan as slowly as possible, while maintaining the processor temperature in a safe zone. An accurate temperature reading is therefore necessary to make sure the fan cooling is effective. However, it may be difficult to obtain accurate measurements using remote temperature readings. For example, accuracy can be a problem with Intel Corporation's PECI bus. Periodically, PECI readings may fail without an error indication, returning temperatures that diverge widely from the correct temperature. More specifically, one issue with PECI is that isolated temperature readings may be very much larger or smaller than the actual temperature.


There are at least a couple of ways in which the inaccuracy of temperature measurements for fan controllers has been addressed. One common method is to take the arithmetic mean of a number of temperature readings before returning a reading to a fan controller. However, even if the arithmetic mean of a series of readings is used, an outlier reading may still distort the temperature reading. For example, if a system were at a constant 50 degrees and there were 4 readings of 50 degrees and one reading of 100 degrees, the arithmetic mean would be 60 degrees, or 20% above the actual temperature. Fan speed would be increased to compensate, even though the increase was not truly necessary. A running average, where the arithmetic mean is used over a specified number of readings, would be affected even more, since the one outlier would continue to influence the running average for as long as the outlier remained among the readings included in the average.
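
The arithmetic in the example above can be checked with a short sketch. It is illustrative only. The variable names are assumptions, and the median is shown alongside the mean for comparison.

```python
from statistics import mean, median

# Readings from the example above: four at 50 degrees and one spurious 100.
readings = [50, 50, 50, 50, 100]
actual_temperature = 50

mean_value = mean(readings)      # (4 * 50 + 100) / 5
median_value = median(readings)  # middle element of the sorted readings

# Percentage by which the arithmetic mean overstates the actual temperature.
mean_error_pct = 100 * (mean_value - actual_temperature) / actual_temperature

print(f"mean={mean_value}, median={median_value}, mean error={mean_error_pct}%")
```

The mean comes out at 60 degrees, 20% high, while the median of the same five readings stays at 50.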


Another method typically used to compensate for the inaccuracy of temperature measurements is to control the rate at which fan speed may change over a period of time, such that a sudden increase in apparent CPU temperature will not cause a sudden increase in fan speed. Slowing the change rate of fan speed may typically mitigate the effect of a sudden apparent temperature increase or decrease. However, an extreme reading will still cause a fan to ramp up or ramp down, and so it may still unnecessarily increase noise levels.
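
A minimal sketch of this rate-limiting approach is shown below. The function name, the percentage units, and the step size of 10 are hypothetical choices made for illustration and are not taken from any particular fan controller.

```python
def step_fan_speed(current_pct, target_pct, max_step_pct):
    """Move the fan speed toward the target by at most max_step_pct per update."""
    delta = target_pct - current_pct
    delta = max(-max_step_pct, min(max_step_pct, delta))
    return current_pct + delta


# Hypothetical target speeds: the 100 is caused by one spurious hot reading.
targets = [40, 100, 40, 40]

speed = 40
history = []
for target in targets:
    speed = step_fan_speed(speed, target, max_step_pct=10)
    history.append(speed)

print(history)
```

Even with the limit in place, the speed still rises for one update before falling back, which is the residual ramp described above.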


Other corresponding issues related to the prior art will become apparent to one skilled in the art after comparing such prior art with the present invention as described herein.


SUMMARY OF THE INVENTION

In one set of embodiments, environmental parameter readings, e.g. temperature readings, obtained from one or more monitored circuit elements and/or integrated circuits via a specifically designated bus, such as Intel's Platform Environment Control Interface (PECI), may be oversampled at least three times, and a median average of the three parameter readings rather than the arithmetic mean may be used for controlling a device, e.g. a fan, configured to regulate the environmental parameter, e.g. temperature, at the location of the circuit elements and/or integrated circuits. In one embodiment, when a CPU temperature reading is requested by the system comprising the CPU, a thermal monitoring system would acquire three temperature readings of the CPU over the PECI bus, discard the highest temperature reading and the lowest temperature reading, and return the median reading to be used in controlling a fan configured to evacuate air from the location of the CPU. Isolated outlier readings, on either the high side or the low side, may thereby be discarded, resulting in more accurate temperature readings and thus more accurate fan control.


In another set of embodiments, more than three environmental parameter readings at a time may be considered when determining the control value to be provided to control the fan, or fans. For example, a running average based on N readings (N being an integer greater than 2) may be calculated after the highest and lowest readings within the group of N readings have been discarded. In another set of embodiments, as temperature readings are obtained, a median value of the current latest three readings may be selected and stored until a specified number of median values have been thus obtained. A running average value of this specified number of median values may then be computed and used for controlling a fan or fans to evacuate air from the location of the monitored circuit.


Various embodiments may be implemented either in hardware, as control logic or finite state machines, or in software, which may be executed by a processor or controller.





BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing, as well as other objects, features, and advantages of this invention may be more completely understood by reference to the following detailed description when read together with the accompanying drawings in which:



FIG. 1 shows a flowchart for a method to perform median sampling on temperature readings according to one embodiment;



FIG. 2 shows a block diagram of one embodiment of a system that incorporates temperature measurement;



FIG. 3 shows a logic diagram of one embodiment of the flowchart shown in FIG. 1; and



FIG. 4 shows three tables of examples of temperature readings and various arithmetic mean and median values used for obtaining a temperature control value for controlling a fan.





While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present invention as defined by the appended claims. Note, the headings are for organizational purposes only and are not meant to be used to limit or interpret the description or claims. Furthermore, note that the word “may” is used throughout this application in a permissive sense (i.e., having the potential to, being able to), not a mandatory sense (i.e., must). The term “include”, and derivations thereof, mean “including, but not limited to”. The term “coupled” means “directly or indirectly connected”.


DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS


FIG. 1 shows one embodiment of a method for oversampling environmental parameter readings, e.g. temperature readings, obtained within a system, e.g. a computer system, from one or more monitored circuit elements and/or integrated circuits via a specifically designated bus, such as Intel's Platform Environment Control Interface (PECI). While in operation, the system may be instructed to acquire the next temperature reading (102), and may further be instructed whether the reading is a PECI reading (104). Note that what is referred to in this case as PECI readings may generally refer to readings generated via means that may not be under direct control of the monitoring system, and may therefore return values that may be erroneously much higher or much lower than the actual value of the environmental parameter (e.g. temperature) that the readings are meant to convey. Accordingly, the system may be instructed that the reading is not such a reading (e.g. not a PECI reading), but rather, in case of temperature monitoring, a temperature reading possibly obtained directly from a diode configured at the location of the monitored circuit elements/integrated circuit(s) specifically for obtaining temperature readings. Various means and methods for obtaining temperature readings from diodes are well known in the art and will not be discussed here in detail.


In case the system is instructed to obtain a non-PECI reading, the non-PECI reading, for example a temperature reading from a diode, may be obtained (108), and that value may be used to control a device, e.g. a fan, configured to regulate the temperature at the location of the monitored circuit elements/integrated circuit (116). The system may then be instructed to get a next temperature reading at a specified time or after a specified time period has elapsed (102).


In case the system is instructed to obtain a PECI reading, which may be provided by a central processing unit (CPU) configured to provide PECI temperature readings of the CPU temperature, three consecutive readings may be acquired and stored (106, 110, 112). The median value of the three stored temperature readings may then be determined (114) and provided as the temperature value to use for controlling the fan configured to regulate the temperature at the CPU's location (116). As used here, median value refers to the value from the three temperature reading values that is less than or equal to one of the other two values and greater than or equal to the other one of the other two values. For example, the median value of three temperature readings comprising 54, 53, and 80 would be 54. In other words, the highest of the three temperature reading values and the lowest of the three temperature reading values may be discarded, leaving the only remaining temperature reading value as the median value.
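
The flow described above can be sketched in software as follows. This is one possible reading of FIG. 1 and not the only way to implement it. The names `median_of_three`, `next_control_temperature`, `read_sensor`, and `is_peci` are hypothetical, and `read_sensor` stands in for whatever mechanism actually returns a reading.

```python
def median_of_three(a, b, c):
    """Return the value that remains after discarding the highest and lowest."""
    return max(min(a, b), min(max(a, b), c))


def next_control_temperature(read_sensor, is_peci):
    """Return the temperature value used for fan control (step 116).

    read_sensor: hypothetical zero-argument callable returning one reading.
    is_peci: True if readings may contain isolated outliers (decision 104).
    """
    if not is_peci:
        # Step 108: a directly measured reading is used as-is.
        return read_sensor()
    # Steps 106, 110, 112: acquire and store three consecutive readings.
    first, second, third = read_sensor(), read_sensor(), read_sensor()
    # Step 114: select the median of the three stored readings.
    return median_of_three(first, second, third)


# Usage with the example readings from the text.
samples = iter([54, 53, 80])
print(next_control_temperature(lambda: next(samples), is_peci=True))
```

With the readings 54, 53, and 80 the spurious 80 is discarded and 54 is returned.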


It should be noted that the method illustrated in FIG. 1 may be modified to obtain a control temperature value in a variety of other ways, all based on the use of median values. For example, in another set of embodiments, more than three temperature parameter readings may be considered at a time. Thus, additional values may be acquired following the acquisition and storage of the third CPU reading (event 112), and a running average value may be obtained using median values. For example, a running average based on N readings (N being an integer greater than 2) may be calculated after the highest and lowest readings within the group of N readings have been discarded. In another set of embodiments, as temperature readings are acquired, a median value may be obtained from the latest three readings for each new acquired reading, and stored until a specified number of median values have been thus obtained. A running average value of this specified number of median values may then be computed and used as the temperature control value for controlling a fan or fans.
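
The last variant above, a running average of stored median values, might look like the following sketch. The function name, the parameter `num_medians`, and the sample readings are hypothetical. The application does not fix how many medians are averaged.

```python
from collections import deque


def average_of_medians(readings, num_medians):
    """For each new reading, take the median of the latest three readings,
    store it, and once num_medians medians are stored return their average.

    Returns the list of control values produced over the reading sequence.
    """
    latest_three = deque(maxlen=3)        # the three most recent readings
    medians = deque(maxlen=num_medians)   # the most recent median values
    control_values = []
    for reading in readings:
        latest_three.append(reading)
        if len(latest_three) == 3:
            medians.append(sorted(latest_three)[1])
        if len(medians) == num_medians:
            control_values.append(sum(medians) / num_medians)
    return control_values


# Hypothetical readings with one spurious value of 100.
print(average_of_medians([50, 51, 100, 49, 50, 52], num_medians=2))
```

In this run the spurious 100 never becomes a median, so none of the averaged values is pulled upward by it.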


To illustrate the principle of the method shown in FIG. 1, FIG. 4 shows a Table 1 containing a sequence of readings, in this case PECI readings, obtained from a CPU (note that the sequence may have been obtained over a bus/transmission medium other than a PECI bus and from a monitored device other than a CPU, and the principles discussed herein would equally apply to readings acquired via alternative methods). As seen in Table 1, the temperature reading at reading time 6 is likely an outlier that may be the result of a faulty temperature reading relayed to the temperature control system/circuit, in this case a faulty PECI reading. Thus, the temperature reading at reading time 6 does not represent the actual temperature of the CPU. If a fan were to be controlled according to the readings shown in Table 1, it would be forced to spin up to full speed following the acquisition of the temperature reading at reading time 6, to cool the apparently overheated processor, with an accompanying increase in fan noise.


Table 2 in FIG. 4 shows which temperature readings may be used if at a given reading time, instead of a single reading, at least three readings are performed, and the median value of the three readings is used for controlling the fan. A spurious 100 degree reading at time 6 may now be ignored, and the temperature readings used to control the fan may stay much closer to their likely true values. A spurious low reading of 29 at time 9 may also be similarly ignored. It should also be noted at this time that as shown in Table 2, a new set of readings is performed for each request given to the system for obtaining a temperature measurement (event 102 in FIG. 1). However, a running number of current temperature readings may also be stored if temperature readings are to be frequently performed, and a median value of the currently stored three temperature readings may be used to control the fan. For example, a specified number of readings may be stored and the stored values may be treated as a queue, where for each new reading an oldest reading may be replaced or overwritten by the new reading. The lowest and highest value readings from among the presently stored readings could then be discarded, and the temperature control value may be determined from the remaining presently stored values. Thus, a new temperature control value could be provided upon having acquired each new temperature reading.
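
The queue-based variant described above could be sketched as follows. The class name and its interface are hypothetical. Averaging the remaining values is an assumption, since the text only says the control value is determined from them. With three stored readings the result is simply the median.

```python
from collections import deque


class SlidingMedianFilter:
    """Keep the most recent readings and derive a control value from them."""

    def __init__(self, size=3):
        if size < 3:
            raise ValueError("at least three stored readings are required")
        # A deque with maxlen drops the oldest reading when a new one arrives.
        self._stored = deque(maxlen=size)

    def update(self, reading):
        """Store a new reading and return the control value, or None until
        the queue has been filled for the first time."""
        self._stored.append(reading)
        if len(self._stored) < self._stored.maxlen:
            return None
        # Discard the lowest and highest stored readings.
        remaining = sorted(self._stored)[1:-1]
        # Assumption: the control value is the average of what remains.
        return sum(remaining) / len(remaining)


# Hypothetical readings with a spurious high (100) and a spurious low (29).
temperature_filter = SlidingMedianFilter(size=3)
outputs = [temperature_filter.update(r) for r in [50, 51, 100, 50, 29, 51]]
print(outputs)
```

A new control value is produced for every reading once three readings are stored, and neither the 100 nor the 29 appears in the output.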


A running average of temperature readings could also be improved by following similar procedures. Using the values in Table 1 of FIG. 4 as an example, Table 3, also shown in FIG. 4, may be constructed. Table 3 shows how a running average of 5 values compares with using a running average of the 3 median values out of a group of 5 values. A running group may indicate a group of readings where a specified number of temperature readings, in this case 5 readings, are stored, and for each new temperature reading the oldest temperature reading is replaced by the new temperature reading (as described above). The third column in Table 3 shows the current running group of readings at each given reading time. The fourth column shows the running average calculated from the current running group of readings for each given reading time. The fifth column shows the 3 median values selected from the current running group of readings for each given reading time. Finally, the sixth column shows the running average calculated from the 3 median values for each given reading time. For example, as indicated in Table 3, the running averages of temperature readings for reading times 6 through 10 have values of over 60 degrees, or about 20% over the actual readings of around 50 degrees, due to the contribution of the spurious 100 degree reading at time 6. However, the median average values shown in the last column of Table 3 do not reflect any effects of the spurious reading.
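
The columns described above can be reproduced with a short sketch. The readings below are hypothetical stand-ins, since the actual values of Table 1 are not reproduced in this text. Only the spurious 100 at reading time 6 is taken from the description. The function name and tuple layout are likewise assumptions.

```python
from collections import deque


def running_columns(readings, group_size=5):
    """Return one row per reading time once the running group is full:
    (time, running group, plain average, middle values, average of middle values).
    """
    group = deque(maxlen=group_size)
    rows = []
    for time, reading in enumerate(readings, start=1):
        group.append(reading)  # the oldest reading is replaced when full
        if len(group) < group_size:
            continue
        middle = sorted(group)[1:-1]  # drop the lowest and highest reading
        rows.append((time, list(group), sum(group) / group_size,
                     middle, sum(middle) / len(middle)))
    return rows


# Hypothetical readings around 50 degrees with a spurious 100 at time 6.
readings = [50, 51, 50, 49, 50, 100, 50, 51, 50, 49, 50]
rows = running_columns(readings)
for time, group, plain_avg, middle, middle_avg in rows:
    print(time, group, round(plain_avg, 1), middle, round(middle_avg, 1))
```

In this run the plain running average sits near 60 for as long as the 100 remains in the group (times 6 through 10), while the average of the three middle values stays close to 50 throughout.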



FIG. 2 shows a computer system that may be configured to perform temperature monitoring and/or control by utilizing PECI bus 212. CPU 202 may be configured to provide temperature readings over PECI bus 212 to environmental monitoring device 208, which may be a circuit and/or logic block and/or controller configured to receive the temperature readings provided by CPU 202. A typical computer system may also include a North Bridge 204 and a South Bridge 206. The system may also feature an analog sensor circuit/block 210, which may be configured to provide direct temperature readings to monitoring device 208 in addition to the temperature readings provided by CPU 202 over PECI bus 212. For example, analog sensor block 210 may comprise temperature diodes configured to obtain temperature readings based on the change in junction voltage across the diode channel. Analog sensor block 210 may equally include any other device similarly configured to provide temperature readings.


The various embodiments of the method outlined above and shown in FIG. 1 may be implemented either fully in hardware or fully in software or as a combination of both. For example, in one set of embodiments, monitoring device 208, which may specifically be a temperature monitoring device, may include a controller or processor operable to execute an algorithm based on the flowchart shown in FIG. 1. In alternative embodiments, monitoring device 208 may comprise logic circuitry designed to implement the flowchart shown in FIG. 1.



FIG. 3 shows one possible hardware implementation of monitoring device 208. In this embodiment, temperature monitoring device 208 includes a digital temperature interface (DTI) 316 configured to interface with PECI bus 212 to receive temperature readings from CPU 202. DTI 316 may further be configured to route and latch the value of each one of three consecutive temperature readings received from CPU 202 into latches 310, 312 and 314, respectively. The values may each respectively be provided by their corresponding latches (310, 312 and 314) as value inputs to multiplexer 320. The select signals for multiplexer 320 may be generated according to the median value of the three values stored in latches 310, 312 and 314, respectively. Comparator 332 may compare the contents of latches 310 and 312, comparator 334 may compare the contents of latches 312 and 314, and comparator 336 may compare the contents of latches 310 and 314. The outputs from comparators 332, 334 and 336 may then be used to form select signals Select A, Select B and Select C, through logic gates 322. The result 330 provided by multiplexer 320 may therefore reflect the median value of the three values stored in latches 310, 312 and 314. Result 330 may be provided to a control circuit configured to control the rotational speed of a fan, or any other device configured to regulate temperature or any other environmental variable at the location of CPU 202.
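
The comparator and multiplexer arrangement described above can be modeled in software as a sanity check. The sketch below is an assumption-laden model, not the circuit of FIG. 3. The comparator polarity (greater than or equal), the gate equations for the select signals, and the mapping of Select A, B, and C to latches 310, 312, and 314 are all choices made here, since the text does not specify them.

```python
from itertools import product


def median_select(latch_310, latch_312, latch_314):
    """Model three comparators feeding a one-hot select for a 3-input mux."""
    cmp_332 = latch_310 >= latch_312  # comparator 332: latches 310 vs 312
    cmp_334 = latch_312 >= latch_314  # comparator 334: latches 312 vs 314
    cmp_336 = latch_310 >= latch_314  # comparator 336: latches 310 vs 314

    # Assumed gate logic (XOR/XNOR/AND) producing exactly one active select.
    select_b = cmp_332 == cmp_334
    select_a = (cmp_332 != cmp_334) and (cmp_332 != cmp_336)
    select_c = (cmp_332 != cmp_334) and (cmp_332 == cmp_336)

    # Multiplexer 320: route the selected latch value to the result.
    if select_a:
        return latch_310
    if select_b:
        return latch_312
    return latch_314


# Exhaustive check against a sorted median, including tied values.
for a, b, c in product(range(4), repeat=3):
    assert median_select(a, b, c) == sorted((a, b, c))[1]

print(median_select(54, 53, 80))
```

The exhaustive loop covers every ordering and every tie pattern of three values, which is what determines the comparator outputs, so the select logic holds for arbitrary readings.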


Further modifications and alternative embodiments of various aspects of the invention may be apparent to those skilled in the art in view of this description. Accordingly, this description is to be construed as illustrative only and is for the purpose of teaching those skilled in the art the general manner of carrying out the invention. It is to be understood that the forms of the invention shown and described herein are to be taken as embodiments. Elements and materials may be substituted for those illustrated and described herein, parts and processes may be reversed, and certain features of the invention may be utilized independently, all as would be apparent to one skilled in the art after having the benefit of this description of the invention. Changes may be made in the elements described herein without departing from the spirit and scope of the invention as described in the following claims.

Claims
  • 1. A method for controlling at least one fan, the method comprising: acquiring three consecutive parameter values, wherein the three consecutive parameter values correspond to three consecutive parameter readings; selecting a first parameter value of the three consecutive parameter values, wherein the first parameter value is less than or equal to a second parameter value of the three consecutive parameter values, and greater than or equal to a third parameter value of the three consecutive parameter values; and using the first parameter value to control a rotational speed of the at least one fan.
  • 2. The method of claim 1, further comprising performing said acquiring, said selecting, and said using a plurality of times.
  • 3. The method of claim 1, further comprising generating a control signal based on the first parameter value and using the control signal to control the rotational speed of the at least one fan.
  • 4. The method of claim 1, further comprising: acquiring additional consecutive parameter values, wherein the additional consecutive parameter values correspond to additional consecutive parameter readings subsequent to a last one of the three consecutive parameter readings, wherein a last two of the three consecutive parameter values and the additional consecutive parameter values form a plurality of consecutive parameter values; obtaining a plurality of median values by performing the following for each consecutive three parameter values of the plurality of consecutive parameter values: selecting one of the consecutive three parameter values as a median value, wherein the median value is less than or equal to a first remaining one of the consecutive three parameter values, and greater than or equal to a second remaining one of the consecutive three parameter values, wherein the median value becomes one of the plurality of median values; and using the median value to control the rotational speed of the at least one fan.
  • 5. The method of claim 4, further comprising: for each consecutive N median values of the plurality of median values, calculating an average median value of the consecutive N median values; and using the average median value to control the rotational speed of the at least one fan; wherein N is an integer.
  • 6. The method of claim 5, further comprising generating a control signal based on the average median value and using the control signal to control the rotational speed of the at least one fan.
  • 7. The method of claim 6, wherein the control signal is a duty cycle of a PWM (Pulse Width Modulated) signal used for powering the at least one fan.
  • 8. The method of claim 4, wherein the three consecutive parameter readings and the additional consecutive parameter readings comprise environmental parameter readings.
  • 9. The method of claim 1, wherein the three consecutive parameter values are three consecutive temperature values and the three consecutive parameter readings are three consecutive temperature readings.
  • 10. The method of claim 9, wherein the three consecutive temperature readings are of a temperature of a specified temperature region within a computer system.
  • 11. The method of claim 10, wherein the specified temperature region corresponds to a CPU (Central Processing Unit).
  • 12. A system for obtaining an environmental reading used in controlling at least one fan, the system comprising: a digital interface device operable to acquire at least three consecutive parameter readings and generate at least three parameter values, each one of the at least three parameter values corresponding to a respective one of the at least three consecutive parameter readings; and logic circuitry coupled to the digital interface and operable to select a first parameter value from the at least three parameter values, wherein the first parameter value is less than or equal to a second parameter value of the at least three parameter values, and greater than or equal to a third parameter value of the at least three parameter values; wherein the logic circuitry is operable to provide the first parameter value to a control circuit configured to control a rotational speed of the at least one fan according to at least the first parameter value.
  • 13. The system of claim 12; wherein the digital interface is operable to acquire additional parameter readings, wherein the at least three consecutive parameter readings and the additional parameter readings comprise a plurality of consecutive parameter readings; wherein the digital interface is operable to generate a plurality of parameter values, each parameter value of the plurality of parameter values corresponding to a respective one of the plurality of consecutive parameter readings; wherein for each current parameter reading of the plurality of parameter readings the logic circuitry is operable to select a median parameter value from: two consecutive parameter values corresponding to a most recent two previous consecutive parameter readings of the plurality of parameter readings; and a current parameter value corresponding to the current parameter reading; and wherein the logic circuitry is operable to provide the median parameter value to the control circuit configured to control the rotational speed of the at least one fan according to at least the median parameter value.
  • 14. The system of claim 13, wherein the logic circuitry is operable to store at least three of the plurality of parameter values at a time.
  • 15. The system of claim 14, wherein the logic circuitry comprises at least three latches, each one of the at least three latches operable to hold a different respective one of the at least three of the plurality of parameter values.
  • 16. The system of claim 13, wherein the plurality of consecutive parameter readings comprise a plurality of consecutive temperature readings.
  • 17. The system of claim 12, wherein the digital interface is operable to acquire the at least three consecutive parameter readings over a PECI (Platform Environment Control Interface) bus.
  • 18. A system comprising: a bus; a first circuit operable to provide environmental parameter readings over the bus, wherein the environmental parameter readings are indicative of at least one environmental variable at the first circuit's location; a first device configured to control the at least one environmental variable at the first circuit's location; and a second circuit operable to receive the environmental parameter readings, wherein for each consecutive three readings of the environmental parameter readings, the second circuit is operable to: provide a first one of the consecutive three readings to the first device to control the first device, thereby controlling the at least one environmental variable according to the environmental parameter readings; wherein the first one of the consecutive three readings is less than or equal to a second one of the consecutive three readings and greater than or equal to a third one of the consecutive three readings.
  • 19. The system of claim 18, wherein the second circuit is further operable to: generate three parameter values, wherein each one of the three parameter values corresponds to a respective one of the consecutive three readings; and provide a first one of the three parameter values to the first device to control the first device, thereby controlling the at least one environmental variable according to the environmental parameter readings; wherein the first one of the parameter values is less than or equal to a second one of the three parameter values and greater than or equal to a third one of the three parameter values.
  • 20. The system of claim 18, wherein the environmental parameter readings are temperature readings, wherein the first device is a fan, and wherein in controlling the first device, the second circuit is operable to control a rotational speed of the fan.
  • 21. The system of claim 19, wherein the first circuit is a central processing unit, and wherein the bus is a PECI (Platform Environment Control Interface) bus.
  • 22. A method for controlling at least one device configured to control at least one environmental variable in a computer system, the method comprising: (a) acquiring a plurality of environmental parameter readings; for each consecutive N readings of the plurality of environmental parameter readings: (b) discarding a first one of the consecutive N readings having a lowest value of the N consecutive readings; (c) discarding a second one of the consecutive N readings having a highest value of the N consecutive readings; (d) using remaining ones of the consecutive N readings to obtain a control value to control the at least one device, thereby controlling the at least one environmental variable; wherein N is an integer greater than 2.
  • 23. The method of claim 22, wherein (b), (c), and (d) are performed substantially concurrently with (a), after a first N consecutive readings of the plurality of environmental parameter readings have been acquired.
  • 24. The method of claim 22, wherein (d) comprises: setting the control value to an average value of the remaining ones of the consecutive N readings; and using the average value to control the at least one device.
  • 25. The method of claim 22, wherein the at least one environmental variable is temperature, wherein the plurality of environmental parameter readings comprise temperature readings, and wherein the at least one device is a fan.
  • 26. The method of claim 22, further comprising: for each consecutive N readings of the plurality of environmental parameter readings, storing the consecutive N readings prior to performing (b), (c) and (d).