The present invention relates in general to cooling assemblies and other apparatus used for removing heat from electronic devices, modules and systems. More particularly, this invention relates to methods and systems for monitoring a rate of volume change of coolant within a cooling system designed, for example, to cool one or more electronics subsystems, such as electronics racks, of a computing environment.
Heat flux dissipated by electronic equipment, such as microprocessors and power supplies, is again reaching levels that require cooling other than simple air cooling as a means of controlling component temperature. Liquid cooling (e.g., water cooling) is an attractive technology to manage these higher heat fluxes. The liquid absorbs the heat dissipated by the component/modules in an efficient manner, i.e., with minimal temperature rise from the liquid to the component being cooled. Typically, the heat is ultimately transferred from the liquid out into the outside environment. Otherwise, the liquid coolant would continuously rise in temperature.
From the 1970's through the early 1990's, International Business Machines Corporation accomplished this task by circulating cooling liquid via a coolant distribution unit which was a single, large computer room water conditioning unit (CRWCU). The CRWCU distributed conditioned chilled water to the various electronics racks of a mainframe computer system to be cooled. Conventionally, the electronics racks of the mainframe computer included memory frames, processor frames, input/output frames, power frames, etc. Operationally, the CRWCU received customer chilled water, which was then used to remove heat from the conditioned water supplied to the individual electronics racks of the computer room.
The CRWCU included a primary cooling loop wherein building chilled water was supplied and passed through a control valve driven by a motor. The valve determined an amount of building chilled water to be passed through a heat exchanger, with a portion of the building chilled water possibly being returned directly to the return via a bypass orifice. The CRWCU further included a second cooling loop with a reservoir tank from which water was pumped by either one of two pumps into the heat exchanger for conditioning and output therefrom as a conditioned water source to the electronics racks to be cooled within the computer room. The computer room water conditioning unit normally stood separate from the electronics frames, and again, would supply system water (typically maintained at about 22° C.) to all electronics frames of the computer room.
The coolant distribution unit, and more particularly, the computer room water conditioning unit (CRWCU), contained a single heat exchanger, a single reservoir, a single control valve, and redundant pumps. Thus, in the case of a failed pump, the CRWCU would automatically switch to the redundant pump, but any other malfunction in the coolant distribution unit would have brought down the whole computer room mainframe system. For example, if the heat exchanger, or control valve, or building chilled water source failed, the entire mainframe system in the computer room would also fail. Redundant mainframe computers would have been on the computer room floor to allow continuation of processing (in a degraded mode) until the downed mainframe could be repaired.
Today, a multi-frame mainframe system such as existed in the 1970's and 1980's has been replaced with single processor frames or racks. Thus, multiple processor frames, from high end, mid-range and low end could now be sourced from a single computer room water conditioning unit. Therein lies a problem, however. Any leak in any of the processor frames could cause all of the frames to lose conditioned water. A single leak could bring down the entire computer room floor.
With today's critical demand for high availability of electronics systems, it is desirable to have a technique for monitoring not just volume of coolant within the cooling system, but also the rate of volume change of coolant within the cooling system as a means of providing early detection of a leak within the system, i.e., before coolant within the system reaches a critically low level necessitating shutting down of the cooling system, as well as the associated electronics system.
Thus, the shortcomings of the prior art are overcome and additional advantages are provided through the provision of a method of monitoring coolant within a cooling system. The method includes: employing at least one pressure transducer to obtain multiple pressure measurements related to an amount of coolant within an expansion tank of the cooling system; and determining a rate of volume change of coolant within the expansion tank employing the multiple pressure measurements.
In enhanced aspects, the employing includes obtaining multiple successive pressure measurements related to the amount of coolant within the expansion tank of the cooling system, the multiple successive pressure measurements being taken at a known time interval, and the determining includes employing the multiple successive pressure measurements at the known time interval to determine the rate of volume change of coolant within the expansion tank. The method may also include determining an immediacy of action to be taken to service the cooling system based on the rate of volume change of coolant within the expansion tank. Further, the employing could comprise obtaining multiple differential pressure measurements on the amount of coolant within the expansion tank, each differential pressure measurement including a difference in pressure between pressure in a liquid coolant portion of the expansion tank and pressure in a non-liquid portion of the expansion tank.
Systems and computer program products corresponding to the above summarized methods are also described and claimed herein.
Further, additional features and advantages are realized through the techniques of the present invention. Other embodiments and aspects of the invention are described in detail herein and are considered a part of the claimed invention.
The subject matter which is regarded as the invention is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of the invention are apparent from the following detailed description taken in conjunction with the accompanying drawings in which:
As used herein “electronics subsystem” comprises any housing, frame, rack, compartment, etc., containing one or more heat generating components of a computer system or other electronics system requiring cooling. The term “electronics rack” includes any frame or rack having a heat generating component of a computer system or electronics system; and may be, for example, a stand alone computer processor having high, mid or low end processing capability. In one embodiment, an electronics rack may comprise multiple electronics drawers, each having one or more heat generating components requiring cooling.
One example of coolant within the coolant distribution unit is water. However, the concepts disclosed are readily adapted to use with other types of coolant on both the facility side and the system side. For example, the coolant may comprise a brine, a fluorocarbon liquid, or other similar chemical coolant or a refrigerant, while still maintaining the advantages and unique features of the present invention.
As noted briefly above, power levels in computer equipment (primarily processors) have again risen to a level where they no longer can be simply air cooled. The components will likely be water cooled. Heat dissipated by the processor can be transferred to the water via a water cooled cold plate. Water typically available at customer locations (i.e., data centers) is not suitable for use in these cold plates. First, condensation formation is a concern as the temperature of the data center water, ranging from 7° C. to 15° C., is far below the room dew point (typically 18-23° C.). Second, the relatively poor quality of the water (its chemistry, cleanliness, etc.) impacts system reliability. It is therefore desirable to utilize a water cooling/conditioning unit that circulates high quality water to/from the electronics to be cooled and rejects the heat to the data center water. As used herein, “facility water” or “facility coolant” refers to this data center water or coolant, while “system water” or “system coolant” refers to the cooled/conditioned water or coolant, respectively, circulating between the coolant distribution unit and the electronics subsystem(s) being cooled.
Reference is now made to the drawings, wherein the same reference numbers used throughout different figures designate the same or similar components.
Having been cooled by the facility chilled water flowing through the “cold side” of the heat exchanger (116, 117), the system coolant is sent to the supply manifold 118 which distributes the coolant to the one or more electronics racks requiring cooling. Although not shown here, the SCCU could also incorporate means to filter the system water and automatically add a corrosion inhibitor such as benzotriazole (BTA) as needed. A two-way control valve 228 is used to regulate the flow rate of the facility chilled water to the heat exchanger within the integral heat exchanger/expansion tank 223, thereby controlling the temperature of system coolant delivered to the electronics racks. A thermistor temperature sensing element (not shown) can be located at the inlet of the system coolant supply manifold 118 to supply an electronic signal to the power/controller controlling operation of valve 228. If the system coolant temperature is higher than desired, valve 228 can be opened more allowing an increased flow of facility water through the heat exchanger resulting in a decrease of the temperature of the system water directed to the electronics racks from supply manifold 118. Alternatively, if the system water temperature is lower than desired, valve 228 can be closed more providing a decreased flow of facility water through the heat exchanger, resulting in an increase in the temperature of the system water directed to the electronics racks from supply manifold 118.
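By way of illustration only, the valve regulation described above can be sketched as a simple proportional adjustment. The function name, the gain, the deadband and the normalized valve opening are hypothetical and are not taken from the described embodiment; the sketch merely shows that a system coolant temperature above the desired value opens the valve further, while a temperature below it closes the valve further.

```python
def adjust_valve(opening, measured_temp_c, setpoint_c, gain=0.05, deadband_c=0.5):
    """Return a new facility-water valve opening in the range [0.0, 1.0].

    opening         -- current valve opening (0.0 closed, 1.0 fully open); assumed scale
    measured_temp_c -- system coolant temperature sensed at the supply manifold inlet
    setpoint_c      -- desired system coolant temperature
    gain            -- assumed proportional gain (opening change per degree C)
    deadband_c      -- assumed tolerance within which the valve is left alone
    """
    error = measured_temp_c - setpoint_c
    if abs(error) <= deadband_c:
        return opening  # close enough to the desired temperature
    # Too warm (error > 0): open more, increasing facility water flow.
    # Too cold (error < 0): close more, decreasing facility water flow.
    new_opening = opening + gain * error
    return max(0.0, min(1.0, new_opening))


if __name__ == "__main__":
    print(adjust_valve(0.5, 24.0, 22.0))  # warmer than desired, opening increases
    print(adjust_valve(0.5, 20.0, 22.0))  # cooler than desired, opening decreases
```

An actual controller would likely add integral action and rate limiting; those refinements are omitted here for brevity.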
As noted initially above, any leak in a cooling system such as depicted in
Operationally, the integrated heat exchanger/expansion tank 300 of
As the liquid level within the expansion tank varies, the water level sensor 325 sends a signal to a controller 327 when float 330 (with the magnets disposed therein) passes a reed switch at one of the high level, low level or low-low level. Note that the high level reed switch is provided to guard against a possible leak in the heat exchanger which could be discharging facility coolant directly into the expansion tank. Conversely, if there is a leak outside the expansion tank, water level within the tank will drop and the water level sensor will note the magnitude of the drop when the float passes the low level reed switch or reaches the low-low level reed switch.
If the water level drops to the low-low level, then the coolant pump(s) are shut down, and the associated electronics system is also powered down. With today's critical demand on high availability, this result is no longer tolerable, and a new mechanism is needed for sensing a leak in a cooling system before coolant in the expansion tank reaches a critically low level.
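The discrete level sensing just described can be sketched as follows. The enumeration and function names are hypothetical, and the response shown for the low level (an operator warning) is an assumption, since the text only states that the drop is noted. The sketch illustrates the limitation at issue: the controller learns only that one of three fixed levels has been crossed, not how quickly the level is changing.

```python
from enum import Enum


class Level(Enum):
    """Reed switch positions passed by the float (names are illustrative)."""
    HIGH = "high"
    LOW = "low"
    LOW_LOW = "low-low"


def respond_to_level(level):
    """Map a reed switch event to a controller response."""
    if level is Level.HIGH:
        # Rising level: facility coolant may be leaking through the
        # heat exchanger into the expansion tank.
        return "check heat exchanger for facility coolant leak"
    if level is Level.LOW:
        # Assumed response; the text only says the drop is noted.
        return "warn operator of coolant loss"
    if level is Level.LOW_LOW:
        return "shut down coolant pumps and power down electronics"
    raise ValueError(f"unknown level: {level!r}")


if __name__ == "__main__":
    for event in Level:
        print(event.value, "->", respond_to_level(event))
```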
In this monitoring embodiment, the level of water (or more generally coolant) 420 is determined using, for example, a hydrostatic pressure transducer mounted in the bottom of the expansion tank. The signal provided by this transducer is monitored and recorded at regular intervals by the control system microcode 440. The pressure at the bottom of the expansion tank is directly proportional to the head of water (i.e., level of water) within the tank. By taking the difference between successive pressure measurements and dividing by the time between the measurements, it is possible to determine a rate of change of water level, and hence of water volume, in the tank. The control system microcode can be programmed to compare the rate of volume change against preset criteria to determine if a serious leak is present and initiate an appropriate action when needed. In addition, by sensing pressure at the bottom of the expansion tank, the control system microcode may also be used under more normal circumstances to determine if and when there is a need to add water to the tank to compensate for normal losses due to evaporation. In an enhanced embodiment, second sensor 432 is provided to allow for determination of a differential pressure measurement, which can then be used to determine a rate of volume change of coolant within the tank as explained further below in connection with
ΔP=Pb−Po=ρgh (Eq.1)
wherein:
V=Volume of liquid in tank.
A=tank cross-sectional area.
h=liquid height in tank.
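As a sketch only, the symbols defined above suggest the following relationship between differential pressure and coolant volume for a tank of uniform cross-section. It is assumed here that Pb is the pressure in the liquid coolant portion at the bottom of the tank, that Po is the pressure in the non-liquid portion, that ρ is the coolant density and that g is the gravitational acceleration; the expressions are unnumbered so as not to be confused with the numbered equations of this description.

```latex
\begin{align*}
h &= \frac{\Delta P}{\rho g}, \\
V &= A\,h = \frac{A\,\Delta P}{\rho g}.
\end{align*}
```

Under these assumptions, volume is directly proportional to the measured differential pressure, with proportionality constant A/(ρg).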
Note that this example assumes a uniform cross-sectional area for the expansion tank, as shown in
By taking successive differential pressure measurements and converting the pressure measurements into change in volume, it is possible to determine a rate of volume change of liquid within the tank (or leak rate from the expansion tank), as expressed in equation 4:
Wherein:
Δt=time interval.
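A minimal numerical sketch of this calculation follows, assuming a tank of uniform cross-sectional area and the hydrostatic relation of Eq. 1. The function names, the default density (water near 20° C.) and the SI units are assumptions made for illustration. A negative result indicates coolant leaving the tank; a positive result indicates coolant entering it.

```python
RHO_WATER = 998.0  # kg/m^3, assumed density of water near 20 C
G = 9.80665        # m/s^2, standard gravitational acceleration


def volume_from_dp(dp_pa, area_m2, rho=RHO_WATER, g=G):
    """Liquid volume (m^3) from differential pressure (Pa): V = A * dP / (rho * g)."""
    return area_m2 * dp_pa / (rho * g)


def volume_rate(dp1_pa, dp2_pa, dt_s, area_m2, rho=RHO_WATER, g=G):
    """Rate of volume change (m^3/s) between two differential pressure readings.

    dp1_pa, dp2_pa -- successive differential pressure measurements (Pa)
    dt_s           -- known time interval between the measurements (s)
    """
    if dt_s <= 0:
        raise ValueError("time interval must be positive")
    v1 = volume_from_dp(dp1_pa, area_m2, rho, g)
    v2 = volume_from_dp(dp2_pa, area_m2, rho, g)
    return (v2 - v1) / dt_s


if __name__ == "__main__":
    # Round numbers chosen so the arithmetic is easy to follow:
    # 3000 Pa -> 0.030 m^3 and 2900 Pa -> 0.029 m^3 for A = 0.1 m^2.
    rate = volume_rate(3000.0, 2900.0, dt_s=10.0, area_m2=0.1, rho=1000.0, g=10.0)
    print(f"rate of volume change: {rate:.6f} m^3/s")
```

Because the two volumes share the factor A/(ρg), errors in density or tank area scale the computed rate but do not change its sign.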
If the magnitude is less than the first threshold value x1, then the measured volume at time t2 is reassigned to comprise the measured volume at time t1 640 and determination is made whether volume V1 is greater than a defined maximum volume Vmax 642. If “yes”, facility chilled water is leaking into the system and corrective action is required 685. Otherwise, the process repeats with a waiting of the known time interval 615 before making the next differential pressure measurement. Assuming that the condition statement 635 is not satisfied, then processing determines whether the rate of volume change of coolant within the expansion tank is between the first threshold and a second threshold value or second leak rate set point x2 650. If “yes”, then a slow leak has been identified and processing determines whether the change in volume between measured time t1 and time t2 is less than zero, i.e., is volume V1 larger than volume V2. If “no”, then facility chilled water is leaking into the expansion tank and corrective action is required 660. If “yes”, then a slow leak is detected 665 (and notice thereof can be provided to an operator of the computing environment), and processing determines whether the volume of coolant within the expansion tank at time t2 is greater than a minimum allowable volume Vallowable 670. If the volume has dropped below the minimum allowable level, then the coolant distribution unit is shut down 675, otherwise monitoring continues by reassigning the measured volume at time t2 to comprise the measured volume at time t1 672, and waiting for the next time interval to pass 615 before repeating the measurements.
From condition statement 650, if the rate of volume change of coolant within the expansion tank is greater than the second threshold value x2, then a fast leak has been identified, and processing determines whether the change in volume is less than zero 680, i.e., is the volume measurement at time t1 larger than the volume measurement at time t2. If so, then leak isolation protocol can be automatically initiated 690, for example, as described in the above-incorporated, co-filed patent application. Otherwise, facility chilled water is leaking into the system and corrective action is required 685.
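The decision flow described above can be sketched as a single classification step. This is one possible reading of the flow, not the microcode itself: the enumeration and function names are hypothetical, and the handling of a rate exactly equal to x1 or x2 is an assumption, since the text does not specify the boundary cases. The caller is expected to reassign V2 to V1 and wait for the known time interval whenever monitoring continues.

```python
from enum import Enum


class Action(Enum):
    """Outcomes of one monitoring pass (names are illustrative)."""
    CONTINUE = "continue monitoring"
    INFLOW = "facility water leaking in, corrective action required"
    SLOW_LEAK = "slow leak detected, notify operator and continue monitoring"
    SHUTDOWN = "shut down coolant distribution unit"
    ISOLATE = "fast leak, initiate leak isolation protocol"


def classify(v1, v2, dt, x1, x2, v_max, v_allowable):
    """Classify one pair of volume measurements taken dt apart.

    v1, v2      -- coolant volume at times t1 and t2
    x1, x2      -- first and second leak rate set points (x1 < x2)
    v_max       -- defined maximum volume
    v_allowable -- minimum allowable volume
    """
    magnitude = abs((v2 - v1) / dt)
    losing = v2 < v1  # change in volume is less than zero

    if magnitude < x1:
        # No significant change; only an overfull tank is of concern.
        return Action.INFLOW if v2 > v_max else Action.CONTINUE
    if magnitude <= x2:  # assumed: boundary values count as a slow change
        if not losing:
            return Action.INFLOW
        return Action.SLOW_LEAK if v2 > v_allowable else Action.SHUTDOWN
    # Rate exceeds the second set point.
    return Action.ISOLATE if losing else Action.INFLOW


if __name__ == "__main__":
    limits = dict(x1=0.01, x2=0.1, v_max=12.0, v_allowable=5.0)
    print(classify(10.0, 10.0, 10.0, **limits).value)
    print(classify(10.0, 9.5, 10.0, **limits).value)
    print(classify(10.0, 7.0, 10.0, **limits).value)
```

Returning an action rather than acting directly keeps the classification easy to test separately from pump and valve control.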
The present invention can be included in an article of manufacture (e.g., one or more computer program products) having, for instance, computer usable media. The media has embodied therein, for instance, computer readable program code means for providing and facilitating the capabilities of the present invention. The article of manufacture can be included as a part of a computer system or sold separately.
Additionally, at least one program storage device readable by a machine embodying at least one program of instructions executable by the machine to perform the capabilities of the present invention can be provided.
The flow diagrams depicted herein are just examples. There may be many variations to these diagrams or the steps (or operations) described therein without departing from the spirit of the invention. For instance, the steps may be performed in a differing order, or steps may be added, deleted or modified. All of these variations are considered a part of the claimed invention.
Although preferred embodiments have been depicted and described in detail herein, it will be apparent to those skilled in the relevant art that various modifications, additions, substitutions and the like can be made without departing from the spirit of the invention and these are therefore considered to be within the scope of the invention as defined in the following claims.
This application contains subject matter which is related to the subject matter of the following applications, each of which is assigned to the same assignee as this application and each of which is hereby incorporated herein by reference in its entirety: “Method, System and Program Product For Automatically Checking Coolant Loops Of A Cooling System For A Computing Environment,” Chu et al., (Docket No. POU920030164US1), Ser. No.______ , co-filed herewith; and “Scalable Coolant Conditioning Unit with Integral Plate Heat Exchanger/Expansion Tank and Method of Use,” Chu et al., Ser. No. 10/243,708, filed Sep. 13, 2002.