This application claims the benefit of European Patent Application Number EP 10161401.4, filed on Apr. 29, 2010, which is incorporated herein by reference in its entirety.
1. Field of the Invention
The present invention relates to computer control of a cooling system, and more specifically, to the control of coolant flow to multiple cooling units in a computer system.
2. Description of the Related Art
U.S. Patent Publication No. 2009/0265568 A1 describes a method wherein data about a current state of a compute environment is received and a workload that is currently consuming resources in the compute environment is analyzed. The use of at least one resource in the compute environment is modified in a manner related to energy consumption based on the received data and analysis of the workload.
U.S. Patent Publication No. 2003/0193777 A1 describes an energy management system for one or more computer data centers, including a plurality of racks containing electronic packages. The energy management system includes a system controller for distributing workload among the electronic packages. The system controller is also configured to manipulate cooling systems within the one or more data centers. Cooling systems often incur greater amounts of operating expenses to sufficiently cool the heat generating components contained in the racks of the data centers.
In certain aspects, operating a computer system includes determining, for each of a plurality of electronic units in a computer system, at least one upcoming process being assigned to a specific electronic unit. An anticipated workload for the specific electronic unit is identified based upon the at least one upcoming process. At least one control signal is generated based upon a plurality of anticipated workloads for the plurality of electronic units. A flow of cooling fluid to each of a plurality of cooling units in the computer system is controlled based upon the at least one control signal. Each of the plurality of cooling units is respectively associated with an electronic unit of the plurality of electronic units.
In further aspects, operating a computer system includes obtaining, for each of a plurality of electronic units in a computer system, an anticipated workload. At least one control signal is generated based upon a plurality of anticipated workloads for the plurality of electronic units. A flow of cooling fluid to each of a plurality of cooling units in the computer system is controlled based upon the at least one control signal. Each of the plurality of cooling units is respectively associated with an electronic unit of the plurality of electronic units.
In still further aspects, a computer system comprises a plurality of electronic units; a plurality of cooling units; a downstream positioned common outlet portion; an upstream positioned common inlet portion; and a controller. Each of the plurality of cooling units is respectively associated with one of the plurality of electronic units. Each of the plurality of cooling units is configured to receive a stream of cooling fluid and includes an input port hydraulically coupled with the upstream positioned common inlet portion, and an output port hydraulically coupled with the downstream positioned common outlet portion. The controller is configured to determine, for each of the plurality of electronic units, at least one upcoming process being assigned to a specific electronic unit; identify an anticipated workload for the specific electronic unit based upon the at least one upcoming process; generate at least one control signal based upon a plurality of anticipated workloads for the plurality of electronic units; and control, based upon the at least one control signal, a flow of the cooling fluid to each of the plurality of cooling units.
In yet further aspects, a computer system comprises a plurality of electronic units; a plurality of cooling units; a downstream positioned common outlet portion; an upstream positioned common inlet portion; and a controller. Each of the plurality of cooling units is respectively associated with one of the plurality of electronic units. Each of the plurality of cooling units is configured to receive a stream of cooling fluid and includes an input port hydraulically coupled with the upstream positioned common inlet portion, and an output port hydraulically coupled with the downstream positioned common outlet portion. The controller is configured to determine, for each of the plurality of electronic units, at least one upcoming process being assigned to a specific electronic unit; identify an anticipated workload for the specific electronic unit based upon the at least one upcoming process; generate at least one control signal based upon a plurality of anticipated workloads for the plurality of electronic units; and control, based upon the at least one control signal, a flow of the cooling fluid to each of the plurality of cooling units.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module,” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied, e.g., stored, thereon.
Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain (or store) a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber, cable, RF, etc., or any suitable combination of the foregoing. Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. Each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented using computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer, other programmable data processing apparatus, or other devices create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
An operating system is typically executed by one or more electronic units EU1-EU3 of the computer system CS. The operating system may implement a kernel included as part of the kernel-layer K. The kernel is operable to organize process and memory management tasks within the operating system and typically has direct access to hardware resources included as part of the hardware-layer HW. The user-space layer US is typically set up on the kernel-layer K. User programs are typically included as part of the user-space layer US. The user programs are operable to invoke kernel functions to indirectly access hardware components, for example, memory units or networking devices. Each user program is executed within the scope of one or more processes.
A scheduler SCH, for example, an operating system scheduler, may be included as part of the kernel of the operating system. Scheduling refers to the way processes are assigned to run on the available electronic units EU1-EU3, since there are typically many more processes running than there are available electronic units EU1-EU3. This assignment is typically carried out by the scheduler SCH.
A flow controller FC and a flow controller interface FCIC may, for example, be implemented as part of the operating system, in particular, as part of the scheduler SCH. Alternatively, the flow controller FC may be implemented as a separate controller, for example, on a main board or as an extension of a system management module (
By way of example, the computer system CS comprises first, second, and third cooling units CU1, CU2, CU3. In certain aspects, the number of cooling units CU1-CU3 corresponds to the number of electronic units EU1-EU3 which need to be cooled. Each cooling unit CU1-CU3 may comprise, or be implemented as, a cooling plate or cold plate. Each cooling unit CU1-CU3 is supplied with a cooling fluid. An input port IP of each cooling unit CU1-CU3 is hydraulically coupled with an upstream positioned common inlet portion CIP. An output port OP of each cooling unit CU1-CU3 is hydraulically coupled with a downstream positioned common outlet portion COP. Each cooling unit CU1-CU3 is thermally coupled with a particular electronic unit EU1-EU3 of a predetermined selection of electronic units EU1-EU3 and is operable to dissipate heat produced by the associated electronic unit EU1-EU3 to the cooling fluid. In certain aspects, only those electronic units EU1-EU3 which typically dissipate heat during operation are included as part of the predetermined selection of electronic units.
In certain aspects, the cooling fluid is a cooling liquid, for example, water. Water is typically capable of capturing heat about 4,000 times more efficiently than air because the volumetric heat capacity of water is about 4,000 times larger than that of air. Water provided at the common inlet portion CIP with an input temperature value of, for example, up to 60° C. can keep the chilled electronic units EU1-EU3 at operating temperatures well below a maximum allowed temperature value of, for example, 85° C.
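The role of the volumetric heat capacity can be illustrated with a steady-state energy balance, in which the heat carried away equals the volumetric heat capacity times the volume flow rate times the temperature rise of the coolant. The following is a minimal sketch only. The property values are approximate textbook figures near room temperature, and the function name and the 100 W / 10 K operating point are hypothetical; none of them are taken from the description above.

```python
# Approximate property values near room temperature (assumptions, not from the text).
WATER_VOLUMETRIC_HEAT_CAPACITY = 4.18e6  # J/(m^3*K)
AIR_VOLUMETRIC_HEAT_CAPACITY = 1.2e3     # J/(m^3*K), at about atmospheric pressure


def required_flow_m3_per_s(power_w: float, delta_t_k: float, vol_heat_capacity: float) -> float:
    """Volume flow rate needed to carry away power_w with a coolant temperature rise of delta_t_k."""
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    return power_w / (vol_heat_capacity * delta_t_k)


# Hypothetical operating point: 100 W of heat, 10 K coolant temperature rise.
water = required_flow_m3_per_s(100.0, 10.0, WATER_VOLUMETRIC_HEAT_CAPACITY)
air = required_flow_m3_per_s(100.0, 10.0, AIR_VOLUMETRIC_HEAT_CAPACITY)

# 1 m^3/s equals 60,000 L/min.
print(f"water: {water * 6e4:.3f} L/min, air: {air * 6e4:.1f} L/min, ratio: {air / water:.0f}")
```

With these assumed values, air needs a volume flow several thousand times larger than water for the same heat load and temperature rise, which is the same order of magnitude as the factor stated above.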
A first valve V1 is included as part of the first cooling unit CU1, a second valve V2 is included as part of the second cooling unit CU2, and a third valve V3 is included as part of the third cooling unit CU3. In certain aspects, the particular valve V1-V3 is positioned upstream of the input port IP of the associated cooling unit CU1-CU3. Alternatively, the particular valve V1-V3 is positioned downstream of the output port OP of the respective cooling unit CU1-CU3. Each valve V1-V3 is operable to control, for example, throttle or adjust, a flow of the cooling fluid, in particular, a volume flow rate of the cooling fluid, to the respective cooling unit CU1-CU3. In certain aspects, the number of valves corresponds to the number of cooling units.
Each cooling unit CU1-CU3 may, for example, obtain a particular control signal S_C1-S_C3 provided by the flow controller FC (
As illustrated in
A method for operating the computer system CS is illustrated using the flow chart in
The method may start at step S0. In step S2, for each electronic unit EU1-EU3 of the predetermined selection of electronic units EU1-EU3, upcoming processes are determined as being assigned to a particular electronic unit EU1-EU3. With regard to the following discussion, the method is described with focus on the first electronic unit EU1. The remaining electronic units EU2, EU3 of the computer system CS are processed correspondingly. As shown in
In step S4, for each electronic unit EU1-EU3 of the predetermined selection of electronic units EU1-EU3, a workload is anticipated based on the determined upcoming processes P0-P5 assigned to the particular electronic unit EU1-EU3. With focus on the first electronic unit EU1, the workload WL1 associated with the first electronic unit EU1 is anticipated based upon the three upcoming processes P0-P2 determined in step S2.
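Steps S2 and S4 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the scheduler's actual implementation: the `UpcomingProcess` record, the per-process load estimates, and the assignment of P3-P5 to EU2 and EU3 are hypothetical. Only the assignment of P0-P2 to EU1 follows the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpcomingProcess:
    name: str             # e.g. "P0"
    unit: str             # electronic unit the process is assigned to, e.g. "EU1"
    expected_load: float  # hypothetical estimate: fraction of the unit's capacity, 0.0 to 1.0


def anticipate_workloads(queue, units):
    """Steps S2 and S4: group upcoming processes by unit, then sum their estimated loads."""
    workloads = {unit: 0.0 for unit in units}
    for proc in queue:
        workloads[proc.unit] += proc.expected_load
    # Clamp to 1.0: a unit cannot be more than fully loaded.
    return {unit: min(load, 1.0) for unit, load in workloads.items()}


queue = [
    UpcomingProcess("P0", "EU1", 0.25),
    UpcomingProcess("P1", "EU1", 0.25),
    UpcomingProcess("P2", "EU1", 0.25),
    UpcomingProcess("P3", "EU2", 0.5),    # assignment of P3-P5 is assumed for illustration
    UpcomingProcess("P4", "EU3", 0.125),
    UpcomingProcess("P5", "EU3", 0.125),
]
workloads = anticipate_workloads(queue, ["EU1", "EU2", "EU3"])
print(workloads)
```

The resulting dictionary plays the role of the workload data: one anticipated workload per electronic unit, available before the processes actually start to run.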
Step S4 may, for example, be executed by the scheduler SCH of the kernel of the operating system. The anticipated workloads of the electronic units EU1-EU3 of the predetermined selection of electronic units EU1-EU3 may be characterized as workload data.
Alternatively, step S4 may be executed by the flow controller FC. For this purpose, a process signal S_P may be provided by the scheduler SCH via the flow controller interface FCIC (
In step S6, the at least one control signal S_C1-S_C3 is determined based on the anticipated workloads associated with the electronic units EU1-EU3, that is, based on the workload data. With focus on the first electronic unit EU1, the particular control signal S_C1 may be provided based on the workload WL1 anticipated in step S4. Alternatively, a single control signal may be provided to all available valves V1-V3.
Step S6 may, for example, be executed by the flow controller FC. Based on the workload data being determined in step S4, the flow controller FC may determine a separate control signal S_C1-S_C3 for each valve V1-V3, or the flow controller FC may determine a single control signal which may be provided to all available valves.
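One possible way to derive the control signals in step S6 is sketched below. The linear mapping from workload to valve opening, the `min_opening` base flow, and the choice to size a single shared signal for the most heavily loaded unit are all assumptions made for illustration; the description does not prescribe a particular mapping.

```python
def control_signals(workloads, min_opening=0.1, single_signal=False):
    """Step S6: map anticipated workloads (0.0 to 1.0) to valve openings (0.0 to 1.0).

    Returns one opening per electronic unit; the valve of the associated
    cooling unit (V1 for EU1, and so on) would receive that value.
    """
    def opening(load):
        # Linear map; min_opening keeps a base flow even for an idle unit (an assumption).
        clamped = max(0.0, min(load, 1.0))
        return min_opening + (1.0 - min_opening) * clamped

    if single_signal:
        # One signal for all valves: size it for the most heavily loaded unit.
        shared = opening(max(workloads.values()))
        return {unit: shared for unit in workloads}
    return {unit: opening(load) for unit, load in workloads.items()}


anticipated = {"EU1": 0.75, "EU2": 0.5, "EU3": 0.25}  # hypothetical workload data
print(control_signals(anticipated))
print(control_signals(anticipated, single_signal=True))
```

Separate signals let lightly loaded units receive less coolant, while a single signal is simpler but supplies every cooling unit with the flow needed by the busiest one.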
In step S8, the at least one control signal S_C1-S_C3 is provided to the associated valves V1-V3 to control the flow of cooling fluid to the particular cooling units CU1-CU3. With focus on the first electronic unit EU1, the flow of the cooling fluid to the first cooling unit CU1 is controlled based on the at least one control signal S_C1-S_C3 being provided to the first valve V1.
Fast changes in the workload of the electronic units EU1-EU3 are usually related to process or thread creation. Isolated process creation and scheduler round trip times are typically on the order of microseconds, for example, 10 µs. In a system context (including hardware and operating system), however, the required total time increases. It typically takes a couple of milliseconds from process creation to an operational process P0-P5. During this time, a page fault is raised, which triggers the memory management of the operating system to load the necessary pages from a memory unit into a main memory of the computer system CS for execution. In other words, in a system context, process creation typically takes a few milliseconds. By way of example, this period of time may be used to provide the at least one control signal S_C1-S_C3 and to control the flow of the cooling fluid to the particular cooling unit CU1-CU3. When the process starts to consume a lot of processing cycles (and, concomitantly, additional heat needs to be collected by the cooling fluid), an increased and accurately anticipated flow of the cooling fluid may be provided.
In case an existing process spontaneously consumes more processing cycles (and, concomitantly, additional heat needs to be collected by the cooling fluid), the scheduler SCH and/or flow controller FC may react in between two time slices of the process. In a normally running operating system, there may be a hundred or more concurrently active processes. Therefore, there is enough time to anticipate and accurately control the flow of the cooling fluid.
The method illustrated in the flow chart terminates in step S14. Alternatively, the method may restart in step S2.
In parallel (or subsequent) to step S4, a value of the output temperature T_COP of the cooling fluid at the common outlet portion COP may be determined in step S10, for example, via the output temperature sensor T. The output temperature value T_COP may be additionally considered in the determination of the control signals S_C1-S_C3 in step S6 (
In step S12, the output temperature value T_COP is compared with a predetermined temperature threshold value T_TH, for example, 60° C. If the output temperature value T_COP is below the temperature threshold value T_TH, a temperature signal S_T may be provided, which represents a lower limit signal (
The temperature signal S_T may be provided to the scheduler SCH via the flow controller interface FCIC. Based on the temperature signal S_T, the scheduler SCH (step S2) may be operable to redistribute the upcoming processes to a group of electronic units EU1-EU3 of the predetermined selection of electronic units EU1-EU3 such that the resulting workload for each electronic unit EU1-EU3 of the group of electronic units is above a predetermined workload threshold value, for example, 90%. However, in certain aspects, the flow of the cooling fluid to the electronic units EU1-EU3 of the group of electronic units EU1-EU3 is still controlled such that the output temperature value T_COP of the cooling fluid at the common outlet portion COP is within the predetermined temperature value range.
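The threshold comparison and the redistribution of upcoming processes can be sketched as follows. This is a simplified illustration, assuming per-process load estimates and a first-fit decreasing packing heuristic; both are hypothetical choices, and the heuristic does not by itself guarantee that every unit in use ends up above the workload threshold. The sketch therefore reports which units actually do.

```python
T_TH = 60.0                # example temperature threshold from the text, in degrees Celsius
WORKLOAD_THRESHOLD = 0.9   # example workload threshold from the text (90%)


def lower_limit_signal(t_cop, t_th=T_TH):
    """Step S12: True when the outlet temperature T_COP is below the threshold T_TH."""
    return t_cop < t_th


def consolidate(loads, units, capacity=1.0):
    """First-fit decreasing: place each process on the first unit that still has room."""
    assignment = {unit: [] for unit in units}
    used = {unit: 0.0 for unit in units}
    for name, load in sorted(loads.items(), key=lambda item: item[1], reverse=True):
        for unit in units:
            if used[unit] + load <= capacity:
                assignment[unit].append(name)
                used[unit] += load
                break
        else:
            raise ValueError(f"no unit has room for {name}")
    return assignment, used


# Hypothetical load estimates for the upcoming processes P0-P5.
loads = {"P0": 0.5, "P1": 0.5, "P2": 0.25, "P3": 0.25, "P4": 0.25, "P5": 0.25}
units = ["EU1", "EU2", "EU3"]

t_cop = 55.0  # hypothetical sensor reading below T_TH
if lower_limit_signal(t_cop):
    assignment, used = consolidate(loads, units)
    group = [unit for unit in units if used[unit] > WORKLOAD_THRESHOLD]
    print(assignment, group)
```

Concentrating the processes on fewer units leaves the remaining units idle, so their coolant flow can be throttled while the loaded units run above the workload threshold.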
The flow controller FC may additionally be operable to control the valves V1-V3 based on the measured output port temperature values provided by the output port temperature sensors at the output ports OP of the cooling units CU1-CU3. In particular, in case of a fault of the operating system, the cooling of the electronic units EU1-EU3 is still feasible.
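A temperature-based fallback of this kind could, for instance, take the form of a simple proportional adjustment per valve. The sketch below is one hypothetical realization: the target temperature, the gain, and the minimum opening are assumed values, and the description does not specify the control law.

```python
def fallback_opening(current_opening, t_out, t_target=60.0, gain=0.02, min_opening=0.1):
    """Proportional step for one valve, using only the output port temperature.

    Opens the valve when the output port runs hotter than t_target and closes it
    when it runs colder. All parameter values are assumptions for illustration.
    """
    adjusted = current_opening + gain * (t_out - t_target)
    # Keep the opening within the valve's usable range.
    return max(min_opening, min(1.0, adjusted))


opening = 0.5
for t_out in (70.0, 66.0, 61.0):  # hypothetical successive sensor readings in degrees Celsius
    opening = fallback_opening(opening, t_out)
    print(f"T_out={t_out:.0f} C -> valve opening {opening:.2f}")
```

Because this loop depends only on the temperature sensors and the valves, it can keep running without any workload data from the scheduler.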
Each electronic unit EU1-EU3 may also comprise one or more memory devices and/or one or more power supplies and the like. With a memory device, the workload of the particular electronic unit may, for example, be represented by an amount of memory accesses and/or a data volume being required for executing the upcoming process which is assigned to the particular electronic unit. With a power supply, the workload of the particular electronic unit may, for example, be represented by a current being required for executing the upcoming process, which is assigned to the particular electronic unit.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. Each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
Number | Date | Country | Kind |
---|---|---|---|
10161401.4 | Apr 2010 | EP | regional |