Available expansion card printed circuit board (PCB) real estate for voltage regulators, primarily on a front side of the expansion card, sets an upper limit on the amount of power that can be delivered to an integrated circuit (e.g., an accelerator) of the expansion card. Placing highly integrated voltage regulators and an associated cooling system on the back side of the accelerator card, underneath the processor, can increase the total power delivered to the accelerator by at least 50%. In addition, the back side power delivery provides significantly reduced power path resistance and power delivery network (PDN) impedance between the voltage regulators and the processor. These improvements increase power conversion efficiency, and hence achieve higher throughput power while reducing the PDN noise.
Efficient cooling of the power delivery components on the back side of the expansion card PCB facilitates the power architecture described above. In order to meet open compute project (OCP) accelerator module (OAM) form factor requirements, both the back side power delivery components and cooling solution should fit within an eight millimeter vertical space.
U.S. patent application Ser. No. 18/162,666, filed Jan. 31, 2023 and entitled “Systems and Methods for Cooling an Apparatus having Back side Power Delivery Components” discloses an expansion card having both top and bottom cooling systems with cooling fluid routed between the cooling systems using holes through a PCB of the expansion card. The disclosure of the aforementioned application is incorporated by reference herein in its entirety.
The accompanying drawings illustrate a number of example embodiments and are a part of the specification. Together with the following description, these drawings demonstrate and explain various principles of the present disclosure.
Throughout the drawings, identical reference characters and descriptions indicate similar, but not necessarily identical, elements. While the example embodiments described herein are susceptible to various modifications and alternative forms, specific embodiments have been shown by way of example in the drawings and will be described in detail herein. However, the example embodiments described herein are not intended to be limited to the particular forms disclosed. Rather, the present disclosure covers all modifications, equivalents, and alternatives falling within the scope of the appended claims.
The present disclosure is generally directed to systems and methods for server level cooling. For example, the disclosed systems and methods can provide a shared server level cooling mechanism that enables highly integrated power delivery components to be placed on the back side of the accelerator card, underneath the processor, and can enable increased total board power (e.g., at least fourteen hundred watts). The cooling system can be directly attached to a server's printed circuit board (PCB) (e.g., Universal Base Board (UBB)), thereby improving fluid routing and serviceability by allowing easy replacement of accelerators in service. The highly integrated nature of this cooling system can also allow for increased power density and meeting of the power delivery requirements of high-performance accelerators.
The disclosed systems and methods can achieve numerous benefits. For example, the disclosed systems and methods can enable power delivery of at least fourteen hundred watts in an OAM form factor, which represents the highest power density in the industry and corresponds to delivery of more than two-thousand five-hundred amps. In other examples, the disclosed systems and methods can also achieve a lower cost thermal solution and/or ease of assembly and serviceability in operation. In still other examples, the disclosed systems and methods can further achieve reduced PCB copper planes and numbers of layers that can yield a lower PCB cost for the expansion cards, reduced power path resistance between voltage regulators and processors, lower PCB copper losses, increased conversion efficiency, increased useful throughput power, reduced PDN impedance between voltage regulators and processors, and/or lower PDN noise.
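As a rough illustration of how the two figures above relate, the following sketch applies P = V × I to the stated lower bounds. It assumes, purely for illustration, that fourteen hundred watts and two-thousand five-hundred amps describe the same operating point; the function name is hypothetical and does not come from the disclosure.

```python
def implied_rail_voltage(power_w: float, current_a: float) -> float:
    """Return the average supply voltage implied by P = V * I."""
    if current_a <= 0:
        raise ValueError("current must be positive")
    return power_w / current_a


# Figures quoted in the text, treated here as a matched pair (an assumption).
BOARD_POWER_W = 1400.0
LOAD_CURRENT_A = 2500.0

voltage = implied_rail_voltage(BOARD_POWER_W, LOAD_CURRENT_A)
print(f"implied average rail voltage: {voltage:.2f} V")
```

Under that assumption the implied average rail voltage is well below one volt, which is why even small reductions in power path resistance matter at these currents.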
In one example, an apparatus includes a printed circuit board and a cooling system attached to the printed circuit board, wherein the cooling system is configured for placement thereon of two or more expansion cards having back side power delivery components.
Another example can be the previously described example apparatus, further including a thermal interface material positioned on a cooling element of the cooling system.
Another example can be any of the previously described example apparatuses, wherein a cooling element of the cooling system has a thickness of no more than half of an amount of available clearance beneath a printed circuit board of the two or more expansion cards in conformance with open compute project accelerator module form factor requirements.
Another example can be any of the previously described example apparatuses, wherein a combined thickness of the cooling system, the back side power delivery components, and a thermal interface material positioned between a cooling element of the cooling system and the back side power delivery components is no greater than an amount of available clearance beneath a printed circuit board of the two or more expansion cards in conformance with open compute project accelerator module form factor requirements.
Another example can be any of the previously described example apparatuses, wherein the cooling system is operable independently of an additional cooling system located on a front side of at least one of the two or more expansion cards.
Another example can be any of the previously described example apparatuses, wherein a temperature of a fluid deliverable through a cooling element of the cooling system is controllable independently of an additional temperature of an additional fluid deliverable through an additional cooling element of the additional cooling system.
Another example can be any of the previously described example apparatuses, wherein a flow rate of a fluid deliverable through a cooling element of the cooling system is controllable independently of an additional flow rate of an additional fluid deliverable through an additional cooling element of the additional cooling system.
Another example can be any of the previously described example apparatuses, wherein the cooling system is configured to facilitate swappable replacement of the two or more expansion cards without requiring modification of the cooling system.
Another example can be any of the previously described example apparatuses, wherein a cooling element of the cooling system has at least one fluid inlet, at least one fluid outlet, and at least one fluid routing component.
Another example can be any of the previously described example apparatuses, wherein the at least one fluid routing component includes at least one of one or more cooling channels, one or more pin fins, one or more serpentine channels, or one or more heat transfer enhancement structures.
Another example can be any of the previously described example apparatuses, wherein a combined power delivery of the back side power delivery components and front side power delivery components of an individual expansion card of the two or more expansion cards is at least fourteen-hundred watts.
Another example can be any of the previously described example apparatuses, wherein the printed circuit board is configured as a universal base board to which the cooling system is directly attached.
In one example, a system includes a universal base board, a cooling system directly attached to the universal base board, and two or more expansion cards having back side power delivery components placed on a cooling element of the cooling system with a thermal interface material positioned between the cooling element and the back side power delivery components.
Another example can be the previously described example system, wherein a combined thickness of the cooling system, the back side power delivery components, and the thermal interface material is no greater than an amount of available clearance beneath a printed circuit board of the two or more expansion cards in conformance with open compute project accelerator module form factor requirements.
Another example can be any of the previously described example systems, wherein a combined power delivery of the back side power delivery components and front side power delivery components of an individual expansion card of the two or more expansion cards is at least fourteen-hundred watts.
Another example can be any of the previously described example systems, wherein the cooling system is configured to facilitate swappable replacement of the two or more expansion cards without requiring modification of the cooling system.
In one example, a method can include providing a printed circuit board and attaching a cooling system to the printed circuit board, wherein the cooling system is configured for placement thereon of two or more expansion cards having back side power delivery components.
Another example can be the previously described example method, wherein a combined thickness of the cooling system, the back side power delivery components, and a thermal interface material positioned between a cooling element of the cooling system and the back side power delivery components is no greater than an amount of available clearance beneath a printed circuit board of the two or more expansion cards in conformance with open compute project accelerator module form factor requirements.
Another example can be any of the previously described example methods, wherein a combined power delivery of the back side power delivery components and front side power delivery components of an individual expansion card of the two or more expansion cards is at least fourteen-hundred watts.
Another example can be any of the previously described example methods, wherein the cooling system is configured to facilitate swappable replacement of the two or more expansion cards without requiring modification of the cooling system.
The following will provide, with reference to
As illustrated in
The term “printed circuit board,” as used herein, can generally refer to a medium used in electrical and electronic engineering to connect electronic components to one another in a controlled manner. For example, and without limitation, a printed circuit board (PCB) can take the form of a laminated sandwich structure of conductive and insulating layers, with each of the conductive layers being designed with an artwork pattern of traces, planes, and other features (e.g., like wires on a flat surface) etched from one or more sheet layers of copper laminated onto and/or between sheet layers of a non-conductive substrate. Electrical components can be fixed to conductive pads on the outer layers in the shape designed to accept the component's terminals, generally by means of soldering, to both electrically connect and mechanically fasten them to it. Another manufacturing process can add vias, such as plated-through holes that allow interconnections between layers. PCBs can be single-sided (e.g., one copper layer), double-sided (e.g., two copper layers on both sides of one substrate layer), or multi-layer (e.g., outer and inner layers of copper, alternating with layers of substrate). Multi-layer PCBs allow for much higher component density because circuit traces on the inner layers would otherwise take up surface space between components.
The systems described herein can perform step 102 in a variety of ways. In one example, step 102 can include providing a printed circuit board configured as a universal base board. In some of these examples, the printed circuit board can be configured as a universal base board of a server. In some examples, step 102 can include providing a prefabricated printed circuit board. Alternatively, step 102 can include fabricating the printed circuit board.
The term “universal base board,” (UBB) as used herein, can generally refer to a printed circuit board conforming to UBB design specifications. For example, and without limitation, a UBB can provide modular and flexible support for open compute project (OCP) accelerator modules (OAMs) and provide design flexibility for conceivable system designs. For example, a UBB can support multiple OAM modules (e.g., four (4P) or eight (8P)), but be engineered to facilitate comprehensive options of interconnecting fabrics and topologies, power domains, thermal design powers (TDPs), cooling solutions, and scale-out options. While some UBB designs can be optimized for a few standard configurations and released OAM modules, other UBB designs can also accommodate future trends and customer needs.
As illustrated in
The term “cooling system,” as used herein, can generally refer to passive or active systems that are designed to regulate and dissipate the heat generated by a computer to maintain optimal performance and protect the computer from damage that will occur from overheating. For example, and without limitation, example cooling systems include one or more cold plates and/or one or more heat pipes. Cooling systems can also include thermal interface material that goes into joints to fill air gaps between solid surfaces during assembly. Thermal interface material can correspond to, be combined with, and/or include one or more heat spreaders that have high thermal conductivity and can be used as a bridge between a heat source and a heat exchanger.
The term “expansion card,” as used herein, can generally refer to a printed circuit board that can be inserted into an electrical connector, or expansion slot (e.g., bus slot) on a computer's motherboard (e.g., backplane) to add functionality to a computer system. For example, and without limitation, an expansion card can be an expansion board, an adapter card, a peripheral card, and/or an accessory card. Sometimes the design of the computer's case and motherboard involves placing most or all of the expansion slots onto a separate, removable card. Typically, such cards are referred to as riser cards in part because they project upward from the board and allow expansion cards to be placed above and parallel to the motherboard. Various standards define requirements for expansion cards, including power delivery requirements and form factors. One such standard corresponds to open compute project (OCP) accelerator module (OAM) for graphics accelerator cards.
The term “power delivery components,” as used herein, can generally refer to one or more electricity regulation devices. For example, and without limitation, power delivery components can refer to one or more voltage regulators. A voltage regulator is a system designed to automatically maintain a constant voltage. A voltage regulator can use a simple feed-forward design or include negative feedback. It can use an electromechanical mechanism or electronic components. Depending on the design, it can be used to regulate one or more alternating current (AC) or direct current (DC) voltages. In this context, the term “back side power delivery components” can generally refer to power delivery components located on a back side of an expansion card having an integrated circuit (e.g., processor, accelerator, etc.) located on a front side of the expansion card.
The systems described herein can perform step 104 in a variety of ways. In one example, step 104 can include directly attaching the cooling system to a PCB configured as a universal base board. In one example, step 104 can include positioning a thermal interface material on a cooling element of the cooling system. In another example, step 104 can include attaching a cooling system having a cooling element that has a thickness of no more than half of an amount (e.g., four millimeters) of available clearance (e.g., eight millimeters) beneath the PCB 202 in conformance with open compute project (OCP) accelerator module (OAM) form factor requirements. In some of these examples, a combined thickness of the cooling system, the back side power delivery components, and a thermal interface material positioned between a cooling element of the cooling system and the back side power delivery components is no greater than an amount of available clearance (e.g., eight millimeters) beneath the PCB 202 in conformance with OAM form factor requirements. In another example, step 104 can include attaching a cooling system that is operable independently of an additional cooling system located on a front side of at least one of the two or more expansion cards. In some of these examples, a temperature of a fluid deliverable through a cooling element of the cooling system can be controllable independently of an additional temperature of an additional fluid deliverable through an additional cooling element of the additional cooling system. Alternatively or additionally, a flow rate of a fluid deliverable through a cooling element of the cooling system can be controllable independently of an additional flow rate of an additional fluid deliverable through an additional cooling element of the additional cooling system.
In another example, step 104 can include attaching a cooling system that facilitates a combined power delivery of the back side power delivery components and front side power delivery components of an individual expansion card of the two or more expansion cards being at least fourteen-hundred watts. In another example, step 104 can include attaching a cooling system configured to facilitate swappable replacement of the two or more expansion cards without requiring modification of the cooling system. In another example, step 104 can include attaching a cooling system having a cooling element that has at least one fluid inlet, at least one fluid outlet, and at least one fluid routing component. In some of these examples, the at least one fluid routing component can include one or more cooling channels, one or more pin fins, one or more serpentine channels, and/or one or more heat transfer enhancement structures.
The shorter power delivery path that extends between the IC 204 and the back side PDCs 208A and 208B yields numerous benefits. For example, the shorter path results in reduced PCB copper planes and a reduced number of layers for reduced PCB cost. Additionally, the shorter path results in reduced power path resistance between the PDCs 208A and 208B and the IC 204 for reduced PCB copper losses. Also, the shorter path results in reduced PDN impedance from the PDCs 208A and 208B to the IC 204. Further, the shorter path results in reduced PDN noise and increased conversion efficiency for increased useful throughput power. In some examples, the PDCs 206A, 206B, 208A, and 208B can, in combination, provide a combined power delivery of at least fourteen-hundred watts.
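The link between power path resistance and PCB copper loss can be sketched with the standard resistive loss relation P = I² × R. The example below is a minimal sketch only: the load current is the two-thousand five-hundred amp figure quoted elsewhere in this disclosure, the two path resistances are hypothetical values chosen for illustration, and the simplification that the full current flows through a single lumped resistance is an assumption.

```python
def copper_loss_w(current_a: float, path_resistance_ohm: float) -> float:
    """Resistive (I^2 * R) loss in a power delivery path, in watts."""
    if current_a < 0 or path_resistance_ohm < 0:
        raise ValueError("current and resistance must be non-negative")
    return current_a ** 2 * path_resistance_ohm


LOAD_CURRENT_A = 2500.0   # current figure quoted in the disclosure
LONG_PATH_OHM = 20e-6     # hypothetical resistance of a longer lateral path
SHORT_PATH_OHM = 10e-6    # hypothetical resistance of a shorter vertical path

long_loss = copper_loss_w(LOAD_CURRENT_A, LONG_PATH_OHM)
short_loss = copper_loss_w(LOAD_CURRENT_A, SHORT_PATH_OHM)

print(f"longer path loss:  {long_loss:.1f} W")
print(f"shorter path loss: {short_loss:.1f} W")
print(f"loss avoided:      {long_loss - short_loss:.1f} W")
```

Because the loss scales with the square of the current, halving the path resistance in this hypothetical case halves the copper loss, and the avoided loss becomes available as useful throughput power.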
A challenge in implementing the expansion card 200 having the features described above arises in cooling the back side PDCs 208A and 208B. Complying with OAM standards can entail achieving this goal while fitting within an amount of available clearance (e.g., eight millimeters) beneath the PCB 202 in conformance with OAM form factor requirements, especially where the PDCs 208A and 208B already consume approximately three millimeters of the available clearance.
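The clearance constraint can be expressed as a simple height budget. The sketch below uses the example figures from this disclosure (eight millimeters of clearance, approximately three millimeters consumed by the PDCs, and a cooling element of no more than four millimeters). The thermal interface material thicknesses and all class and function names are hypothetical and are introduced only for illustration.

```python
from dataclasses import dataclass

OAM_CLEARANCE_MM = 8.0  # example clearance beneath the expansion card PCB


@dataclass(frozen=True)
class BackSideStack:
    """Heights of the layers stacked beneath the expansion card PCB."""
    pdc_height_mm: float
    tim_thickness_mm: float          # hypothetical value, not from the text
    cooling_element_thickness_mm: float

    def total_mm(self) -> float:
        return (self.pdc_height_mm
                + self.tim_thickness_mm
                + self.cooling_element_thickness_mm)


def fits_clearance(stack: BackSideStack,
                   clearance_mm: float = OAM_CLEARANCE_MM) -> bool:
    """True if the combined stack height does not exceed the clearance."""
    return stack.total_mm() <= clearance_mm


def cooling_element_within_half(stack: BackSideStack,
                                clearance_mm: float = OAM_CLEARANCE_MM) -> bool:
    """True if the cooling element uses no more than half the clearance."""
    return stack.cooling_element_thickness_mm <= clearance_mm / 2


thin_tim = BackSideStack(3.0, 1.0, 4.0)
thick_tim = BackSideStack(3.0, 1.5, 4.0)
print(thin_tim.total_mm(), fits_clearance(thin_tim))
print(thick_tim.total_mm(), fits_clearance(thick_tim))
```

With these example figures, a three millimeter PDC height and a four millimeter cooling element leave at most one millimeter for the thermal interface material, so the second stack exceeds the budget.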
Cooling element 304 can extend beyond one or more dimensions (e.g., length and/or width) of an expansion card and be large enough in area for placement thereon of two or more (e.g., two, four, or eight) expansion cards. In some implementations, back side thermal interface material 406 can be sized and shaped for positioning between the cooling element 304 and the back side PDCs 208A and 208B of more than one expansion card. Alternatively, cooling element 304 can be provided with more than one piece of back side thermal interface material 406, and at least one of the pieces of back side thermal interface material 406 can be sized and shaped for positioning between the cooling element 304 and back side PDCs 208A and 208B of only one expansion card. Multiple pieces of back side thermal interface materials 406 (e.g., one per expansion card) can facilitate replacement of a particular piece of back side thermal interface material 406 when replacing an expansion card, without having to remove any other expansion cards in the process.
The back side cooling system including back side cooling element 504 can be operable independently of one or more front side cooling systems located on front sides of individual expansion cards 506. For example, temperature and/or flow rate of a fluid deliverable through the back side cooling element 504 can be controllable independently of an additional temperature and/or an additional flow rate of an additional fluid deliverable through a front side cooling element of the front side cooling system. In some examples, the front side cooling systems of the individual expansion cards 506 can further be operable independently of one another. Due to independent operation of the back side cooling system and the front side cooling systems, there is no need for fluid routing components between the front side and back side cooling systems. As a result, individual expansion cards 506 can be replaced without any need to modify the back side cooling system. Additionally, the independent operation can allow for a larger area of the back side cooling element 504 and greater volume of fluid compared to individual back side cooling systems, thus providing more effective cooling that facilitates increased total power delivery (e.g., at least fourteen hundred watts) to one or more expansion cards 506.
In some implementations and as set forth above, apparatus 500 can be configured as a shared server level 4P cooling system that enables highly integrated power delivery components to be placed on the back side of an accelerator card, underneath the processor, and can enable total board power of at least fourteen hundred watts. This cooling solution can be directly attached to the UBB, thereby improving fluid routing and serviceability by allowing easy replacement of accelerators in service. The highly integrated nature of this solution allows for increased power density and meeting the power delivery requirements of high-performance accelerators.
The above-described server level implementation can achieve independent fluid circuits for the bottom and top cooling solutions. The independent fluid circuits can allow independent control of the back side cooling system by controlling the flow rate and/or temperature and make it possible to increase power delivery. This implementation can also simplify the assembly of the UBB and drastically improve serviceability by allowing replacement of OAMs without having to modify the back side cooling system components. This 4P cooling system can also reduce the cost of the thermal solution when compared to four individual cold plates, one for each OAM.
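The effect of controlling flow rate and inlet temperature independently on each fluid circuit can be sketched with the steady-state heat balance Q = ṁ × c_p × ΔT. This is a minimal sketch under stated assumptions: the coolant is taken to be water with approximate properties, the heat loads, flow rates, and inlet temperatures are hypothetical, and the class name is not from the disclosure.

```python
from dataclasses import dataclass

# Approximate properties of water, assumed here as the coolant.
WATER_DENSITY_KG_PER_L = 1.0
WATER_CP_J_PER_KG_K = 4186.0


@dataclass
class CoolantLoop:
    """One independent fluid circuit with its own setpoints."""
    inlet_temp_c: float
    flow_l_per_min: float

    def temp_rise_k(self, heat_load_w: float) -> float:
        """Coolant temperature rise from Q = m_dot * c_p * dT."""
        if self.flow_l_per_min <= 0:
            raise ValueError("flow rate must be positive")
        mass_flow_kg_per_s = self.flow_l_per_min * WATER_DENSITY_KG_PER_L / 60.0
        return heat_load_w / (mass_flow_kg_per_s * WATER_CP_J_PER_KG_K)

    def outlet_temp_c(self, heat_load_w: float) -> float:
        return self.inlet_temp_c + self.temp_rise_k(heat_load_w)


# Hypothetical setpoints and heat loads for the two circuits.
back_loop = CoolantLoop(inlet_temp_c=30.0, flow_l_per_min=2.0)
front_loop = CoolantLoop(inlet_temp_c=25.0, flow_l_per_min=6.0)
BACK_HEAT_W = 400.0
FRONT_HEAT_W = 4000.0

front_before = front_loop.outlet_temp_c(FRONT_HEAT_W)
rise_before = back_loop.temp_rise_k(BACK_HEAT_W)

# Doubling only the back side flow rate halves the back side temperature
# rise and leaves the front side circuit untouched.
back_loop.flow_l_per_min = 4.0
rise_after = back_loop.temp_rise_k(BACK_HEAT_W)
front_after = front_loop.outlet_temp_c(FRONT_HEAT_W)

print(f"back side rise: {rise_before:.2f} K -> {rise_after:.2f} K")
print(f"front side outlet unchanged: {front_before == front_after}")
```

Because each circuit carries its own setpoints, the back side loop can be tuned to the heat dissipated by the power delivery components without changing the conditions seen by the front side cooling systems.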
In order to meet OAM form factor design specifications, the back side cooling system can use a liquid-cooled cold plate having a thickness no greater than half of an amount (e.g., four millimeters) of available clearance (e.g., eight millimeters) beneath PCBs of individual expansion cards 506 in conformance with OAM form factor requirements. The cold plate can include cooling channels, pin fins, serpentine channels, and/or other heat transfer enhancement structures. The back side cooling system can be directly attached to the UBB 502 and a thermal interface material can be used to make contact with the power components. As shown in
The additional back side cooling system including additional back side cooling element 602 can be operable independently of one or more front side cooling systems located on front sides of individual expansion cards 506. For example, temperature and/or flow rate of a fluid deliverable through the additional back side cooling element 602 can be controllable independently of an additional temperature and/or an additional flow rate of an additional fluid deliverable through an additional cooling element of the front side cooling system. In some examples, the front side cooling systems of the individual expansion cards 506 can further be operable independently of one another. Due to independent operation of the additional back side cooling system and the front side cooling systems, there is no need for fluid routing components between the front side and additional back side cooling systems. As a result, individual expansion cards 506 can be replaced without any need to modify the additional back side cooling system. Additionally, the independent operation can allow for a larger area of the additional back side cooling element 602 and greater volume of fluid compared to the front side cooling systems, thus providing more effective cooling that facilitates increased total power delivery (e.g., at least fourteen hundred watts) to one or more expansion cards 506.
In some implementations and as set forth above, apparatus 600 can be configured as a shared server level 8P cooling system that enables highly integrated power delivery components to be placed on the back side of an accelerator card, underneath the processor, and can enable increased total board power. This cooling solution can be directly attached to the UBB, thereby improving fluid routing and serviceability by allowing easy replacement of accelerators in service. The highly integrated nature of this solution allows for increased power density and meeting the power delivery requirements of high-performance accelerators. Back side cooling elements 504 and 602 can also both be liquid-cooled cold plates having thicknesses no greater than half an amount (e.g., four millimeters) of available clearance (e.g., eight millimeters) beneath PCBs of individual expansion cards 506 in conformance with OAM form factor requirements and an independent fluid circuit. This cooling element thickness and fluid circuit independence can facilitate independent control of cooling systems that meet OAM form factor design specifications.
The back side cooling system including back side cooling element 702 can be operable independently of one or more front side cooling systems located on front sides of individual expansion cards 506. For example, temperature and/or flow rate of a fluid deliverable through the back side cooling element 702 can be controllable independently of an additional temperature and/or an additional flow rate of an additional fluid deliverable through a front side cooling element of the front side cooling system. In some examples, the front side cooling systems of the individual expansion cards 506 can further be operable independently of one another. Due to independent operation of the back side cooling system and the front side cooling systems, there is no need for fluid routing components between the front side and back side cooling systems. As a result, individual expansion cards 506 can be replaced without any need to modify the back side cooling system. Additionally, the independent operation can allow for a larger area of the back side cooling element 702 and greater volume of fluid compared to the front side cooling systems, thus providing more effective cooling that facilitates increased total power delivery (e.g., at least fourteen hundred watts) to one or more expansion cards 506.
In some implementations and as set forth above, apparatus 700 can be configured as a shared server level 8P cooling system that enables highly integrated power delivery components to be placed on the back side of an accelerator card, underneath the processor, and can enable increased total board power. This cooling solution can be directly attached to the UBB, thereby improving fluid routing and serviceability by allowing easy replacement of accelerators in service. The highly integrated nature of this solution allows for increased power density and meeting the power delivery requirements of high-performance accelerators. Back side cooling element 702 can also be a liquid-cooled cold plate having a thickness no greater than half an amount (e.g., four millimeters) of available clearance (e.g., eight millimeters) beneath PCBs of individual expansion cards 506 in conformance with OAM form factor requirements and an independent fluid circuit. This cooling element thickness and fluid circuit independence can facilitate independent control of a back side cooling system that meets OAM form factor design specifications.
As set forth above, the disclosed systems and methods can provide a shared server level cooling mechanism that enables highly integrated power delivery components to be placed on the back side of the accelerator card, underneath the processor, and can enable increased total board power (e.g., at least fourteen hundred watts). The cooling system can be directly attached to a server's printed circuit board (PCB) (e.g., Universal Base Board (UBB)), thereby improving fluid routing and serviceability by allowing easy replacement of accelerators in service. The highly integrated nature of this cooling system can also allow for increased power density and meeting of the power delivery requirements of high-performance accelerators.
The disclosed systems and methods can achieve numerous benefits. For example, the disclosed systems and methods can enable power delivery of at least fourteen hundred watts in an OAM form factor, which represents the highest power density in the industry and corresponds to delivery of more than two-thousand five-hundred amps. In other examples, the disclosed systems and methods can also achieve a lower cost thermal solution and/or ease of assembly and serviceability in operation. In still other examples, the disclosed systems and methods can further achieve reduced PCB copper planes and numbers of layers that can yield a lower PCB cost for the expansion cards, reduced power path resistance between voltage regulators and processors, lower PCB copper losses, increased conversion efficiency, increased useful throughput power, reduced PDN impedance between voltage regulators and processors, and/or lower PDN noise.
While the foregoing disclosure sets forth various implementations using specific block diagrams, flowcharts, and examples, each block diagram component, flowchart step, operation, and/or component described and/or illustrated herein can be implemented, individually and/or collectively, using a wide range of hardware, software, or firmware (or any combination thereof) configurations. In addition, any disclosure of components contained within other components should be considered example in nature since many other architectures can be implemented to achieve the same functionality.
In some examples, all or a portion of example system 100 in
In various implementations, all or a portion of example system 100 in
According to various implementations, all or a portion of example system 100 in
In some examples, all or a portion of example system 100 in
The process parameters and sequence of steps described and/or illustrated herein are given by way of example only and can be varied as desired. For example, while the steps illustrated and/or described herein can be shown or discussed in a particular order, these steps do not necessarily need to be performed in the order illustrated or discussed. The various example methods described and/or illustrated herein can also omit one or more of the steps described or illustrated herein or include additional steps in addition to those disclosed.
While various implementations have been described and/or illustrated herein in the context of fully functional computing systems, one or more of these example implementations can be distributed as a program product in a variety of forms, regardless of the particular type of computer-readable media used to actually carry out the distribution. The implementations disclosed herein can also be implemented using modules that perform certain tasks. These modules can include script, batch, or other executable files that can be stored on a computer-readable storage medium or in a computing system. In some implementations, these modules can configure a computing system to perform one or more of the example implementations disclosed herein.
The preceding description has been provided to enable others skilled in the art to best utilize various aspects of the example implementations disclosed herein. This example description is not intended to be exhaustive or to be limited to any precise form disclosed. Many modifications and variations are possible without departing from the spirit and scope of the present disclosure. The implementations disclosed herein should be considered in all respects illustrative and not restrictive. Reference should be made to the appended claims and their equivalents in determining the scope of the present disclosure.
Unless otherwise noted, the terms “connected to” and “coupled to” (and their derivatives), as used in the specification and claims, are to be construed as permitting both direct and indirect (i.e., via other elements or components) connection. In addition, the terms “a” or “an,” as used in the specification and claims, are to be construed as meaning “at least one of.” Finally, for ease of use, the terms “including” and “having” (and their derivatives), as used in the specification and claims, are interchangeable with and have the same meaning as the word “comprising.”