The technology of the disclosure relates generally to reducing voltage droop in an integrated circuit and, more particularly, to minimizing current spikes by controlling circuit switching.
To reduce the package sizes of technologies employed for high-performance processing capabilities, the number of processing circuits provided in an integrated circuit (IC) chip has continued to increase. Communication of data among the many processing circuits can create congestion in the IC chip. One approach to handling data processed by the many processing circuits is to employ a mesh network in which each of the processing circuits is coupled to a node of the network and data is passed from node to node over segments of the network. In the nodes, the number of circuits switching due to data transmissions in a given system clock cycle varies from node to node depending on the respective processing circuits. Thus, the power needs among the nodes can shift frequently and the power levels in regions of the IC chip can rise suddenly. In such situations, the demand for current on the power rail providing a power supply voltage to these regions increases suddenly. Capacitance of the power distribution network within the IC chip may discharge in response to a sudden current increase, causing a voltage level on the power supply rail to droop temporarily. To avoid having the power supply voltage on the power rail drop below a minimum voltage, below which the processing circuits may not continue to operate normally, the nominal voltage level maintained on the power rail may be constantly maintained at a higher level to provide a voltage margin. However, maintaining a higher nominal voltage level on the power rail increases the power consumption of the IC chip, which may cause heat-related problems and will reduce battery life in mobile devices. Circuits and methods for avoiding voltage droop in the nodes in a mesh network without simply increasing power supply voltage to the entire IC chip would save power and avoid excessive heat generation.
Aspects disclosed in the detailed description include configurable mesh network node aggregation for mitigating voltage droop in an integrated circuit (IC) chip. Related methods of configurably aggregating mesh network nodes to mitigate voltage droop are also disclosed. A sudden increase in the demand for current in a node in a mesh network on an IC chip, known as a di/dt event, can cause a droop in the power supply voltage in a power supply rail coupled to the node. This problem may occur in individual nodes or in regions of the IC chip due to data transmissions among multiple adjacent nodes on the mesh network. The IC chip may have multiple such regions, which can be identified through testing. Exemplary aggregation circuits provided in each of the nodes of the IC chip can be employed to, based on indications of power consumption in an aggregation zone of the IC chip, reduce power consumption in the nodes in the aggregation zone to mitigate voltage droop. In particular, each aggregation zone includes a first node (also referred to herein as a “leader” node) that receives indications of power consumption associated with the first node and indications of power consumption associated with each of the other nodes in the aggregation zone. The first node generates a control signal based on the received indications, and each of the plurality of nodes in the aggregation zone reduces power consumption based on the control signal. In some examples, the aggregation circuit in any node may be configured to operate in a first, leader mode or in a second, follower mode, providing flexibility in the configuration of aggregation zones. In some examples, the aggregation circuits in some nodes in an aggregation zone are configured in a third, middle mode that receives the indications of power consumption from nodes in the second, follower mode and provides the indications to the first node. 
In addition, in such examples, the nodes in the third, middle mode can receive the control signal from the first node and provide the control signal to the nodes in the second, follower mode.
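The relay role of the third, middle mode can be illustrated with a minimal software sketch. This is a behavioral model only, not the disclosed circuit: the class names, the integer encoding of the indications, and the sum-and-compare rule in `leader_decide` are all assumptions made for illustration.

```python
from typing import List


class FollowerNode:
    """Second (follower) mode: reports an indication, obeys the control signal."""

    def __init__(self, indication: int):
        self.indication = indication  # hypothetical integer activity level
        self.throttled = False

    def apply_control(self, control: bool) -> None:
        self.throttled = control


class MiddleNode:
    """Third (middle) mode: relays indications up and the control signal down."""

    def __init__(self, indication: int, followers: List[FollowerNode]):
        self.indication = indication
        self.followers = followers
        self.throttled = False

    def indications_for_leader(self) -> List[int]:
        # Assumption: follower indications are passed through unchanged,
        # alongside the middle node's own indication.
        return [self.indication] + [f.indication for f in self.followers]

    def apply_control(self, control: bool) -> None:
        # The middle node acts on the control signal and forwards it.
        self.throttled = control
        for follower in self.followers:
            follower.apply_control(control)


def leader_decide(own_indication: int, received: List[int], threshold: int) -> bool:
    """Assumed policy: assert the control signal when total activity exceeds a threshold."""
    return own_indication + sum(received) > threshold
```

In this model the leader never communicates with the follower nodes directly. Everything it learns about them, and everything it tells them, passes through the middle node.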
In this regard, an IC chip is disclosed. The IC chip includes a plurality of nodes in a mesh network. The IC chip further includes a first aggregation zone comprising a first node and at least a second node of the plurality of nodes, wherein each node of the plurality of nodes comprises an aggregation circuit configured to receive a first indication of power consumption associated with the node; the aggregation circuit in the first node is configured to, in response to operating in a first mode, receive at least a second indication of power consumption associated with each of the at least a second node and provide a first control signal based on the first indication and the at least a second indication to each node of the at least a second node; and the aggregation circuit in each of the first node and the at least a second node is configured to reduce power consumption in the node in response to the first control signal.
In another aspect, a method in an IC chip is disclosed. The method includes, in a first aggregation zone comprising a first node and at least a second node of a plurality of nodes in a mesh network, receiving, in an aggregation circuit in each node of the plurality of nodes, a first indication of power consumption associated with the node. The method further includes, in response to the aggregation circuit in the first node being configured to operate in a first mode, receiving, in the first node, at least a second indication of power consumption associated with each of the at least a second node and providing a first control signal based on the first indication and the at least a second indication to each node of the at least a second node. The method further includes, in the aggregation circuit in each of the first node and the at least a second node, reducing power consumption in the node in response to the first control signal.
With reference now to the drawing figures, several exemplary aspects of the present disclosure are described. The word “exemplary” is used herein to mean “serving as an example, instance, or illustration.” Any aspect described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects.
Before describing exemplary aspects of an aggregation circuit 310 and 400 in
The IC chip 100 may be a system-on-chip (SOC) and may include many processing circuits (not shown here) that are each coupled to one of the plurality of nodes 102(1)-102(X) (referred to collectively as nodes 102) in a mesh network 104. The nodes 102 are coupled to each other by segments 106 of the mesh network 104. That is, each of the segments 106 is coupled between a first node 102(1) and a second node 102(2) that are adjacent to each other, for example. Each of the nodes 102(1)-102(X) is coupled to at least two segments of the mesh network 104. The segments 106 have a significant capacitance due to their length. Therefore, switching the data that is transmitted on the segments 106 in each cycle of a system clock CLK causes a significant amount of power to be consumed by driver circuits (not shown) that drive the data. The system clock CLK is employed to trigger the flow of data through sequential circuits and provide synchronization in the IC chip 100.
Consuming a significant amount of power in any of the nodes 102 in a short period of time (e.g., a high power consumption rate) imposes a demand for a high level of current to be provided to those nodes 102. As noted above, data transmissions on the segments 106 are a significant source of such power consumption. In circumstances in which there is a sudden increase from a low power consumption state, in which there may be infrequent data transmissions, to a higher power consumption state, where data transmissions are occurring in every cycle or almost every cycle, there may be a sudden increase in the current demanded from a power rail that provides power to the nodes 102 transmitting data. A sudden increase in current (i) in a short period of time (t), known as a di/dt event, in a node 102 or group of nodes 102 in a region of the IC chip 100 may discharge capacitance of a power distribution network in the region of the IC chip 100 or within the entire IC chip 100 before a power management circuit (not shown) external to the IC chip 100 can respond and provide power at a higher rate.
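As a rough illustration of why a di/dt event matters, the voltage lost on the local capacitance before the external supply responds can be estimated from the capacitor relation I = C·dV/dt. The sketch below is a first-order approximation only: it assumes the extra current is constant over the response time and ignores the inductance and resistance of the power distribution network. The function name and the numeric values in the usage note are hypothetical and do not come from the disclosure.

```python
def estimate_droop_volts(delta_current_a: float,
                         response_time_s: float,
                         capacitance_f: float) -> float:
    """Estimate voltage droop as (extra charge drawn) / (local capacitance).

    delta_current_a: step increase in current demand, in amperes
    response_time_s: time before the external supply can respond, in seconds
    capacitance_f:   capacitance available to supply the charge, in farads
    """
    if capacitance_f <= 0:
        raise ValueError("capacitance must be positive")
    # Q = dI * dt is the charge removed from the capacitance; dV = Q / C.
    return delta_current_a * response_time_s / capacitance_f
```

For example, with an assumed 2 A step sustained for 10 ns against 200 nF of local capacitance, `estimate_droop_volts(2.0, 10e-9, 200e-9)` evaluates to approximately 0.1 V.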
As a result of the discharge of capacitance in the power distribution network, the power supply voltage on the power rail may suddenly drop when the available charge is consumed. When voltage in the processing circuits drops below a minimum threshold, the processing circuits and data drivers may not operate as expected, which can cause malfunctions in the IC chip 100. Such a drop in the power supply voltage is known as a voltage droop and is discussed in more detail with reference to
A power management chip or voltage regulator (not shown) may be coupled to the IC package substrate for providing power to the IC chip 100. The power management chip is typically located farther from the IC chip 100 than the first-level power capacitors. Adjacent to the power management chip, the power distribution network providing power to the IC chip 100 includes one or more second-level power capacitor(s) having even greater capacitance than the first-level power capacitor(s). The second-level power capacitor(s) may also discharge before the power management chip to which they are coupled can provide charge to meet the current level demanded in the IC chip 100. The transition from the discharge of the second-level power capacitor(s) to a stabilization voltage VST of the power supply voltage VPS is shown in
As shown in the example in
The processing circuits 306 may include any kind of processor, processor core, and/or data storage circuits (e.g., cache memories or register files). The processing circuits 306 may quickly process large amounts of data based on sequences of instructions. The instructions and raw data for processing are transferred into the node 300 from another node before being provided to the processing circuits 306, and the data processed by the processing circuits 306 is transferred from the node 300 to another node over the mesh network 304 for further processing or storage. Thus, processing activity and events in the processing circuits 306 cause traffic (data transfers) on the segments 302 coupled to the node 300. In addition to the traffic due to the processing circuits 306, data may also be transferred through the node 300 (e.g., node 102(2) in
In this regard, power consumption in and around the node 300 may be primarily due to the processing circuits 306 as well as traffic through the node 300. In particular, a large portion of the power consumption associated with the node 300 may be due to data transmissions (e.g., data egresses) on the segments 302 from the node 300 because of the power required to charge the long wires of the segments 302 from the node 300 to an adjacent node (see
The aggregation circuit 400 includes a plurality of inputs 402 and outputs 404 that are described with reference to the respective modes in which the aggregation circuit 400 may be configured to operate. The aggregation circuit 400 includes a configuration register 406 that may be configured to selectively control the aggregation circuit 400 to operate in one of a first mode, a second mode, and a third mode. The configuration register 406 may be configured to include information about the aggregation zone in which the aggregation circuit 400 is included. The configuration register 406 may be programmed to include threshold information that is compared to indications of power consumption in the node 300 and, in some examples, also to indications of power consumption in other nodes in the aggregation zone. The aggregation circuit 400 includes a zone circuit 408 that receives signals on the inputs 402 and generates signals (described below) on the outputs 404. The aggregation circuit 400 also includes storage circuits 410P and 410C and selectors 412P and 412C. Signals 414P and 414C received at the inputs 402 may be employed to generate signals 416P and 416C on outputs 404. In some first examples, the signals 416P and/or 416C generated on the outputs 404 may be generated by the zone circuit 408 in a same cycle as the signals 414P and/or 414C are received at the inputs 402. In some second examples, the signals 416P and 416C are generated on the outputs 404 in a next cycle after the signals 414P and 414C are received on the inputs 402. In these second examples, the storage circuits 410P and 410C are provided to store the signals 416P and 416C for one or more cycles as needed. The selectors 412P and 412C are controlled by signals 418P and 418C to provide the signals 416P and 416C directly from the zone circuit 408 or from the storage circuits 410P and 410C. The storage circuits 410P and 410C may be employed due to timing constraints.
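The mode selection and the choice between same-cycle and next-cycle outputs described above can be modeled behaviorally, as in the following sketch. It is not a register-level description of the aggregation circuit 400: the field names in `ConfigRegister`, the boolean `use_storage` flag (standing in for the selector control signals 418P and 418C), and the reset value are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    LEADER = 1    # first mode
    FOLLOWER = 2  # second mode
    MIDDLE = 3    # third mode


@dataclass
class ConfigRegister:
    """Hypothetical layout of the programmable configuration."""
    mode: Mode
    zone_id: int       # which aggregation zone the node belongs to
    threshold: int     # compared against indications of power consumption
    use_storage: bool  # True: drive outputs from storage (next cycle)


class OutputStage:
    """One output path: zone-circuit value, storage element, and selector."""

    def __init__(self, use_storage: bool, reset_value: int = 0):
        self.use_storage = use_storage
        self.stored = reset_value

    def clock(self, zone_value: int) -> int:
        """Advance one clock cycle and return the value driven on the output."""
        if self.use_storage:
            out = self.stored         # value captured in the previous cycle
            self.stored = zone_value  # capture the new value for the next cycle
            return out
        return zone_value             # same-cycle path, storage bypassed
```

With `use_storage=False`, the input sequence 1, 0, 1 appears on the output unchanged. With `use_storage=True` and a reset value of 0, the same sequence produces 0, 1, 0, since each value appears one cycle late.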
The aggregation circuit 400 in each of the plurality of nodes 502(1)-502(N) may be configured to receive an indication 514 of power consumption associated with the respective one of the nodes 502(1)-502(N). As an example, the aggregation circuit 400 in the node 502(1) receives the indication 514 of power consumption associated with the node 502(1). The indication 514 of power consumption associated with the node 502(1) may include an indication of a power supply voltage VPS on a power rail (not shown here) that provides power to the node 502(1). In this regard, the IC chip 500 may include a voltage comparator 516 configured to compare the power supply voltage VPS on the power rail to one or more threshold voltages and generate an output indicating a voltage level, that is, whether the power supply voltage VPS is higher or lower than a threshold or is within a range between thresholds. The indication 514 may include the output from the voltage comparator 516.
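The comparison of the power supply voltage against one or more thresholds can be sketched as follows. The encoding of the output as a count of crossed thresholds is an assumption, as are the function name and the threshold values used in the usage note. The disclosure states only that the output indicates a voltage level.

```python
from typing import Sequence


def voltage_level(vps: float, thresholds: Sequence[float]) -> int:
    """Return how many thresholds the supply voltage has fallen below.

    0 means vps is at or above every threshold; larger values indicate
    a deeper droop. With two thresholds, 1 means vps lies between them.
    """
    return sum(1 for threshold in thresholds if vps < threshold)
```

With assumed thresholds of 0.80 V and 0.75 V, a supply voltage of 0.85 V yields level 0, 0.78 V yields level 1, and 0.70 V yields level 2.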
Alternatively, the indication 514 may include or be based on one or more indications of one or more activities or events related to a data transmission from the node 502(1). For example, the indication 514 may be an indication of activities or events occurring or scheduled to occur in the processing circuits 306 shown in
In some examples, the aggregation circuit 400 in the node 502(1) may also receive indications 518 of activity or events in nodes adjacent to the node 502(1) (e.g., coupled to the node 502(1) by one of the segments 506), because activity and/or events in an adjacent node may be indicative of data transmissions that will also be occurring in the node 502(1). The indication 518 is one of the signals 414P that may also be received at the inputs 402 of the aggregation circuit 400 in
In the plurality of nodes 502(1)-502(N), in the absence of the aggregation circuits 400, the indications 514 and 518 may be employed to identify circumstances in which a di/dt event may be occurring or may occur, and in which the power consumption or rate of change of power consumption needs to be reduced. In such situations, the plurality of nodes 502(1)-502(N) may, as explained further below, reduce power consumption by inhibiting data transmissions on the segments 506. Reduction of power consumption may also be realized by reducing other activity in the plurality of nodes 502(1)-502(N), and the aggregation circuits 400 are not limited in this regard.
However, the plurality of nodes 502(1)-502(N), each including an aggregation circuit 400, may communicate cooperatively to mitigate voltage droops in a region or regions of the IC chip 500 by configuring the aggregation circuits 400 in aggregation zones 504(1)-504(Z), as follows. In the aggregation zone 504(1), the aggregation circuit 400 of the leader node 510 is configured to operate in the first mode (the leader mode), and the aggregation circuit 400 of each of the follower nodes 512(1)-512(F) is configured to operate in the second mode (the follower mode). It should be understood that the aggregation circuit 400 in any of the nodes 502(1)-502(N) may be configured to operate in either the first mode or the second mode, thus making many different aggregation zone configurations possible. It should also be understood that operations of the aggregation circuit 400 in the leader node 510 and the follower nodes 512(1)-512(F) may be referred to as the operations of the leader node 510 and the follower nodes 512(1)-512(F).
Once configured in the second mode of operation, rather than responding directly to the indications 514 and 518 by reducing their own data transmissions, each of the follower nodes 512(1)-512(F) generates, in the zone circuit 408 of the aggregation circuit 400, an indication 520 of the power consumption and provides the indication 520 on the outputs 404 of the aggregation circuit 400. In this example, the indication 520 is provided to the leader node 510 in the aggregation zone 504(1). The indication 520 may be one of the signals 416P in
In the first mode of operation, in addition to receiving its own indications 514 and 518, the leader node 510 receives the indications 520 of all the follower nodes 512(1)-512(F). Based on the indications 514, 518, and 520 received in the leader node 510, the aggregation circuit 400 in the leader node 510 generates a control signal 522 to reduce power consumption in the leader node 510 and all the follower nodes 512(1)-512(F). In other words, the control signal 522 may be employed to reduce power consumption in each of the plurality of nodes 502(1)-502(N) that are in the aggregation zone 504(1). The control signal 522 may be generated in the zone circuit 408 in
As in the aggregation zone 504(1), each of the aggregation zones 504(2)-504(Z) includes a respective leader node operating in the first mode and follower nodes operating in the second mode, and power consumption in the leader node and the follower nodes is reduced in response to a control signal provided by the leader node.
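The leader and follower interaction within one aggregation zone can be summarized in a short behavioral sketch. The disclosure does not specify how the leader combines the indications it receives, so the sum-and-compare policy below is an assumption, as are the `Node` class, its integer `activity` field, and the function names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    activity: int            # hypothetical indication of power consumption
    throttled: bool = False  # True while the node is reducing power consumption


def leader_control(leader: Node, followers: List[Node], zone_threshold: int) -> bool:
    """Leader (first mode): combine its own and all follower indications."""
    total = leader.activity + sum(f.activity for f in followers)
    # Assumed policy: assert the control signal when zone activity is too high.
    return total > zone_threshold


def run_zone_cycle(leader: Node, followers: List[Node], zone_threshold: int) -> bool:
    """One evaluation: the leader decides, then every node in the zone obeys."""
    control = leader_control(leader, followers, zone_threshold)
    for node in [leader, *followers]:
        node.throttled = control
    return control
```

Because the decision is made once per zone rather than once per node, a burst of activity confined to one follower does not force throttling unless the zone as a whole exceeds its threshold under the assumed policy.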
Aggregation zone 804(1) includes a first, leader node 806 in which the aggregation circuit 400 (
In the aggregation circuit 400 in the middle node 810, the configuration register 406 is configured to selectively control the aggregation circuit 400 to operate in a third mode. It should be recognized that the aggregation circuit 400 in any of the nodes 802(1)-802(N) may be configured in any one of the first mode, the second mode, or the third mode, depending on the aggregation zones 804(1)-804(Z). In the third mode, the aggregation circuit 400 in the middle node 810 also receives indications 514 and 518, as shown in
In this example, the processor 1002 represents one or more general-purpose processing circuits, such as a microprocessor, central processing unit, or the like. The processor 1002 is configured to execute processing logic in instructions for performing the operations and steps discussed herein. In this example, the processor 1002 includes an instruction cache 1006 for temporary, fast-access storage of instructions accessible by the instruction processing circuit 1004. Fetched or prefetched instructions from a memory, such as from the cache memory 1012 over a system bus 1010, are stored in the instruction cache 1006. The instruction processing circuit 1004 is configured to process the instructions fetched into the instruction cache 1006 for execution.
The processor 1002 and the cache memory 1012 are coupled to the system bus 1010 and can intercouple peripheral devices included in the processor-based system 1000. As is well known, the processor 1002 communicates with these other devices by exchanging address, control, and data information over the system bus 1010. For example, the processor 1002 can communicate bus transaction requests to a memory controller 1014 in the main memory 1008 as an example of a slave device. Although not illustrated in
Other devices can be connected to the system bus 1010. As illustrated in
The processor-based system 1000 in
The embodiments disclosed herein include various steps. The steps of the embodiments disclosed herein may be performed by hardware components or may be embodied in machine-executable instructions, which may be used to cause a general-purpose or special-purpose processor programmed with the instructions to perform the steps. Alternatively, the steps may be performed by a combination of hardware and software.
The embodiments disclosed herein may be provided as a computer program product or software that may include a machine-readable medium (or computer-readable medium) having stored thereon instructions, which may be used to program a computer system (or other electronic devices) to perform a process according to the embodiments disclosed herein. A machine-readable medium includes any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer). For example, a machine-readable medium includes a machine-readable storage medium (e.g., ROM, random access memory (“RAM”), a magnetic disk storage medium, an optical storage medium, flash memories, etc.), and the like.
Unless specifically stated otherwise and as apparent from the previous discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing,” “computing,” “determining,” “displaying,” or the like refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission, or display devices.
Those of skill in the art will further appreciate that the various illustrative logical blocks, modules, circuits, and algorithms described in connection with the aspects disclosed herein may be implemented as electronic hardware, instructions stored in memory or in another computer-readable medium wherein any such instructions are executed by a processor or other processing device, or combinations of both. The devices and components described herein may be employed in any circuit, hardware component, integrated circuit (IC), or IC chip, as examples. Memory disclosed herein may be any type and size of memory and may be configured to store any type of information desired. To clearly illustrate this interchangeability, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. How such functionality is implemented depends upon the particular application, design choices, and/or design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
The various illustrative logical blocks, modules, and circuits described in connection with the aspects disclosed herein may be implemented or performed with a processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices (e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration).
The aspects disclosed herein may be embodied in hardware and in instructions that are stored in hardware and may reside, for example, in Random Access Memory (RAM), flash memory, Read Only Memory (ROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), registers, a hard disk, a removable disk, a CD-ROM, or any other form of computer-readable medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a remote station. Alternatively, the processor and the storage medium may reside as discrete components in a remote station, base station, or server.
It is also noted that the operational steps described in any of the exemplary aspects herein are described to provide examples and discussion. The operations described may be performed in numerous different sequences other than the illustrated sequences. Furthermore, operations described in a single operational step may actually be performed in a number of different steps. Additionally, one or more operational steps discussed in the exemplary aspects may be combined. It is to be understood that the operational steps illustrated in the flowchart diagrams may be subject to numerous different modifications, as will be readily apparent to one of skill in the art. Those of skill in the art will also understand that information and signals may be represented using any of a variety of different technologies and techniques. For example, data, instructions, commands, information, signals, bits, symbols, and chips that may be referenced throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the spirit or scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.