The present technology relates to power management in a semiconductor device.
In semiconductor technology, there is a limited supply of power which is available at a given time. In some cases, multiple die share a common power supply and require current to perform respective operations. If the requested current is not available, the operations may be corrupted or delayed.
FIG. 8B1 depicts an example arbitration process consistent with
FIG. 8B2 depicts a time line of an arbitration process, consistent with
FIG. 9A1 depicts a tree showing a priority threshold of selected (device, wait state) pairs in a binary search arbitration process where there are 32 possible (device, wait state) pairs.
FIG. 9A2 depicts a tree showing a priority threshold of selected (device, wait state) pairs in a binary search arbitration process where there are 16 possible (device, wait state) pairs.
Techniques are provided for efficiently managing the use of a power supply among competing devices. In one approach, the devices are separate die (or chips) in a multi-die stack or other multi-die package. Corresponding apparatuses are also provided.
There are various examples of electronic devices which share a common power supply. One example is multiple die in a semiconductor circuit. The die have contacts or connection points to the power supply, such as a pin or bond pad. In one approach, the die are in respective packages and each package has a pin which connects to the power supply. In another approach, multiple die are in one package and each die has a bond pad which connects to a common pin in a package, and that pin connects to the power supply. The contacts of the die may therefore be internal to the package.
One example of a die is used in a memory device and includes an array of memory cells. Other examples of die comprise integrated circuits which do not include a memory array. The die may have pins or other contacts for other purposes such as inter-die communications. In semiconductor manufacturing, a die is the area of the silicon wafer on which a functional circuit is fabricated. Many hundreds of identical dies are fabricated on each wafer. The term “die” can represent a single area of the silicon wafer or multiple areas of the silicon wafer. The term “dice” can also represent multiple areas of the silicon wafer.
Other examples of devices include peripherals that share a common power line. Peripherals can include a card for PCI Express or PCIe (Peripheral Component Interconnect Express), which is a high-speed serial computer expansion bus, and USB (Universal Serial Bus) devices on a common USB bus. The techniques are applicable to electronic devices that share a common bus which provides power and has a power budget. Typically, each electronic device has a dedicated contact such as a pin or bond pad that is connected to the power supply.
The peak current specification of a device is the maximum amount of current which is available. When there are multiple current-consuming devices, the current should be efficiently allocated among the different devices. The peak current specification may be violated if there are simultaneous high current operations. This can lead to a malfunction in the devices. For example, for a memory die, a lack of sufficient current can lead to an error in a read or write operation. The peak current specification sets a limit on the number of devices that can operate in parallel, impacting the system performance.
One approach is to use a central controller to schedule operations in the devices. For example, the controller can delay a request to one die until after another die has completed a requested action. Scheduling can be done on a predictive basis by being aware of the total current requirement across dies or by monitoring the real time impact of peak current. However, this requires additional communication between the controller and the devices and increases the processing burden of the controller. Moreover, the device may require an additional contact to receive a synchronizing signal from the controller. Furthermore, the controller cannot access internal operations of a die which are sequenced by the on-chip state machine. Even if the controller skews certain operations such as read/program/erase on different dies, the internal high current operations may again align in time, causing a violation of the system Icc specification. In short, the controller cannot predict the timing of internal high current operations for a given command.
Techniques for on-chip management of peak current are proposed herein which address the above and other issues. In one aspect, each device in a set of devices independently determines whether there is a sufficient amount of system current to enter a higher-current state. An initial determination can be made based on the available system current and an estimate of the current consumed in the higher-current state. This initial determination can be made internally within the device without initially affecting a load bus which is shared among the die. If the initial determination is successful, the device sources (adds or pulls up) a current on the load bus to signal to other devices that there will be a reduction in the available system current. The amount of current is equal to an estimated current consumption of the higher-current state. In case of a conflict with another device which concurrently sources current to the load bus, each device can independently perform an arbitration process to resolve the conflict. Example arbitration processes include linear and binary search algorithms in which each device has a priority based on its address and a count of a number of times it failed in the arbitration process. Random delay based arbitration is another example.
If the initial determination of whether there is additional available system current to enter a higher-current state is unsuccessful, the device enters a wait state and does not source an additional current on the load bus. This reduces the probability of conflicts, allowing the device to enter the higher-current state sooner. Performance can be improved by enabling more devices to operate in parallel and by reducing wait time for scheduling of internal operations. Various other features and benefits will be apparent in view of the following discussion.
In one approach, the devices are die which are connected in a stack and share common I/O pins and a power bus. The number of dies can be, e.g., 4, 8, 16 or 32. Some systems could have decimal die stacks. This system connects to a host through a frontend interface. A goal is to manage the operations across devices so as to control the peak current consumption from the common power supply which is shared across all devices and the controller.
In one approach, each device comprises input/output (I/O) contacts to receive/transmit commands and data, contacts for support functions (e.g., power, chip enable), other contacts which may be used only in test modes, an on-chip state machine which controls the internal operations of the chip, and other supporting circuits such as regulators, charge pumps, and oscillators. In one example, a memory device includes a memory array to store data, and data path circuits to read/write data from I/O circuits to the memory array. One example of a memory device comprises memory cells arranged in a NAND configuration. See also
Each device may have a contact which communicates information regarding the current consumed by the device to all other devices. Each device may have a current detection circuit which judges whether the total current of all devices is within a peak current specification limit. An on-chip state machine may be provided which uses a flag output from the current detection circuit to schedule the internal operations of the device.
The state machine may provide logical values such as Sys_Peak_Icc and Bin_Peak_Icc on paths 304 and 305, respectively, to the Icc detection circuit. The Icc detection circuit may provide a flag FLG to the state machine on a path 306. Sys_Peak_Icc is the peak current specification of the power supply on the power supply line. This may be unique to a given system. Icc denotes current. Sys_Peak_Icc can be a three-bit value which is provided to the state machine by the controller. All the die or other devices connected in a die stack or other configuration can have the same value of Sys_Peak_Icc. See
Bin_Peak_Icc is an estimate made by the state machine of the current consumption of a present or future (next desired) state of the device. The state machine may have information such as a table which associates an estimated current consumption with each state of a plurality of available states that the state machine may enter. The state machine knows the present state and, in some cases, the next desired state. The functionality of the state machine could be performed by another entity such as a microcontroller. Bin_Peak_Icc can be a two-bit value such as depicted in
In the thirteen blocks 410, eleven of the blocks are shaded and these represent the current consumed by different devices. A common voltage on the load bus is sensed by the contact of each device, where this voltage is proportional to a sum of the currents in the multiple devices. For example, the blocks 411 represent Bin_Peak_Icc<1:0> in device 0, the blocks 412 represent Bin_Peak_Icc<1:0> in device 1, the block 413 represents Bin_Peak_Icc<1:0> in device 2, the blocks 414 represent Bin_Peak_Icc<1:0> in device 3 through device n−1, and the block 415 represents Bin_Peak_Icc<1:0> in device n. Each block represents a unit of current.
As part of the comparison circuit, the comparator 525 receives Vspec at one input and Vcontact at another input. If Vspec>=Vcontact, FLG=0. If Vspec<Vcontact, FLG=1. The comparator includes an inbuilt offset to ensure that FLG=0 when Vspec=Vcontact. FLG is input to the state machine 301. Outputs of the state machine include multi-bit codes including Sys_Peak_Icc<2:0> and Bin_Peak_Icc<1:0>. Sys_Peak_Icc<2:0> is provided on a path 510. With a three bit value, one bit is provided to transistors 514, another bit is provided to transistors 515 and another bit is provided to transistors 516 to set a current at a node 518. An additional current branch may be included as part of 513 to introduce an offset to the comparator. This ensures that the comparator gives an output of FLG=0 when Vspec=Vcontact. Vspec is provided based on this current and a resistor 517. Transistors 511 and 512 are used to generate a current which is mirrored to transistors 513. The gate of transistor 512 is an analog voltage which is generated by using an NMOS diode connected transistor in series with an on-chip current source. In one configuration, this on-chip current source may be temperature compensated for higher accuracy.
The adjusted system specification current (Sys_Peak_Icc<2:0>) is represented by a multi-bit code, and the comparison circuit is configured to generate a current based on each bit of the multi-bit code and sum the currents to provide the comparison voltage at an input to a comparator. For example, currents generated by the transistors 514, 515 and 516 are summed at the node 518. A current generated by the transistors 513 is also summed at the node 518. The resistor may be adjustable and trimmed. Vspec may be proportional to Sys_Peak_Icc<2:0>.
The comparator may have a wide input common mode voltage range, and may be designed to compare the voltage on the contact with the reference voltage, Vspec. The comparator may operate across a common mode range of, e.g., 0.5 V to 1.5 V. The output of the comparator (FLG) is an input to the on-chip state machine which does the scheduling of internal operations.
Bin_Peak_Icc<1:0> is provided on a path 520. With a two bit value, one bit is provided to transistors 521, and another bit is provided to transistors 522 to set a current at a node 519. This is a source current of the contact 111 which represents an estimate of the current used by the device in the present state or next state of the device. This current increases the current on the load bus and contact. Vcontact is the voltage of the contact and load bus.
The contact which is connected to all devices in the stack may have a pull down resistor (e.g., 2 kΩ) 523 in one of the devices. Using a switch 524, the resistor can be connected on the device with chip address 0. Each device dumps a current on this node. The magnitude of this current is proportional to the Icc state of the device in a present state or a next state (represented by Bin_Peak_Icc).
This current may be generated by mirroring a constant current with a zero temperature coefficient. A zero temperature coefficient current reference is generally available on-chip for other operations. In case a current source with a zero temperature coefficient is not available on-chip, a current reference without temperature compensation can be used. This introduces a minimal error because the temperature variation across devices in a given system is expected to be small (e.g., an error of +/−1% for a temperature difference of +/−5° C. across devices).
The voltage level on the contact (Vcontact) is proportionate to the sum of currents dumped on this node by each device. Hence it is proportionate to the sum of Icc consumed by each device. This voltage is compared to a reference voltage (Vspec) to judge whether or not the total system current is within the specification.
The state machine, to source current onto the load bus, is configured to generate a multi-bit or single-bit word (Bin_Peak_Icc<1:0>) representing the current consumption of the next state, to generate a current based on each bit of the multi-bit code and sum the generated currents. For example, currents generated by the transistors 521 and 522 are summed at the node 519.
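The summing of per-bit currents and the resulting contact voltage can be illustrated with a short sketch. The binary weighting of the two branches, the 50 µA unit current and the function names are assumptions made only for illustration; the 2 kΩ pull-down value is the example value given above.

```python
UNIT_CURRENT_UA = 50      # assumed current per unit of Bin_Peak_Icc, in microamps
R_CONTACT_OHM = 2000      # example pull-down resistor on the contact (2 kOhm)

def source_current_ua(bin_peak_icc):
    """Sum one branch current per set bit of a two-bit Bin_Peak_Icc code."""
    bit0 = bin_peak_icc & 1          # branch assumed to carry 1 unit
    bit1 = (bin_peak_icc >> 1) & 1   # branch assumed to carry 2 units
    return (bit0 * 1 + bit1 * 2) * UNIT_CURRENT_UA

def contact_voltage_uv(codes):
    """Vcontact in microvolts: pull-down resistance times the summed currents."""
    total_ua = sum(source_current_ua(code) for code in codes)
    return total_ua * R_CONTACT_OHM   # microamps x ohms = microvolts

# Four devices in states 01, 10, 00 and 11 source 6 units in total.
print(contact_voltage_uv([0b01, 0b10, 0b00, 0b11]))  # 600000 microvolts (0.6 V)
```

Integer microamps and microvolts are used so that the arithmetic is exact; an actual circuit would of course be subject to mirror mismatch and resistor tolerance.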
The reference voltage is internally generated on each device. Each device has a pull down resistor connected to this node. The value of this resistor is chosen to be ten times that of the resistor connected to the contact (e.g., 20 kΩ). This is done to reduce current consumed by the Icc detection circuit on each device. It is trimmed to a value of 20 kΩ during testing in order to eliminate process variations. Temperature variations can be ignored as the temperature variation across devices is expected to be minimal.
A constant current proportionate to the system Icc specification is dumped on this node. The current is mirrored from a constant current source and is proportionate to Sys_Peak_Icc. A half LSB current is always dumped on this node when the circuit is on. This ensures that when current dumped on Vspec is exactly equal to Vcontact, FLG=0 so that there is no ambiguity in output level. It also reduces the reference error to +/− half LSB. Without this, error is 0 to −1 LSB.
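The effect of the half LSB offset can be sketched numerically. In this hypothetical model both Vspec and Vcontact are expressed in integer half-LSB steps, which is a normalization chosen for illustration rather than a description of the actual circuit; the function name and the `None` return for the ambiguous case are likewise assumptions.

```python
def flag(sys_units, contact_units, half_lsb_offset=True):
    """Return FLG: 0 if the reference exceeds the contact level, else 1.

    Levels are expressed in half-LSB steps so the comparison stays in integers.
    """
    vspec = 2 * sys_units + (1 if half_lsb_offset else 0)
    vcontact = 2 * contact_units
    if vspec == vcontact:
        return None   # only possible without the offset: output would be ambiguous
    return 0 if vspec > vcontact else 1

print(flag(6, 6))                         # 0: offset resolves the equal case
print(flag(6, 7))                         # 1: contact is one LSB above the reference
print(flag(6, 6, half_lsb_offset=False))  # None: ambiguous without the offset
```

With the offset, the reference always sits midway between two possible contact levels, so the two inputs can never be exactly equal in this model.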
This circuit compares an internal voltage, Vspec, to a voltage on a contact. In other cases, another value such as a current can be compared. Generally, each device may have a comparison circuit to compare a comparison value to a value of such a contact.
It is also possible for the state machine to enter a new state on its own. A decision step 551 determines if additional current is required. This can involve determining if Bin_Peak_Icc(new)>Bin_Peak_Icc(present). If decision step 551 is false, the device directly enters the new state at step 552 and, at step 552a, updates Vcontact by applying a current based on Bin_Peak_Icc. This is a smaller current than used for the previous state so that Vcontact will decrease, signaling to the other devices that additional current is available.
If decision step 551 is true, step 553 sets Sys_Peak_Icc and Vspec is updated accordingly. In one approach, the present value of Sys_Peak_Icc is decreased by the amount of the additional current (Bin_Peak_Icc(new)−Bin_Peak_Icc(old)). Sys_Peak_Icc is used to set Vspec, as discussed. At decision step 554, if Vspec>Vcontact, the device updates Vcontact at step 555 by applying a current based on Bin_Peak_Icc. This is a larger current than used for the old state so that Vcontact will increase, signaling to the other devices that less current is available. A decision step 556 determines whether there is a conflict with one or more other devices also requesting additional current. For example, a conflict may occur when another device updates its contact to consume more current at the same time. A conflict may be detected by monitoring FLG and observing that FLG transitions from 0 to 1 within a specified time period, e.g., a contact voltage settling time, after initially updating Vcontact. If decision step 556 is false, step 557 is reached, where the device enters the new state and consumes additional current. If decision step 556 is true, an arbitration process begins at step 558. If decision step 554 is false, the device cannot enter the new state and waits, or tries to enter another state, at step 559.
For example, the device may try to enter another state which consumes additional current relative to the present state but not as much current as the state which it unsuccessfully tried to enter. For instance, the state which it unsuccessfully tried to enter may involve a programming operation for memory cells, where the cells are programmed in a certain time period. The other state may also involve programming but at a slower rate. Or the other state may involve a programming operation for lower data states which consumes less current than programming of higher data states. Or the other state may involve a refresh programming operation rather than a full programming operation.
As an example, assume that the current available to the set of devices is 100 units (e.g., microamps). Sys_Peak_Icc can then be set initially to 100 units. Assume also that a device is in a present state which consumes 20 units of current and wishes to enter a new state which consumes 40 units of current. As a result, 40−20=20 additional units of current are desired. The device lowers Sys_Peak_Icc to 100−20=80 units, sets Vspec accordingly and compares Vspec to Vcontact. Assume Vcontact is at a voltage V1 which corresponds to 75 units of current. Since Vspec>Vcontact (80>75), FLG=0 and the device can proceed to the new state. In a further example, assume Vcontact is at a voltage V2 which corresponds to 85 units of current. Since Vspec<Vcontact (80<85), FLG=1 and the device cannot proceed to the new state.
However, assume there is another new state which consumes 30 units of current. The device can determine if entering this state is feasible. Here, 30−20=10 additional units of current are desired. The device lowers Sys_Peak_Icc to 100−10=90 units, sets Vspec accordingly and compares Vspec to Vcontact. Assume Vcontact is at the voltage V2 which corresponds to 85 units of current. Since Vspec>Vcontact (since 90>85), FLG=0 and the device can proceed to this new state.
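The arithmetic of these examples can be reproduced with a minimal sketch of the internal check. The function and parameter names are hypothetical, and all quantities are in the same arbitrary units of current used in the examples.

```python
def can_enter_state(spec_units, present_units, new_units, contact_units):
    """Internal check made before the device touches the shared contact.

    spec_units    -- current available to the set of devices (initial Sys_Peak_Icc)
    present_units -- estimated consumption of the present state
    new_units     -- estimated consumption of the desired state
    contact_units -- current level that Vcontact presently corresponds to
    """
    additional = new_units - present_units
    if additional <= 0:
        return True                        # no extra current needed
    adjusted_spec = spec_units - additional   # lowered Sys_Peak_Icc, sets Vspec
    flg = 0 if adjusted_spec >= contact_units else 1
    return flg == 0

print(can_enter_state(100, 20, 40, 75))  # True:  80 > 75, FLG=0
print(can_enter_state(100, 20, 40, 85))  # False: 80 < 85, FLG=1
print(can_enter_state(100, 20, 30, 85))  # True:  90 > 85, FLG=0
```

The third call corresponds to falling back to a lower-current alternative state after the second call fails.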
The techniques described herein maximize the number of devices that can operate in parallel by considering the actual current consumption state of each device rather than considering the highest possible current consumption of a device. Moreover, the devices act in a decentralized way by deciding when they can enter a higher-current state. This frees the controller from issuing a suspend command to a device, for instance, if the voltage of the power bus drops below a certain level and a subsequent resume command when the voltage of the power bus increases. Other current-saving measures such as issuing a slow-down command to slow down the state machine clock or a charge pump clock, for instance, can also be avoided. Moreover, in some cases, a slow-down command cannot be used and the supply voltage may drop below a permissible limit resulting in data loss.
The use of a centralized arbitrator can also be avoided. Current consumed by each device can be digitally communicated to an arbitrator which may be present in the controller, for instance. However, this can result in frequent suspension of operations and degraded performance. Further, priority cannot be first come, first served.
By adjusting Vspec to reflect the additional current consumption of the next state and comparing Vspec to Vcontact before adjusting Vcontact, in an internal check, the adjustment to Vcontact can be avoided in some cases, e.g., step 559. In contrast, omitting the internal check, directly updating Vcontact to reflect the additional current consumption and comparing this updated Vcontact to a fixed reference voltage can have disadvantages. For example, if two or more devices request a higher current and update their contacts accordingly at the same time, none of these devices is allowed to go to the higher-current state. Each device can retry going to a higher-current state after a fixed or random time, but this increases the wait time. This wait time increases in proportion to the number of devices in the stack multiplied by the time for the contact voltage to settle. Moreover, when the adjusted Vcontact exceeds the reference voltage, it is unknown to the device whether two or more devices are requesting additional current at the same time, or whether the additional current requested by one device alone exceeds the available current. This increases wait time, resulting in a performance impact.
In this example, Bin_Peak_Icc=00 corresponds to a chip standby mode in which a reference current Iref=0 A and a peak voltage Vpeak=0 V. Bin_Peak_Icc=01 corresponds to a first Icc state in which Iref=Iref1 and Vpeak=Vpeak1. Bin_Peak_Icc=10 corresponds to a second Icc state in which Iref=Iref2 and Vpeak=Vpeak2. Bin_Peak_Icc=11 corresponds to a third Icc state in which Iref=Iref3 and Vpeak=Vpeak3. Iref3>Iref2>Iref1 and Vpeak3>Vpeak2>Vpeak1.
For case=4, the number of devices is one, Sys_Peak_Icc is identified by 0 bits, the voltage step size is Vstep3 and the noise margin is NM3. For case=5, the number of devices is two, Sys_Peak_Icc is identified by 1 bit, the voltage step size is Vstep4 and the noise margin is NM4. For case=6, the number of devices is four, Sys_Peak_Icc is identified by 2 bits, the voltage step size is Vstep5 and the noise margin is NM5. Vstep5<Vstep4<Vstep3<Vstep2<Vstep1 and NM5<NM4<NM3<NM2<NM1. A larger noise margin is preferable.
The contact is shared across all devices and may have a capacitance of a few pF. The contact settling time may be up to about 500 nsec, for instance, across all voltage ranges and step sizes. The contact settling time is the time for a voltage at the contact to settle after changing.
Advantageously, in some embodiments, only one external pad is required for communicating Icc information among all the devices. The on-chip state machine provides information on the peak Icc specification for the system through Sys_Peak_Icc<2:0> and the Icc requirement of the next state through Bin_Peak_Icc<1:0>. The external pad has an on-chip trimmed pull down resistor (Rcontact) connected on device 0. Each of the devices in the stack sources a fixed current onto the contact, where the magnitude of this current depends on the magnitude of Icc in the current/next operation. The voltage on this contact is a result of a sum of currents sourced by all the devices. This voltage is compared with a reference voltage (Vspec) on each device to provide a measure of whether the sum of Icc of all devices is within the system specification. Further, a reference voltage is generated by having an on-chip trimmed resistor on each of the devices. The resistor magnitude is a multiple of a resistor on the contact. This ensures that trim settings can be shared between these two resistors, so the trimming process need not be repeated. The on-chip state machine processes the output flag of the comparator to decide whether the next operation can be done, or whether it needs to wait and/or enter an arbitration process such as described below.
In the flowcharts, T denotes true, F denotes false or fail, and P denotes pass.
The process begins at any state (block 700). If a standby state is true at decision step 701, an idle state is reached at block 702. If an active state is true at decision step 703, block 704 initializes BIN=0 and SYS=spec and block 705 initializes WAIT_CNT=0 and del_BIN=0 in a state A. del_BIN is a delta or change in BIN, e.g., BIN_NS−BIN. Otherwise the idle state is maintained. Decision step 709 determines if BIN_NS is less than or equal to BIN. If decision step 709 is true, block 708 sets BIN=BIN_NS. This block is also reached if a pass status is set at block 706. In this case, the estimated current consumption in the next state is less than in the present state so the device can directly enter the next state without the concern of whether there is sufficient current available. The process then returns to block 705. If decision step 709 is false, block 710 sets del_BIN=BIN_NS−BIN (the additional current required by the new state relative to the present state) and SYS=spec−del_BIN (a reduction in SYS due to the additional current) in a state B. If decision step 713 is true (i.e., FLG=1), block 712 is reached where BIN=0 (the present value of current consumption is reset). If decision step 713 is false (i.e., FLG=0), block 714 is reached where BIN=BIN_NS (the present value of current consumption is set to the next state current consumption) and SYS=spec (the present value of SYS is reset to the specification level) in a state C.
Additionally, a decision step 707 determines if a wait has taken place over the contact settling time tD and FLG=0. tD is a specified period of time. If this decision step is true, a pass status is set at block 706 and block 705 is reached. If decision step 707 is false, a decision step 711 determines whether an arbitration process has a pass status (P). The arbitration process may run on a clock cycle of tD, the contact settling time. This ensures that the contact voltages have settled during the process of arbitration. If there is a pass status, i.e., the device wins the arbitration and is allowed to go to the next, higher-current state, block 706 is reached. If there is a fail status, i.e., the device loses the arbitration and is not allowed to go to the next, higher-current state, block 715 is reached where BIN=0 and WAIT_CNT is incremented by one (as denoted by WAIT_CNT++) in a state D. Subsequently, block 710 is reached.
The arbitration process may use a linear or binary search algorithm, for example, as described further below. For a linear algorithm, there may be 32 cycles with one wait state for a 16-die stack, and for a binary algorithm there may be 5 cycles with one wait state for a 16-die stack.
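The cycle counts quoted here follow from the number of (device, wait state) pairs. The sketch below assumes that the 16-die example counts WAIT_CNT=0 and WAIT_CNT=1 as two levels, giving 32 pairs; the function name and the `wait_levels` parameter are hypothetical.

```python
def arbitration_cycles(devices, wait_levels):
    """Worst-case cycle counts for linear and binary search arbitration.

    wait_levels is the number of distinct WAIT_CNT values (assumed here to be
    max WAIT_CNT + 1), so devices * wait_levels is the number of unique
    (device, wait state) priorities.
    """
    pairs = devices * wait_levels
    linear_cycles = pairs                     # one priority tested per cycle
    binary_cycles = (pairs - 1).bit_length()  # ceil(log2(pairs)) using integers
    return linear_cycles, binary_cycles

print(arbitration_cycles(16, 2))  # (32, 5)
print(arbitration_cycles(8, 2))   # (16, 4)
```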
Blocks 705, 710, 714 and 715 denote states A, B, C and D, respectively, of the state machine.
In the random delay arbitration, when FLG becomes 1 after updating BIN, each of the contesting devices sets its Icc state to 0 and enters a wait state. The devices then enter a higher Icc state after a random delay. This greatly reduces the probability of the contesting devices probing for a higher Icc simultaneously the next time. The higher the maximum random delay, the lower the probability of the contesting devices updating Icc at the same time again. A lower delay, on the other hand, reduces the wait time during arbitration.
The random delay arbitration process is represented at block 720 and state D. BIN is set to 0 and WAIT is performed using a random delay.
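The trade-off between the maximum random delay and the chance of a repeated conflict can be quantified under a simple model. The sketch assumes that each contesting device independently picks one of `slots` equally likely delay slots, each one settling time wide; this uniform, slotted model and the function name are assumptions and are not stated in the description above.

```python
from fractions import Fraction

def collision_probability(devices, slots):
    """Probability that at least two devices pick the same delay slot."""
    if devices > slots:
        return Fraction(1)        # more devices than slots: a clash is certain
    all_distinct = Fraction(1)
    for k in range(devices):
        # The k-th device must avoid the k slots already taken.
        all_distinct *= Fraction(slots - k, slots)
    return 1 - all_distinct

print(collision_probability(2, 4))   # 1/4
print(collision_probability(2, 16))  # 1/16
print(collision_probability(3, 4))   # 5/8
```

Under this model, quadrupling the number of slots for two contesting devices cuts the chance of a second conflict by a factor of four, at the cost of a longer worst-case wait.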
WAIT_CNT is the number of times a device had to go back to state-B (block 710 in
The allocation of a unique priority for each combination of device and wait state ensures that a single device wins the arbitration process.
FIG. 8B1 depicts an example linear search arbitration process consistent with
FIG. 8B2 depicts a time line of an arbitration process, consistent with
The arbitration process can be repeated in another iteration if necessary. See, e.g., step 558 of
Decision step 730 determines if (CNT<N−C+N*WAIT_CNT) AND FLG=1 AND WAIT_CNT<4. If the decision step is true, CNT is incremented at block 731. CNT is a device address based counter which counts from 1 to (N−C+N*WAIT_CNT). This loop continues until decision step 730 is false, e.g., when CNT is sufficiently high, FLG=0 and/or WAIT_CNT>=4 or other maximum level. CNT is sufficiently high when the number of clock cycles for a device reaches the priority of the device. After that, the device waits until the arbitration process is complete, if the device has lost the arbitration process. If FLG=0 before CNT is sufficiently high, then the device is said to have won the arbitration. WAIT_CNT=4 when the device has waited the maximum number of times.
Subsequently, decision step 732 determines if (CNT=N−C+N*WAIT_CNT) AND FLG=1 AND WAIT_CNT<4. This is like the condition in decision step 730 except the < is replaced by =. If decision step 732 is false, the pass block 706 is reached, indicating that the device has won the arbitration and can enter the new state. See also block 708. Decision step 730 is false if CNT indicates that the number of clock cycles for the device has reached the priority of the device, FLG=0 and/or WAIT_CNT>=4 or other maximum level.
If decision step 732 is true, the device loses the arbitration and block 715 sets BIN_CS=0 and CNT=0 and increments WAIT_CNT. The updated value of WAIT_CNT will be used in a next arbitration process for the device at decision steps 730 and 732.
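The linear search can be illustrated with a small simulation. This is a simplified sketch, not the flowchart itself: it assumes every contesting device advances CNT once per cycle, that a device withdraws its request when CNT reaches its priority N−C+N*WAIT_CNT while FLG=1, and that `headroom` (a hypothetical parameter) is the current left for the contesting devices. The maximum WAIT_CNT check is omitted.

```python
def linear_arbitration(contenders, n_devices, headroom):
    """Simulate linear search arbitration among contesting devices.

    contenders -- dict: chip address C -> (extra current requested, WAIT_CNT)
    Returns (set of winning addresses, number of cycles used).
    """
    priority = {c: n_devices - c + n_devices * wait
                for c, (_, wait) in contenders.items()}
    active = {c: extra for c, (extra, _) in contenders.items()}
    cnt = 0
    # FLG=1 while the requested current exceeds what is available.
    while sum(active.values()) > headroom:
        cnt += 1
        for c in [c for c in active if priority[c] == cnt]:
            del active[c]          # this device loses and withdraws its current
    return set(active), cnt

# Four-device stack, devices 0, 2 and 3 each request 2 units, 4 units available.
print(linear_arbitration({0: (2, 0), 2: (2, 0), 3: (2, 0)}, 4, 4))  # ({0, 2}, 1)
# Device 3 has already failed once, so it now outranks device 0.
print(linear_arbitration({0: (2, 0), 3: (2, 1)}, 4, 2))             # ({3}, 4)
```

The second call shows how incrementing WAIT_CNT after a loss raises a device's priority in the next arbitration.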
Compared to the process of
FIG. 9A1 depicts a tree showing a priority threshold of selected (device, wait state) pairs in a binary search arbitration process where there are 32 possible (device, wait state) pairs. The example is consistent with the priority numbers shown in
FIG. 9A2 depicts a tree showing a priority threshold of selected (device, wait state) pairs in a binary search arbitration process where there are 16 possible (device, wait state) pairs. Here, m=4 and 2^4=16. For example, for n=2, 3 or 4, the number of (device, wait state) pairs decreases or increases by 4 (i.e., 2^(4−2)), 2 (i.e., 2^(4−3)) or 1 (i.e., 2^(4−4)), respectively.
A decision step 920 determines if the process is on the last iteration. If decision step 920 is false, step 919 increments n and step 912 follows in a next iteration. If decision step 920 is true, step 921 sets a FAIL status for the device if a PASS status has not been set previously in the process (the device loses the arbitration).
Thus, the state machine is configured to perform an arbitration process if the flag transitions from the first value (0) to the second value (1) before a specified period of time (e.g., a contact settling time) expires, indicating a conflict between two or more of the devices. The arbitration process may comprise a binary search which is completed in m clock cycles of the state machine, where 2^m is a number of the multiple devices multiplied by a number of wait states, and each wait state represents a number of times the one device has failed the arbitration process. The arbitration process may assign a unique priority to each combination of device and wait state, where each wait state represents a number of times each device has failed the arbitration process. For linear arbitration, the arbitration process ends when the flag transitions from the second value (1) to the first value (0), indicating no conflict between the devices. For binary arbitration, the arbitration process ends after m clock cycles.
Initially all 16 pairs are selected. If FLG=1, then all devices enter the binary priority search algorithm. Let the cycle number be denoted by n. ‘n’ is incremented from 1 to 5. CNT is a counter which is initialized to 2̂m at the start of the algorithm. In every cycle, CNT is updated as: CNT=CNT+/−2̂(m−n). In each cycle +/− depends on FLG of the previous cycle. If FLG=1, ‘−’ is chosen. If FLG=0, ‘+’ is chosen. Statuses of each pair in each cycle depend on whether its priority (p) is > or <= CNT. If p>CNT, the status is “new state” and the device can update the contact if necessary. If p < or = CNT, the status is “previous state” and the device may revert to lower current state if necessary. For a contesting device, if status=new state and FLG=0 after settling time, it goes to a PASS state, and the device can go ahead with higher Icc operation. If FLG=1 and n=m, and the contesting device has not gone to the PASS status previously, then it will go to the FAIL state.
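A binary priority search of this kind can be sketched as follows. The sketch makes several assumptions that go beyond the text: it tracks the number of currently selected pairs (starting at 2^m, shrinking by 2^(m−n) when FLG=1 and growing when FLG=0), it treats a higher priority number as a higher priority, and it models the available current with a hypothetical `headroom` parameter. How this maps onto the CNT comparison in a particular implementation may differ.

```python
def binary_arbitration(requests, headroom, m):
    """Simulate an m-cycle binary priority search.

    requests -- dict: unique priority p (1..2**m) -> extra current requested
    headroom -- current available to the contesting devices
    Returns the set of priorities that reach the PASS status.
    """
    selected = 2 ** m      # all pairs selected initially
    flg = 1                # arbitration starts because a conflict raised FLG
    passed = set()
    for n in range(1, m + 1):
        step = 2 ** (m - n)
        selected += -step if flg else step
        threshold = 2 ** m - selected         # pairs with p > threshold may request
        active = {p for p in requests if p > threshold}
        load = sum(requests[p] for p in active)
        flg = 1 if load > headroom else 0
        if flg == 0:
            passed |= active                  # selected and FLG=0: PASS
    return passed

requests = {16: 2, 9: 2, 3: 2}   # three contesting pairs, 2 units each
print(binary_arbitration(requests, 3, 4))  # {16}
print(binary_arbitration(requests, 4, 4))  # {16, 9} (set order may vary)
```

Contesting pairs that are absent from the returned set correspond to the FAIL status after the last cycle.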
For a non-contesting device, if FLG=1, it knows that it needs to enter the WAIT state for m cycles before carrying out any internal Icc check or contact update.
The maximum value of WAIT_CNT, max WAIT_CNT, is configurable, and should be set by a parameter during device-sort or by a command through a common interface. Max WAIT_CNT may be common to all devices. WAIT_CNT can range between 0 and max WAIT_CNT. The number of cycles in the binary priority search algorithm is defined by max WAIT_CNT. In general, higher wait counts are very improbable. Setting max WAIT_CNT to two or three is sufficient in many implementations.
In this specific example, the table has rows 1-8 and columns (col.) 1-16. Row 1 identifies a combination of a device (D) and a wait state (W, also referred to as WAIT_CNT), e.g., as a data pair: (selected device, wait state). This example has eight devices (0-7) and two wait states, W=0 and 1. If additional wait states are being used, the table will have additional columns. The number of columns is the number of devices multiplied by the number of wait states. The binary search process can significantly reduce the duration of the arbitration process, compared to the linear search. For example, the binary search can be completed in four clock cycles (rows 4-7) in this example compared to 16 clock cycles for a comparable linear search. Generally, the binary search can be completed in m clock cycles, where 2^m is the number of different (selected device, wait state) pairs or combinations. 2^m is also the number of devices multiplied by the number of wait states, where each wait state represents the number of times the device has failed the arbitration process.
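The cycle counts quoted above follow directly from the number of (device, wait state) pairs. A short Python sketch makes the arithmetic explicit. The function names are hypothetical, and rounding up for pair counts that are not a power of two is an assumption, since the text only discusses power-of-two cases.

```python
def binary_search_cycles(num_devices, num_wait_states):
    """Smallest m with 2^m >= number of (device, wait state) pairs."""
    pairs = num_devices * num_wait_states
    # (pairs - 1).bit_length() equals ceil(log2(pairs)) for pairs >= 1.
    return (pairs - 1).bit_length()


def linear_search_cycles(num_devices, num_wait_states):
    """A comparable linear search visits one pair per clock cycle."""
    return num_devices * num_wait_states


# Eight devices and two wait states, as in the example table.
print(binary_search_cycles(8, 2), linear_search_cycles(8, 2))
```

Doubling the number of devices adds only one cycle to the binary search, while it doubles the length of the linear search.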
Row 2 identifies a priority of a device, similar to what was provided at
Rows 3-7 each indicate a requested current BIN in a respective clock cycle, where BIN=BIN_CS is a current of a present state (CS=current state or present state), and BIN_NS is a current of a next (new), higher-current state. Rows 3-7 each represent one clock cycle which may be approximately equal to the contact settling time tD. A value of FLG is also indicated. The value of FLG in each row results from the sum of Icc in the same row.
Row 8 indicates a final result of pass or fail for the contesting device in the arbitration process.
A contesting device is one that wishes to go to a state that has a higher Icc requirement than its current state. It is indicated by setting BIN=BIN_NS. All other (device, wait state) pairs remain in the same Icc state, as indicated by BIN=BIN_CS.
A box is provided in each row for each (device, wait state) pair. A box can be shaded or unshaded. The shaded boxes represent selected (device, wait state) pairs. The binary search changes the selected (device, wait state) pair in each iteration, as discussed in
A value of priority (p) is generated by priority logic described earlier (
With max priority state=16, the priority between any two or more contesting devices is decided in only 4 cycles. If the number of devices is 16 or 32, only one or two more cycles are needed.
Initially FLG=0. At this stage, devices 3 and 4 with wait state 0 have updated Icc on the contact simultaneously, resulting in FLG=1 in Row 3. The arbitration process thus begins with a first iteration (n=1) in Row 4. In Row 4, both contesting devices have unshaded boxes indicating they are not selected; hence they update BIN=0. This changes FLG to 0 at Row 4. The second iteration is depicted in Row 5. In Row 5, device 3 updates its BIN to BIN_NS since it has a shaded box and is thus selected. After this, FLG remains at 0 in Row 5. This means that device 3 can go ahead with the next higher Icc operation and it moves to the pass status in Row 6. The third iteration is depicted in Row 6. In Row 6, device 4 has a shaded box and is thus selected, so it updates BIN=BIN_NS. As a result, FLG=1 in Row 6. The fourth and last iteration is depicted in Row 7. In Row 7, device 4 has a shaded box and is thus selected, so it retains BIN=BIN_NS. As a result, FLG=1 in Row 7. As a result, device 4 cannot go ahead with its high Icc operation and enters the fail state at Row 8.
The techniques provided herein improve system performance by efficient peak current management of a set of devices, allowing more devices to operate in parallel. The techniques are achieved by managing timing of internal operations in a device, where these internal operations are not accessible to a controller external to the device, in one approach. Further, one embodiment uses only one contact for current management. For example, an existing test contact can be reused for this purpose. Hence, there is no requirement of adding a new contact.
Moreover, peak current management can be performed independently on the device. Hence, there is no change in an interface specification between the device and a controller, and no involvement of the controller. The system peak current specification can be set using parameters, and it can differ between systems. Another advantage is that no current is consumed by the peak Icc detection circuit on the device when it is in a standby mode.
Further, all active devices in the system are always aware of the total Icc consumed by the set of devices. If a device wants to go to a higher Icc state, it can quickly check the feasibility of doing this by reducing the internal specification rather than updating Icc on the contact. This avoids waiting for the contact voltage to settle each time such a check is made. This makes the process of checking for Icc budget a continuous event rather than a process that needs to be repeated at every fixed interval. The checking can be repeated at the internal state machine frequency, for instance. This also ensures that the external I/O (e.g., the load bus) is not disturbed unless a device actually goes to a higher Icc state.
System Icc specification and a device's Icc state are controlling voltage levels of two different nodes. This provides a wider voltage range, more noise margin and flexibility in design, compared to a case where the Icc state and specification are controlling voltage levels on the same node and reference voltage level is fixed.
The output of an internal comparator of a non-contesting device goes high only when two or more devices request a higher Icc simultaneously. This is a low probability event and triggers the arbitration process. The techniques described avoid triggering an arbitration process when only one device is requesting a higher Icc. The arbitration process can use a binary search algorithm to arbitrate between two or more devices which request a higher Icc at the same time. The arbitration process takes into account the number of times a device had to wait.
In another approach which reduces complexity, a random delay arbitration process can be used.
Another advantage is that, if two or more devices are contesting for a higher Icc at the same time and the total Icc for all devices is within the system specification, they can go to the higher Icc state simultaneously. Wait time is needed only when the system specification is violated.
In implementing the technique on a device, the logic complexity is modest since the addition of Icc of all devices is done in an analog circuit.
In a further aspect, if a certain operation cannot be supported due to Icc constraints, the operation can be slowed down instead of stopping. This can be done internally within the device without involvement of the contact. This is done by lowering the specification by a smaller ΔIcc if FLG of the contesting device becomes 1. See also step 559 of
The control circuitry 1010 cooperates with the read/write circuits 1065 to perform operations on the memory array. The control circuitry 1010 includes a state machine 1012, an on-chip address decoder 1014 and a power control circuit 1016. In an example embodiment, the power control circuit 1016 is a step-down regulated charge pump for supplying a logic voltage, e.g., 1.2 V logic, in a non-volatile storage product. In another example embodiment, the power control circuit 1016 is a step-up regulated charge pump which supports a 1.8 V host in a non-volatile storage product.
The state machine 1012 provides chip-level control of memory operations. For example, the state machine may be configured to perform read and verify processes. The on-chip address decoder 1014 provides an address interface between the address used by the host or a memory controller and the hardware address used by the decoders 1030 and 1060. The power control circuit 1016 controls the power and voltages supplied to the word lines and bit lines during memory operations.
In some implementations, some of the components of
The data stored in the memory array is read out by the column decoder 1060 and output to external I/O lines via the data I/O line and a data input/output buffer. Program data to be stored in the memory array is input to the data input/output buffer via the external I/O lines. Command data for controlling the memory device are input to the controller 1050. The command data informs the flash memory of what operation is requested. The input command is transferred to the control circuitry 1010. The state machine 1012 can output a status of the memory device such as READY/BUSY or PASS/FAIL. When the memory device is busy, it cannot receive new read or write commands.
In another possible configuration, a non-volatile memory system can use dual row/column decoders and read/write circuits. In this case, access to the memory array by the various peripheral circuits is implemented in a symmetric fashion, on opposite sides of the array, so that the densities of access lines and circuitry on each side are reduced by half.
In an erase operation, a high voltage such as 20 V is applied to a substrate on which the NAND string is formed to remove charge from the storage elements. During a programming operation, a voltage in the range of 12-21 V is applied to a selected word line. In one approach, step-wise increasing program pulses are applied until a storage element is verified to have reached an intended state. Moreover, pass voltages at a lower level may be applied concurrently to the unselected word lines. In read and verify operations, the select gates (SGD and SGS) are connected to a voltage in a range of 2.5 to 4.5 V and the unselected word lines are raised to a read pass voltage, Vread, (typically a voltage in the range of 4.5 to 6 V) to make the transistors operate as pass gates. The selected word line is connected to a voltage, a level of which is specified for each read and verify operation, to determine whether a Vth of the concerned storage element is above or below such level.
Each program voltage includes two steps, in one approach. Further, Incremental Step Pulse Programming (ISPP) is used in this example, in which the program voltage steps up in each successive program loop using a fixed or varying step size. This example uses ISPP in a single programming pass in which the programming is completed. ISPP can also be used in each programming pass of a multi-pass operation.
The waveform 1100 includes a series of program voltages 1101, 1102, 1103, 1104, 1105, . . . 1106 that are applied to a word line selected for programming and to an associated set of non-volatile memory cells. One or more verify voltages can be provided after each program voltage as an example, based on the target data states which are being verified. 0 V may be applied to the selected word line between the program and verify voltages. For example, S1- and S2-state verify voltages of VvS1 and VvS2, respectively, (waveform 1110) may be applied after each of the program voltages 1101 and 1102. S1-, S2- and S3-state verify voltages of VvS1, VvS2 and VvS3 (waveform 1111) may be applied after each of the program voltages 1103 and 1104. After several additional program loops, not shown, S5-, S6- and S7-state verify voltages of VvS5, VvS6 and VvS7 (waveform 1112) may be applied after the final program voltage 1106.
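The stepped program voltages of ISPP described above can be illustrated with a small Python sketch. It is only a model under stated assumptions: the function name `ispp_program`, the starting voltage, the step size and the verify callback are all hypothetical, and a fixed step size is used although the text also allows a varying one.

```python
def ispp_program(v_start, v_step, max_loops, verify):
    """Apply stepped program pulses until verify(loop, v_pgm) returns True.

    Returns the list of program voltages that were applied.
    """
    applied = []
    for loop in range(max_loops):
        v_pgm = v_start + loop * v_step  # fixed step size per program loop
        applied.append(v_pgm)
        if verify(loop, v_pgm):  # stands in for the verify operation after each pulse
            break
    return applied


# Hypothetical cell that passes verify once the pulse reaches 14.0 V.
pulses = ispp_program(12.0, 0.5, 20, lambda loop, v_pgm: v_pgm >= 14.0)
print(pulses)
```

In this sketch a single verify callback stands in for the one or more verify voltages that follow each program voltage in the waveform.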
Accordingly, in one embodiment, an apparatus comprises: a comparison circuit having a first contact connected to a load bus and having a second contact connected to a power supply line; and a state machine in communication with the comparison circuit, the state machine configured to generate a comparison value based on a system specification which has been pre-configured in non-volatile memory during device-sort or based on a command issued by a controller. The state machine is also configured to generate an estimated current consumption for a next state and configured to operate the comparison circuit to compare the comparison value to a value of the first contact, wherein the power supply line and the load bus are common to multiple devices.
In another embodiment, a method comprises: receiving a command to enter a next operation at a device, the command being received from a controller which is external to the device; performing internal command sequencing by an on-chip state machine; the state machine determining a difference between an estimated current consumption of the next state and an estimated current consumption of a current state; decreasing a system specification current by the difference to provide an adjusted system specification current; providing a comparison value based on the adjusted system specification current; comparing the comparison value to a value of a load bus, the load bus shared by multiple devices; and based on the comparing, deciding whether to update the difference current on the load bus and enter the next state.
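The steps of this method can be modeled numerically as follows. This is a simplified sketch, not the circuit: in the device the comparison is made between voltage levels on two nodes, whereas here all quantities are plain numbers in arbitrary current units. The function name, the example values, and the choice of "bus value not exceeding the comparison value" as the pass condition are assumptions made for illustration.

```python
def may_enter_next_state(system_spec, bus_value, icc_current, icc_next):
    """Decide whether a device may move to its next, higher-current state.

    system_spec: system specification current (hypothetical units)
    bus_value:   total current presently represented on the shared load bus
    icc_current: estimated current consumption of the current state
    icc_next:    estimated current consumption of the next state
    """
    difference = icc_next - icc_current
    adjusted_spec = system_spec - difference  # lower the specification internally
    comparison_value = adjusted_spec
    # Assumed pass condition: the bus value does not exceed the comparison value.
    return bus_value <= comparison_value


# Hypothetical numbers: specification 400, the next state needs 60 more than the current state.
print(may_enter_next_state(400, 300, 20, 80))
print(may_enter_next_state(400, 350, 20, 80))
```

Because the specification is adjusted internally, the check itself does not change the value on the load bus. The bus is updated only when the device actually enters the next state.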
In another embodiment, an apparatus comprises: means for providing power to a set of devices using a common power supply line; means for connecting contacts of each device of the set of devices with one another; and means for instructing a device of the set of devices to transition from a present state to a next state, wherein the next state consumes more current than the present state, and the device, to determine whether the power is sufficient to allow the device to transition from the present state to the next state, is configured to generate a comparison value based on an estimated current consumption for the next state, and compare the comparison value to a value of the means for connecting.
The foregoing detailed description of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. The described embodiments were chosen to best explain the principles of the invention and its practical application, to thereby enable others skilled in the art to best utilize the invention in various embodiments and with various modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims appended hereto.
Number | Date | Country | Kind |
---|---|---|---|
201641007351 | Mar 2016 | IN | national |