Power consumption is of particular concern in limited-power devices (e.g., battery-powered devices) such as laptops and notebooks, smart phones, touchscreen devices, gaming consoles, and the like. These devices are limited in size and weight and generally portable, and therefore they typically use smaller and lighter batteries of increasing but still limited capacity.
One of the most widely used and well known techniques to reduce power consumption is referred to as clock gating. In essence, with clock gating, the clock signal is not propagated to a clocked component when the component is idle (not being used). Propagation of the clock signal is controlled by a gate (the “clock gate”) coupled between the clock signal generator and the clocked component. When the clock gate is enabled, then the clock signal is prevented from reaching the clocked component. Power is conserved because, for example, elements of the clocked component no longer change state in response to the rising and/or falling edges of the clock signal.
A problem with clock gating is that it can accelerate aging of the clocked components. More specifically, it is well known that clock gating can accelerate an aging mechanism known as bias temperature instability (BTI). BTI can change characteristics of a component, such as its threshold voltage. As a result, the supply voltage may need to be increased as the component ages, or margin for aging may need to be provided from the outset; either measure increases leakage and active power, and performance is reduced if the frequency needs to be clamped instead. BTI can therefore actually increase power consumption over time. Thus, while clock gating is intended to have a positive effect, over the long term it can introduce some negative effects.
BTI can also lead to increased duty cycle distortion on clock distribution networks that are gated to save power. This distortion can affect the frequency of operation of logic devices and circuits and input/output devices that have duty cycle requirements.
Embodiments according to the present disclosure replace conventional clock gating with clock slowdown—actually, extreme clock slowdown. In essence, when a clocked component is idle and conventionally would be clock gated, the component is instead supplied a clock signal that is much slower than the clock signal it receives during normal operation. In effect, by toggling the component with a slow clock signal even when the component is idle, BTI aging can be slowed.
More specifically, in one embodiment, while a clocked component is not idle, the component receives a clock signal that is at a first frequency. When the clocked component is idle (in a state in which it can be clock gated), the clock signal is changed to a non-zero second frequency that is less than the first frequency. The first frequency corresponds to the normal clock frequency supplied to the component. Thus, under circumstances in which a clocked component would conventionally be clock-gated and prevented from receiving a clock signal, the clocked component instead continues to receive a clock signal, specifically a very slow clock signal.
In one embodiment, a controller can access information that indicates whether the clocked component is idle. In response to the controller determining that the clocked component is idle, the clock signal supplied to the component is reduced to the second frequency as described above. Otherwise (while the clocked component is not idle), the clock signal is at the first frequency.
In one embodiment, a clock switch circuit is used to select the clock signal received by the clocked component from among a first clock signal having the first frequency and a second clock signal having the second frequency. The first clock signal is selected if the component is not idle; the second clock signal is selected if the component is idle. The selected clock signal is then propagated to the clocked component.
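By way of illustration only, the selection performed by the clock switch circuit can be modeled in software as a two-input selector. The following is a minimal sketch, not a circuit implementation; the class name ClockSwitch, its field names, and the example frequencies are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClockSwitch:
    """Hypothetical software model of a 2:1 clock switch (illustrative names)."""

    first_hz: int   # normal operating frequency (assumed value supplied by caller)
    second_hz: int  # slow idle frequency; must be non-zero

    def __post_init__(self):
        # The second frequency is non-zero and less than the first frequency.
        if not 0 < self.second_hz < self.first_hz:
            raise ValueError("second frequency must be non-zero and below the first")

    def select(self, idle: bool) -> int:
        # Idle -> second (slow) clock; not idle -> first (normal) clock.
        # Unlike a clock gate, this never selects 0 Hz.
        return self.second_hz if idle else self.first_hz
```

In this sketch the idle flag plays the role of the select signal: the switch always propagates one of the two clock signals and never withholds the clock entirely.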
In one embodiment, if the clocked component is idle, then a frequency divider is used to divide a first clock signal having the first frequency to produce a second clock signal having the second frequency. The second clock signal is then propagated to the clocked component; otherwise (if the component is not idle), the first (undivided) clock signal is propagated to the component.
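By way of illustration only, a frequency divider of this kind can be modeled as a counter that toggles its output once every N/2 rising edges of the input clock. The sketch below operates on a sampled square wave; the function names, the even-divisor restriction, and the divisor value used in the usage note are assumptions made for the example and do not describe any particular divider circuit.

```python
def rising_edges(samples):
    """Count 0 -> 1 transitions in a sampled clock (level before the first sample is 0)."""
    return sum(1 for a, b in zip([0] + list(samples), samples) if a == 0 and b == 1)


def divide_clock(samples, divisor):
    """Divide a sampled clock by an even divisor using a toggle counter."""
    if divisor < 2 or divisor % 2:
        raise ValueError("divisor must be an even integer >= 2")
    out, level, count, prev = [], 0, 0, 0
    for s in samples:
        if prev == 0 and s == 1:       # rising edge of the first clock signal
            count += 1
            if count == divisor // 2:  # toggle output every divisor/2 input edges
                level ^= 1
                count = 0
        out.append(level)
        prev = s
    return out


def clock_for_component(samples, idle, divisor):
    """Propagate the divided clock if idle; otherwise the undivided clock."""
    return divide_clock(samples, divisor) if idle else list(samples)
```

For example, with eight input periods and a hypothetical divisor of 4, the idle component would see two rising edges instead of eight; a practical divisor would be far larger so that the second frequency is very low.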
By using a slower clock in place of conventional clock gating (in which no clock signal is propagated) when a component is idle, BTI aging is slowed. Furthermore, power savings that are only incrementally lower than the power savings provided by clock gating continue to be realized.
Embodiments according to the present disclosure can be utilized at any level in the clock distribution network(s). For example, the low frequency clock signal can be inserted at the phase-locked loop (PLL) root of a clock tree or in place of engine-level clock gating, block-level clock gating, and/or leaf-level clock gating.
These and other objects and advantages of the various embodiments of the present disclosure will be recognized by those of ordinary skill in the art after reading the following detailed description of the embodiments that are illustrated in the various drawing figures.
The accompanying drawings, which are incorporated in and form a part of this specification and in which like numerals depict like elements, illustrate embodiments of the present disclosure and, together with the description, serve to explain the principles of the disclosure.
Reference will now be made in detail to the various embodiments of the present disclosure, examples of which are illustrated in the accompanying drawings. While described in conjunction with these embodiments, it will be understood that they are not intended to limit the disclosure to these embodiments. On the contrary, the disclosure is intended to cover alternatives, modifications and equivalents, which may be included within the spirit and scope of the disclosure as defined by the appended claims. Furthermore, in the following detailed description of the present disclosure, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure. However, it will be understood that the present disclosure may be practiced without these specific details. In other instances, well-known methods, procedures, components, and circuits have not been described in detail so as not to unnecessarily obscure aspects of the present disclosure.
Some portions of the detailed descriptions that follow are presented in terms of procedures, logic blocks, processing, and other symbolic representations of operations on data bits within a computer memory. These descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. In the present application, a procedure, logic block, process, or the like, is conceived to be a self-consistent sequence of steps or instructions leading to a desired result. The steps are those utilizing physical manipulations of physical quantities. Usually, although not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated in a computing system. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as transactions, bits, values, elements, symbols, characters, samples, pixels, or the like.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussions, it is appreciated that throughout the present disclosure, discussions utilizing terms such as “providing,” “propagating,” “determining,” “changing,” “enabling,” “disabling,” “selecting,” “dividing,” “transitioning,” “aligning,” “receiving,” “accessing,” “controlling,” “decreasing,” “increasing,” “reducing,” or the like, refer to actions and processes (e.g., flowchart 600 of
Embodiments described herein may be discussed in the general context of computer-executable instructions residing on some form of computer-readable storage medium, such as program modules, executed by one or more computers or other devices. By way of example, and not limitation, computer-readable storage media may comprise non-transitory computer-readable storage media and communication media; non-transitory computer-readable media include all computer-readable media except for a transitory, propagating signal. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types. The functionality of the program modules may be combined or distributed as desired in various embodiments.
Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, random access memory (RAM), read only memory (ROM), electrically erasable programmable ROM (EEPROM), flash memory or other memory technology, compact disk ROM (CD-ROM), digital versatile disks (DVDs) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed to retrieve that information.
Communication media can embody computer-executable instructions, data structures, and program modules, and includes any information delivery media. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), infrared, and other wireless media. Combinations of any of the above can also be included within the scope of computer-readable media.
In its most basic configuration, the computing system 100 may include at least one processor 102 (CPU) and at least one memory 104. The processor 102 generally represents any type or form of processing unit capable of processing data or interpreting and executing instructions. In certain embodiments, the processor 102 may receive instructions from a software application or module. These instructions may cause the processor 102 to perform the functions of one or more of the example embodiments described and/or illustrated herein.
The memory 104 generally represents any type or form of volatile or non-volatile storage device or medium capable of storing data and/or other computer-readable instructions. In certain embodiments the computing system 100 may include both a volatile memory unit (such as, for example, the memory 104) and a non-volatile storage device (not shown).
The computing system 100 also includes a display device 106 that is operatively coupled to the processor 102. The display device 106 is generally configured to display a graphical user interface (GUI) that provides an easy-to-use interface between a user and the computing system 100.
The computing system 100 also includes an input device 108 that is operatively coupled to the processor 102. The input device 108 may include a touch sensing device (a touch screen) configured to receive input from a user's touch and to send this information to the processor 102. The processor 102 interprets the touches in accordance with its programming.
The input device 108 may be integrated with the display device 106 or they may be separate components. In the illustrated embodiment, the input device 108 is a touch screen that is positioned over or in front of the display device 106. The input device 108 and display device 106 may be collectively referred to herein as a touch screen display 107.
The communication interface 122 of
As illustrated in
Many other devices or subsystems may be connected to computing system 100. Conversely, all of the components and devices illustrated in
The computer-readable medium containing the computer program may be loaded into the computing system 100. All or a portion of the computer program stored on the computer-readable medium may then be stored in the memory 104. When executed by the processor 102, a computer program loaded into the computing system 100 may cause the processor 102 to perform and/or be a means for performing the functions of the example embodiments described and/or illustrated herein. Additionally or alternatively, the example embodiments described and/or illustrated herein may be implemented in firmware and/or hardware.
Embodiments according to the present disclosure replace clock gating with clock slowdown—actually, extreme clock slowdown. In essence, when a clocked component is idle and conventionally would be clock gated, the component is instead supplied a clock signal that is much slower than the clock signal it receives during normal operation. The discussion below pertains to systems that conventionally use clock gating to reduce power (versus functional clock gating where circuit operation depends on the gating of clocks for correct operation).
The controller 202 can access information (e.g., workload information and/or idleness information) that indicates whether or not the clocked component 204 is idle. Based on the workload/idleness information, the controller 202 can issue a command or commands that control the frequency of the clock signal received by the clocked component 204, as will be presented below in more detail.
The term “idle” is used herein to mean that the clocked component 204 is in a state in which it would conventionally be clock gated. In other words, as will be seen, in embodiments according to the present disclosure, the clocked component 204 will continue to receive a clock signal when, conventionally, it would be clock gated; consequently, the clocked component will change state (toggle) in response to the rising and/or falling edges of the clock signal. However, while the clocked component is being toggled in response to the slow clock signal, the component is not receiving or acting on a data signal, and/or is not outputting useful data, and in that sense is idle.
The controller 202 may be implemented on or by the processor 102 of
The clock switch circuit 302 can receive a select signal (SEL) either directly or indirectly from the controller 202 (
With reference to
In operation, workload information indicating whether or not the clocked component 204 is idle (in a state in which, conventionally, it can be clock gated) is accessed and monitored by the controller 202. If the clocked component 204 is not idle, then the first clock signal (the faster clock) is selected and propagated to the clocked component. If, on the other hand, the clocked component 204 is idle (in a condition that would conventionally result in the component being clock gated), then the second clock signal (the slower clock) is selected and propagated to the clocked component. Thus, even though the clocked component 204 is not being used, it continues to receive a clock signal, specifically a slow clock signal on the order of, though not limited to, 20-50 kHz.
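By way of illustration only, the effect of this operation can be seen by counting the clock cycles delivered to the clocked component over a timeline of busy and idle intervals. The sketch below is a simplified model; the function name, the timeline format, and the 1 GHz first frequency used in the usage note are hypothetical, while the 20 kHz second frequency falls within the example range given above.

```python
def cycles_delivered(timeline, first_hz, second_hz):
    """Return (active_cycles, idle_cycles) delivered over the timeline.

    timeline is a list of (duration_us, idle) pairs. Integer microseconds and
    integer hertz keep the arithmetic exact. Passing second_hz=0 models
    conventional clock gating for comparison.
    """
    active_cycles = idle_cycles = 0
    for duration_us, idle in timeline:
        hz = second_hz if idle else first_hz
        cycles = duration_us * hz // 1_000_000
        if idle:
            idle_cycles += cycles
        else:
            active_cycles += cycles
    return active_cycles, idle_cycles
```

For a component that is busy for 1 ms and then idle for 1 s, an assumed 1 GHz first clock delivers 1,000,000 cycles while busy, and a 20 kHz second clock delivers 20,000 cycles while idle. With conventional clock gating the idle count would be zero, meaning the component would not toggle at all during the idle interval.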
In block 602 of
In block 604, a determination is made as to whether the clocked component is idle (e.g., in a state in which it could conventionally be clock gated).
In block 606, in response to determining that the clocked component is idle, the clock signal is changed to a non-zero second frequency that is less than the first frequency. The clocked component continues to receive a clock signal (at the second frequency) even though the component is idle.
In block 608, if the clocked component is not idle, then it continues to receive the clock signal at the first frequency.
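By way of illustration only, blocks 604 through 608 can be sketched as a simple control loop. The function name, its parameters, and the polling model are hypothetical; in particular, block 602 is assumed here to correspond to initially supplying the clock signal at the first frequency.

```python
from typing import Callable, Iterator


def run_method(is_idle: Callable[[], bool], first_hz: int, second_hz: int,
               steps: int) -> Iterator[int]:
    """Yield the frequency supplied to the clocked component at each step."""
    if not 0 < second_hz < first_hz:
        raise ValueError("second frequency must be non-zero and below the first")
    current_hz = first_hz              # block 602 (assumed): clock at first frequency
    for _ in range(steps):
        if is_idle():                  # block 604: is the component idle?
            current_hz = second_hz     # block 606: change to the slow, non-zero clock
        else:
            current_hz = first_hz      # block 608: remain at the first frequency
        yield current_hz
```

In this sketch the is_idle callback stands in for whatever workload or idleness information the controller accesses; the steps need not be performed by polling, and an event-driven arrangement would serve equally well.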
In summary, embodiments according to the present disclosure replace conventional clock gating with clock slowdown. When a clocked component is idle and conventionally would be clock gated, the component is instead supplied a clock signal that is much slower than the clock signal it receives during normal operation. By toggling the component with a slow clock signal even when the component is idle (that is, the component changes state in response to the rising and/or falling edges of the slow clock signal even while idle), BTI aging can be slowed. Power savings that are only slightly less than the power savings provided by conventional clock gating continue to be realized, as dynamic power at the slower clock rate (e.g., 20-50 kHz) is negligible; that is, use of a slower clock as disclosed herein only slightly increases power consumption relative to conventional clock gating.
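By way of illustration only, the small size of the added power can be estimated with the standard first-order relation for CMOS switching power, in which dynamic power is proportional to activity factor, switched capacitance, the square of the supply voltage, and clock frequency. The sketch below covers switching power only and does not model leakage; the function name and every numeric value other than the 50 kHz second frequency are assumptions chosen for the example.

```python
def dynamic_power_w(activity, capacitance_f, vdd_v, freq_hz):
    """First-order CMOS switching power: P = activity * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


# Hypothetical operating point (assumed values, not taken from the disclosure).
ACTIVITY = 0.1          # fraction of capacitance switched per cycle
CAPACITANCE_F = 1e-9    # total switched capacitance, in farads
VDD_V = 0.8             # supply voltage, in volts
FIRST_HZ = 1e9          # assumed normal clock frequency
SECOND_HZ = 50e3        # slow clock, upper end of the 20-50 kHz example range

p_first = dynamic_power_w(ACTIVITY, CAPACITANCE_F, VDD_V, FIRST_HZ)
p_second = dynamic_power_w(ACTIVITY, CAPACITANCE_F, VDD_V, SECOND_HZ)
ratio = p_second / p_first  # equals SECOND_HZ / FIRST_HZ when all else is unchanged
```

Because the other factors are unchanged, the ratio of switching power reduces to the ratio of the two frequencies. Under these assumptions the slow clock dissipates roughly 0.005% of the switching power of the normal clock, compared with zero for a gated clock.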
While the foregoing disclosure sets forth various embodiments using specific block diagrams, flowcharts, and examples, each block diagram component, flowchart step, operation, and/or component described and/or illustrated herein may be implemented, individually and/or collectively, using a wide range of hardware, software, or firmware (or any combination thereof) configurations. In addition, any disclosure of components contained within other components should be considered as examples because many other architectures can be implemented to achieve the same functionality.
The process parameters and sequence of steps described and/or illustrated herein are given by way of example only. For example, while the steps illustrated and/or described herein may be shown or discussed in a particular order, these steps do not necessarily need to be performed in the order illustrated or discussed. The various example methods described and/or illustrated herein may also omit one or more of the steps described or illustrated herein or include additional steps in addition to those disclosed.
While various embodiments have been described and/or illustrated herein in the context of fully functional computing systems, one or more of these example embodiments may be distributed as a program product in a variety of forms, regardless of the particular type of computer-readable media used to actually carry out the distribution. The embodiments disclosed herein may also be implemented using software modules that perform certain tasks. These software modules may include script, batch, or other executable files that may be stored on a computer-readable storage medium or in a computing system. These software modules may configure a computing system to perform one or more of the example embodiments disclosed herein. One or more of the software modules disclosed herein may be implemented in a cloud computing environment. Cloud computing environments may provide various services and applications via the Internet. These cloud-based services (e.g., software as a service, platform as a service, infrastructure as a service, etc.) may be accessible through a Web browser or other remote interface. Various functions described herein may be provided through a remote desktop environment or any other cloud-based computing environment.
The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, to thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as may be suited to the particular use contemplated.
Embodiments according to the invention are thus described. While the present disclosure has been described in particular embodiments, it should be appreciated that the invention should not be construed as limited by such embodiments, but rather construed according to the below claims.