Method and apparatus for atomic frequency and voltage changes

Information

  • Patent Grant
  • 9348407
  • Patent Number
    9,348,407
  • Date Filed
    Wednesday, November 12, 2014
  • Date Issued
    Tuesday, May 24, 2016
Abstract
A method and apparatus for atomic frequency and voltage changes in the processor. In one embodiment of the invention, the atomic frequency and voltage changes in the processor are feasible due to the enabling technology of fully integrated voltage regulators (FIVR) that are integrated in the processor. FIVR allows independent configuration of each core in the processor and the configuration includes, but is not limited to, voltage setting, frequency setting, clock setting and other parameters that affect the power consumption of each core.
Description
FIELD OF THE INVENTION

This disclosure pertains to a processor, as well as code to execute thereon, and in particular but not exclusively, to a method and apparatus for atomic frequency and voltage changes in the processor.


BACKGROUND DESCRIPTION

In some processors, a ratio change of the clock source requires the voltage source of the processor and the ratio of the clock source to change in a certain order with each other. For example, in order to change the clock source to a higher frequency, the voltage source of the processor first has to be raised. The processor waits for the voltage source to be raised and once the voltage source of the processor has been raised to the new level, the ratio of the clock source can be changed to increase the frequency of the clock source.


The current sequence of steps described earlier to change the ratio of the clock source is not optimal because the processor must wait a long time for the voltage source to reach the new level before the ratio can be changed.
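By way of illustration only, the conventional ordering described above can be sketched in Python as follows. The class name LegacyCore, its fields, and the millivolt and ratio values are hypothetical and are not taken from this disclosure; the sketch only records the order of the steps and does not model real ramp times.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LegacyCore:
    """Hypothetical model of a processor that orders voltage and ratio changes."""
    voltage_mv: int
    ratio: int
    log: List[str] = field(default_factory=list)

    def raise_frequency(self, new_ratio: int, new_voltage_mv: int) -> None:
        # The voltage source has to be raised first.
        self.log.append("request_voltage")
        # The processor waits here until the voltage reaches the new level.
        self.log.append("wait_for_voltage")
        self.voltage_mv = new_voltage_mv
        # Only after the voltage is at the new level is the ratio changed.
        self.log.append("change_ratio")
        self.ratio = new_ratio


core = LegacyCore(voltage_mv=800, ratio=20)
core.raise_frequency(new_ratio=30, new_voltage_mv=1000)
print(core.log)
```

The "wait_for_voltage" entry marks the waiting period that the background identifies as the source of the long delay.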





BRIEF DESCRIPTION OF THE DRAWINGS

The features and advantages of embodiments of the invention will become apparent from the following detailed description of the subject matter in which:



FIG. 1 illustrates the block diagram of a processor in accordance with one embodiment of the invention;



FIG. 2 illustrates the operations to change the clock ratio in a processor in accordance with one embodiment of the invention; and



FIG. 3 illustrates a system to implement the methods disclosed herein in accordance with one embodiment of the invention.





DETAILED DESCRIPTION

Embodiments of the invention described herein are illustrated by way of example and not by way of limitation in the accompanying figures. For simplicity and clarity of illustration, elements illustrated in the figures are not necessarily drawn to scale. For example, the dimensions of some elements may be exaggerated relative to other elements for clarity. Further, where considered appropriate, reference numerals have been repeated among the figures to indicate corresponding or analogous elements. Reference in the specification to “one embodiment” or “an embodiment” of the invention means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. Thus, the appearances of the phrase “in one embodiment” in various places throughout the specification are not necessarily all referring to the same embodiment.


Embodiments of the invention provide a method and apparatus for atomic frequency and voltage changes in a processor. In one embodiment of the invention, the atomic frequency and voltage changes in the processor are feasible due to the enabling technology of fully integrated voltage regulators (FIVR) that are integrated in the processor. FIVR allows independent configuration of each core in the processor and the configuration includes, but is not limited to, voltage setting, frequency setting, clock setting and other parameters that affect the power consumption of each core. The processor comprises one or more processing or logic cores in one embodiment of the invention.



FIG. 1 100 illustrates the block diagram of a processor 105 in accordance with one embodiment of the invention. The processor 105 has core 1 110 and core 2 120. The core n 130 illustrates that the processor 105 can have any number of cores. Each of the cores in the processor 105 is coupled with a voltage regulator and a clock generator. For example, in one embodiment of the invention, the core 1 110 is coupled with the voltage regulator 1 112 and the clock generator or source 1 114. The core 2 120 is coupled with the voltage regulator 2 122 and the clock generator 2 124. The voltage regulator n 132 and clock generator n 134 are coupled with the core n 130.


The voltage regulators 1 112, 2 122, and n 132 are coupled with an external voltage regulator 140 in one embodiment of the invention. The voltage regulators 1 112, 2 122, and n 132 each provide one or more separate power domains to their respective cores. The clock generators 1 114, 2 124, and n 134 each provide one or more separate clocks to their respective cores. The integration of the voltage regulators and the clock generators into the processor 105 enables each core to perform an atomic ratio change of its clock generator in one embodiment of the invention. Doing so improves the performance of the processor 105 and reduces the power consumed in changing the clock ratio of the clock generators.
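As a minimal sketch only, the coupling of each core to its own voltage regulator and clock generator can be modeled as a simple data structure. The Python class names, the build_processor helper, and the default values of 900 millivolts and ratio 20 are assumptions made for illustration and do not appear in this disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VoltageRegulator:
    name: str
    millivolts: int  # illustrative value only


@dataclass
class ClockGenerator:
    name: str
    ratio: int  # illustrative value only


@dataclass
class Core:
    name: str
    regulator: VoltageRegulator  # separate power domain for this core
    clock: ClockGenerator        # separate clock for this core


def build_processor(n_cores: int) -> List[Core]:
    """Hypothetical helper: give every core its own regulator and clock."""
    return [
        Core(
            name=f"core {i}",
            regulator=VoltageRegulator(f"voltage regulator {i}", 900),
            clock=ClockGenerator(f"clock generator {i}", 20),
        )
        for i in range(1, n_cores + 1)
    ]


cores = build_processor(3)
# Reconfiguring core 1 leaves the other cores untouched.
cores[0].regulator.millivolts = 1000
cores[0].clock.ratio = 30
print([(c.name, c.regulator.millivolts, c.clock.ratio) for c in cores])
```

Because no regulator or clock object is shared in this sketch, a change made for one core cannot affect another, which mirrors the separate power domains and separate clocks described above.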



FIG. 1 100 is not meant to be limiting and the processor 105 can have any number of cores in another embodiment of the invention. In one embodiment of the invention, the cores are of the same type, i.e., homogeneous cores. The cores may also be of different types, i.e., heterogeneous cores in another embodiment of the invention. The cores may also include one or more graphics processing units (GPUs) in one embodiment of the invention. In one embodiment of the invention, the cores are able to perform atomic ratio changes of their clocks in parallel with each other and, at the same time, enter or exit different processor states (C-states). By doing so, the processor 105 can perform optimizations such as not having to change the clock ratios on active cores before waking sleeping cores.
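For illustration only, the parallel behavior described above can be approximated in software with one worker per core. The dictionary layout, the change_ratio function, and the target ratios are hypothetical; the sketch assumes each core owns its state exclusively, so no lock shared between cores is needed.

```python
from concurrent.futures import ThreadPoolExecutor


def change_ratio(core: dict, new_ratio: int) -> dict:
    """Hypothetical per-core ratio change that touches only this core's state."""
    core["clock_running"] = False  # stop this core's clock
    core["ratio"] = new_ratio      # apply the new ratio
    core["clock_running"] = True   # restart this core's clock
    return core


cores = [
    {"name": f"core {i}", "ratio": 20, "clock_running": True}
    for i in (1, 2, 3)
]
targets = [30, 25, 20]

# Each core's change is submitted independently and may run concurrently.
with ThreadPoolExecutor(max_workers=len(cores)) as pool:
    results = list(pool.map(change_ratio, cores, targets))

print([(c["name"], c["ratio"]) for c in results])
```

Threads are used here only as a software stand-in for independent hardware cores; the sketch does not model C-state entry or exit.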


In one embodiment of the invention, one voltage regulator can be coupled to more than one core. In another embodiment of the invention, one clock generator can be coupled to more than one core. One of ordinary skill in the relevant art will readily appreciate that the processor 105 can have different configurations without affecting the workings of the invention.



FIG. 2 200 illustrates the operations to perform atomic clock ratio change in a processor in accordance with one embodiment of the invention. For clarity of illustration, FIG. 2 is discussed with reference to FIG. 1. In step 210, the processor 105 receives a request to change the clock frequency or ratio of one or more of its cores. For ease of illustration, the core 1 110 is assumed to receive a request to change the clock frequency of one or more of the clocks provided by the clock generator or source 1 114.


In step 220, the core 1 110 stops the clock(s) whose frequency was requested to change. The clock generator 1 114 includes, but is not limited to, a phase locked loop (PLL), a delayed PLL (DLL) and other clock sources. In one embodiment of the invention, when clock generator 1 114 is a PLL or a DLL, the core 1 110 stops the clock by deactivating the reference clock to the PLL or DLL. One of ordinary skill in the relevant art will readily appreciate that there are other ways to stop the clock generator 1 114 and these other ways can be used without affecting the workings of the invention.


In step 230, the core 1 110 ramps or changes the voltage provided by the voltage regulator 1 112 to a level that supports the requested clock frequency. In one embodiment of the invention, the core 1 110 sends a voltage identification (VID) associated with the level that supports the requested clock ratio or frequency to the voltage regulator 1 112. Once the voltage change in step 230 has been completed, the core 1 110 changes the clock(s) to the requested frequency in step 240. For example, in one embodiment of the invention, when clock generator 1 114 is a PLL or a DLL, the core 1 110 changes the ratio of the clock and waits for the PLL or DLL to lock to the requested frequency.


When the clock(s) has changed to the requested frequency, the core 1 110 continues its execution in step 250. The operations described for core 1 110 apply to the other cores in the processor 105 and shall not be described herein. The operations illustrated in FIG. 2 require less time to change the frequency of the clock(s) in one embodiment of the invention. This is because there is no long delay to wait for the voltage to change to the level that supports the requested frequency.
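By way of example and not limitation, steps 210 through 250 can be sketched as a single method on a simplified core model. The class name Core, the VID_FOR_RATIO table, and the VID values 0x40 and 0x58 are invented for illustration; the sketch treats the voltage change and the PLL or DLL lock as instantaneous and only captures the order of the operations.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical mapping from clock ratio to voltage identification (VID).
VID_FOR_RATIO: Dict[int, int] = {20: 0x40, 30: 0x58}


@dataclass
class Core:
    ratio: int
    vid: int
    clock_running: bool = True
    trace: List[str] = field(default_factory=list)

    def atomic_ratio_change(self, new_ratio: int) -> None:
        # Step 210: the request to change the clock ratio is received.
        self.trace.append("210 request received")
        # Step 220: stop the clock whose frequency is to change.
        self.clock_running = False
        self.trace.append("220 clock stopped")
        # Step 230: send the VID for the level that supports the new ratio.
        self.vid = VID_FOR_RATIO[new_ratio]
        self.trace.append("230 voltage changed")
        # Step 240: change the ratio (a real PLL or DLL would then lock).
        self.ratio = new_ratio
        self.trace.append("240 ratio changed")
        # Step 250: the core continues its execution.
        self.clock_running = True
        self.trace.append("250 execution continues")


core = Core(ratio=20, vid=VID_FOR_RATIO[20])
core.atomic_ratio_change(30)
print(core.trace)
```

In this sketch the clock is stopped for the whole of steps 220 through 240, so the core never executes at a ratio that the present voltage does not support.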



FIG. 3 illustrates a system or platform 300 to implement the methods disclosed herein in accordance with one embodiment of the invention. The system 300 includes, but is not limited to, a desktop computer, a tablet computer, a laptop computer, a netbook, a notebook computer, a personal digital assistant (PDA), a server, a workstation, a cellular telephone, a mobile computing device, a smart phone, an Internet appliance or any other type of computing device. In another embodiment, the system 300 used to implement the methods disclosed herein may be a system on a chip (SOC) system.


The processor 310 has a processing core 312 to execute instructions of the system 300. The processing core 312 includes, but is not limited to, fetch logic to fetch instructions, decode logic to decode the instructions, execution logic to execute instructions and the like. The processor 310 has a cache memory 316 to cache instructions and/or data of the system 300. In another embodiment of the invention, the cache memory 316 includes, but is not limited to, level one, level two and level three cache memory or any other configuration of the cache memory within the processor 310.


The memory control hub (MCH) 314 performs functions that enable the processor 310 to access and communicate with a memory 330 that includes a volatile memory 332 and/or a non-volatile memory 334. The volatile memory 332 includes, but is not limited to, Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), RAMBUS Dynamic Random Access Memory (RDRAM), and/or any other type of random access memory device. The non-volatile memory 334 includes, but is not limited to, NAND flash memory, phase change memory (PCM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), or any other type of non-volatile memory device.


The memory 330 stores information and instructions to be executed by the processor 310. The memory 330 may also store temporary variables or other intermediate information while the processor 310 is executing instructions. The chipset 320 connects with the processor 310 via Point-to-Point (PtP) interfaces 317 and 322. The chipset 320 enables the processor 310 to connect to other modules in the system 300. In another embodiment of the invention, the chipset 320 is a platform controller hub (PCH). In one embodiment of the invention, the interfaces 317 and 322 operate in accordance with a PtP communication protocol such as the Intel® QuickPath Interconnect (QPI) or the like. The chipset 320 connects to a GPU or a display device 340 that includes, but is not limited to, liquid crystal display (LCD), cathode ray tube (CRT) display, or any other form of visual display device. In another embodiment of the invention, the GPU 340 is not connected to the chipset 320 and is part of the processor 310 (not shown).


In addition, the chipset 320 connects to one or more buses 350 and 360 that interconnect the various modules 374, 380, 382, 384, and 386. Buses 350 and 360 may be interconnected together via a bus bridge 372 if there is a mismatch in bus speed or communication protocol. The chipset 320 couples with, but is not limited to, a non-volatile memory 380, a mass storage device(s) 382, a keyboard/mouse 384 and a network interface 386. The mass storage device 382 includes, but is not limited to, a solid state drive, a hard disk drive, a universal serial bus flash memory drive, or any other form of computer data storage medium. The network interface 386 is implemented using any type of well-known network interface standard including, but not limited to, an Ethernet interface, a universal serial bus (USB) interface, a Peripheral Component Interconnect (PCI) Express interface, a wireless interface and/or any other suitable type of interface. The wireless interface operates in accordance with, but is not limited to, the IEEE 802.11 standard and its related family, Home Plug AV (HPAV), Ultra Wide Band (UWB), Bluetooth, WiMax, or any form of wireless communication protocol.


While the modules shown in FIG. 3 are depicted as separate blocks within the system 300, the functions performed by some of these blocks may be integrated within a single semiconductor circuit or may be implemented using two or more separate integrated circuits. The system 300 may include more than one processor/processing core in another embodiment of the invention.


The methods disclosed herein can be implemented in hardware, software, firmware, or any other combination thereof. Although examples of the embodiments of the disclosed subject matter are described, one of ordinary skill in the relevant art will readily appreciate that many other methods of implementing the disclosed subject matter may alternatively be used. In the preceding description, various aspects of the disclosed subject matter have been described. For purposes of explanation, specific numbers, systems and configurations were set forth in order to provide a thorough understanding of the subject matter. However, it is apparent to one skilled in the relevant art having the benefit of this disclosure that the subject matter may be practiced without the specific details. In other instances, well-known features, components, or modules were omitted, simplified, combined, or split in order not to obscure the disclosed subject matter.


The term “is operable” used herein means that the device, system, protocol, etc., is able to operate or is adapted to operate for its desired functionality when the device or system is in an off-powered state. Various embodiments of the disclosed subject matter may be implemented in hardware, firmware, software, or a combination thereof, and may be described by reference to or in conjunction with program code, such as instructions, functions, procedures, data structures, logic, application programs, design representations or formats for simulation, emulation, and fabrication of a design, which when accessed by a machine results in the machine performing tasks, defining abstract data types or low-level hardware contexts, or producing a result.


The techniques shown in the figures can be implemented using code and data stored and executed on one or more computing devices such as general purpose computers or computing devices. Such computing devices store and communicate (internally and with other computing devices over a network) code and data using machine-readable media, such as machine readable storage media (e.g., magnetic disks; optical disks; random access memory; read only memory; flash memory devices; phase-change memory) and machine readable communication media (e.g., electrical, optical, acoustical or other form of propagated signals—such as carrier waves, infrared signals, digital signals, etc.).


While the disclosed subject matter has been described with reference to illustrative embodiments, this description is not intended to be construed in a limiting sense. Various modifications of the illustrative embodiments, as well as other embodiments of the subject matter, which are apparent to persons skilled in the art to which the disclosed subject matter pertains are deemed to lie within the scope of the disclosed subject matter.

Claims
  • 1. A processor comprising: a plurality of voltage regulators; a plurality of clock generators; and a plurality of cores, wherein each core is coupled to one of the plurality of voltage regulators and one of the plurality of clock generators, and wherein at least one core is to perform an atomic ratio change of the corresponding clock generator responsive to a request to change a frequency of the at least one core.
  • 2. The processor of claim 1, wherein each clock generator comprises a phase locked loop or a delayed phase locked loop.
  • 3. The processor of claim 1, wherein the at least one core is to change a first clock ratio of the corresponding clock generator to a second clock ratio to accommodate the request.
  • 4. The processor of claim 3, wherein the at least one core is to: stop the corresponding clock generator in response to receipt of the request; change a voltage provided by a corresponding voltage regulator to a level that supports the frequency change; and change the first clock ratio of the corresponding clock generator to the second clock ratio.
  • 5. The processor of claim 4, wherein the at least one core is thereafter to determine that the corresponding clock generator is locked.
  • 6. The processor of claim 4, wherein the at least one core is to: send a voltage identification (VID) to the corresponding voltage regulator to cause the voltage change.
  • 7. The processor of claim 1, wherein the at least one core comprises: a fetch logic to fetch instructions; a decode logic to decode the fetched instructions; and an execution logic to execute the decoded instructions.
  • 8. The processor of claim 7, wherein the processor further comprises a memory controller.
  • 9. The processor of claim 1, wherein a first voltage regulator is to be associated with at least two of the plurality of cores.
  • 10. The processor of claim 1, wherein a first clock generator is to be associated with at least two of the plurality of cores.
  • 11. The processor of claim 1, wherein the plurality of cores includes at least some homogeneous cores and at least some heterogeneous cores.
  • 12. A system comprising: a memory; a processor comprising: a plurality of voltage regulators; a plurality of clock generators; and a plurality of cores, wherein each of the plurality of cores is coupled to one of the plurality of voltage regulators and one of the plurality of clock generators, and wherein at least one core is to perform an atomic ratio change of the corresponding clock generator responsive to a request to change a frequency of the at least one core; and a first voltage regulator coupled to the processor.
  • 13. The system of claim 12, wherein the at least one core is to receive the request to change a first clock ratio of the corresponding clock generator to a second clock ratio.
  • 14. The system of claim 13, wherein the at least one core is to stop the corresponding clock generator in response to receipt of the request, change a voltage provided by a corresponding voltage regulator to a level that supports the frequency change by provision of a voltage identification (VID) to the corresponding voltage regulator, and change the first clock ratio of the corresponding clock generator to the second clock ratio.
  • 15. The system of claim 12, wherein each of the plurality of voltage regulators is to provide a separate power domain to a corresponding one of the plurality of cores.
  • 16. The system of claim 12, wherein a first voltage regulator is to be associated with at least two of the plurality of cores and a first clock generator is to be associated with at least two of the plurality of cores.
  • 17. A non-transitory machine-readable medium having stored thereon instructions, which if performed by a machine cause the machine to perform a method comprising: performing, by one of a plurality of cores of a multicore processor, each core coupled to a voltage regulator and a clock generator of the multicore processor, an atomic ratio change of the corresponding clock generator from a first clock ratio to a second clock ratio responsive to a request for a frequency change, the multicore processor including a plurality of voltage regulators and a plurality of clock generators.
  • 18. The non-transitory machine-readable medium of claim 17, wherein the method further comprises: stopping the corresponding clock generator in response to receipt of the request; changing a voltage provided by a corresponding voltage regulator to a level that supports the frequency change, and changing the first clock ratio of the corresponding clock generator to the second clock ratio.
  • 19. The non-transitory machine-readable medium of claim 18, wherein the method further comprises determining that the corresponding clock generator is locked.
  • 20. The non-transitory machine-readable medium of claim 18, wherein the method further comprises: sending a voltage identification (VID) to the corresponding voltage regulator to cause the voltage change.
Related Publications (1)
Number Date Country
20150134999 A1 May 2015 US
Continuations (1)
Number Date Country
Parent 13976693 Jun 2013 US
Child 14539835 US