Processors, such as central processing units (CPUs), graphics processing units (GPUs), and other computing devices, generate heat when executing various software instructions. In configurations where processors are in a stacked relationship with memories, such as dynamic random-access memories (DRAMs), the heat generated by the processors affects the performance of the DRAMs. Since stacked DRAMs are volatile memories, they require frequent refreshing of their storage banks, also referred to as storage arrays. The refresh rate is related to the temperature of the storage banks: in general, the higher the temperature of a storage bank, the higher its refresh rate. While DRAM banks are being refreshed, access to the data in those banks is delayed, such that the performance of the processor(s) is impacted.
Methods and apparatus leverage software hints indicating future processor usage to manage data in memory. In stacked memory-processor architectures, the heat generated by processor(s) changes the temperature associated with the memory and affects the performance of the processor(s). In some variations, the apparatus generates a thermal gradient prediction associated with the stacked architecture based at least in part on the software hints and manages data in the stacked memory based at least in part on the thermal gradient prediction.
In certain implementations, a method for managing data in one or more memories includes receiving at least one hint related to future processor usage of a software component, such as a software application, where the future processor usage is indicative of future usage of the one or more processors when executing at least part of the software component. In some instances, the method includes selecting a memory location in the one or more memories for data used by the software component based on the hint and allocating the data used by the software component at the selected memory location in the one or more memories.
In some examples, the method includes generating a thermal gradient prediction for the one or more memories based at least in part on the hint, where selecting a memory location includes selecting the memory location based on the thermal gradient prediction of the one or more memories. In some examples, the method includes receiving temperature data associated with the one or more memories, where the temperature data is collected by a plurality of temperature sensors. In some instances, the method includes generating the thermal gradient prediction for the one or more memories based at least in part on the hint and the received temperature data. In some examples, the thermal gradient prediction for the one or more memories includes a spatial map of multiple memory layers indicating temperatures, temperature differences, and/or other temperature information at various three-dimensional physical positions of the multiple memory layers.
In some implementations, the one or more memories include DRAMs. In some implementations, the one or more memories are disposed in a stacked relation with respect to the one or more processors. In certain examples, the method includes selecting one or more memory locations in the one or more memories for the software component prior to execution of the at least part of the software component, where the selected one or more memory locations are predicted to be at a lower temperature than other memory locations of the one or more memories. In certain instances, the method includes migrating the data used by the software component from a first memory location in the one or more memories to the selected memory location in the one or more memories for the software component prior to execution of the at least part of the software component, where the selected memory location is predicted to be at a lower temperature than the first memory location of the one or more memories.
In some examples, the software component comprises an application that includes executable code that includes the hint, where the software component is configured to write the hint to a register. In some instances, the method includes receiving the hint from the register, where the hint is generated based upon an analysis of the software component.
In some instances, the method includes determining temperature information of the one or more memories when the software component is being executed by the one or more processors; and migrating the data used by the software component from a third memory location in the one or more memories to a fourth memory location in the one or more memories based on the determined temperature information. The fourth memory location is different from the third memory location.
In some implementations, the hint includes a hint indicative of a processor priority of the at least part of the software component. In some instances, the hint comprises a hint indicative of a first processor of the one or more processors executing the software component concurrently with a second processor of the one or more processors executing the software component, where the second processor is different from the first processor.
In certain implementations, an apparatus includes one or more processors and a memory allocation logic that receives at least one hint related to future processor usage of a software component. In some implementations, the future processor usage is indicative of future usage of the one or more processors when executing at least part of the software component. The memory allocation logic selects a memory location in the one or more memories for data used by the software component based on the hint and allocates the data used by the software component at the selected memory location in the one or more memories.
In some examples, the memory allocation logic generates a thermal gradient prediction for the one or more memories based at least in part on the hint. In some instances, the memory allocation logic selects the memory location based on the thermal gradient prediction of the one or more memories.
In some implementations, the memory allocation logic receives temperature data associated with the one or more memories, the temperature data collected by a plurality of temperature sensors and generates the thermal gradient prediction for the one or more memories based at least in part on the hint and the received temperature data. In some examples, the memory allocation logic selects one or more memory locations in the one or more memories for the software component prior to execution of the at least part of the software component, where the selected one or more memory locations are predicted to be at a lower temperature than other memory locations of the one or more memories.
In certain implementations, the memory allocation logic migrates the data used by the software component from a first memory location in the one or more memories to the selected memory location in the one or more memories for the software component prior to execution of the at least part of the software component, where the selected memory location is predicted to be at a lower temperature than the first memory location of the one or more memories. In some examples, the software component includes an application that includes executable code that comprises the hint and is configured to write the hint to a register. In some instances, the memory allocation logic receives the hint from the register, where the hint is generated based upon an analysis of the software component.
In some implementations, the memory allocation logic determines temperature information of the one or more memories when the software component is executed by the one or more processors. In some instances, the memory allocation logic migrates the data used by the software component from a third memory location in the one or more memories to a fourth memory location in the one or more memories based on the determined temperature information, where the fourth memory location is different from the third memory location. In some examples, the hint comprises a hint indicative of a processor priority of the at least part of the software component.
In certain implementations, a method for managing data in one or more memories disposed in a stacked relation with respect to one or more processors includes receiving at least one hint indicating future processor usage of a software component, the future processor usage indicative of future usage of the one or more processors when executing at least part of the software component. In some instances, the method includes receiving temperature data associated with the one or more memories, the temperature data collected by a plurality of temperature sensors. In some instances, the method includes generating a thermal gradient prediction for the one or more memories based at least in part on the hint and the received temperature data. In some instances, the method includes allocating one or more memory locations in the one or more memories for the software component prior to execution of the at least part of the software component based on the thermal gradient prediction of the one or more memories, where the allocated one or more memory locations are predicted to be at a lower temperature than other memory locations of the one or more memories.
In some examples, the computing device 100 includes one or more memories 135, such as DRAMs 110, a memory allocation logic 115, one or more processors (e.g., central processing unit (CPU), graphics processing unit (GPU), general purpose GPU (GPGPU), accelerated processing unit (APU), and/or compute unit (CU)) 130, register(s) 140, memory controller(s) 145, a power manager 150, and temperature sensor(s) 160. Any number of additional components, different components, and/or combinations of components is also included in the computing device 100. One or more of the components are optional to the computing device 100.
In some implementations, the computing device 100 includes one or more address buses and/or data buses that, directly and/or indirectly, couple various components of the computing device 100. In some designs, any number of the components of computing device 100, or combinations thereof, may be distributed and/or duplicated across a number of computing devices. In some variations, the computing device 100 includes any number of processors (e.g., CPUs, GPUs, etc.) 130. For example, in one variation, the computing device 100 includes one CPU. In other variations, the computing device 100 includes two or five CPUs. For example, in one variation, the computing device 100 includes one GPU. In other variations, the computing device 100 includes ten or fifteen GPUs.
In some implementations, an application 120, which includes executable instructions stored in memory(s) 135, is loaded on the one or more processors 130 to be executed by the one or more processors 130. As used herein, a processor refers to one or more CPUs, GPUs, GPGPUs, APUs, and/or other processing units. In some variations, the application 120 is also referred to as a software component, which includes a plurality of software instructions to be executed by a processor. In some variations, the software instructions include instructions in a high-level programming language, which is also referred to as user-level application code. In some variations, the software instructions include computer/machine readable code, or referred to as compiled code. In some variations, the application/software component 120 refers to both the user-level application code and the computer/machine readable code or any other suitable level of code.
In some implementations, the application 120 writes one or more hints to the register(s) 140 (e.g., model specific register (“MSR”)), for example, when the application 120 is loaded onto the one or more processors 130. The application 120 includes one or more code sections. In some implementations, the application 120 writes one or more hints to the register(s) 140 when the application or a code section of the application is executed by the one or more processors 130. In one example, the application 120 writes a hint indicating future processor usage of a code section of the application 120 when the application 120 is executed by the one or more processors 130. In some variations, the application 120 (e.g., the user-level application code, compiled code) includes the one or more hints. In some variations, the user-level application code includes the one or more hints. In one example, the one or more hints in the user-level application code are written by a software developer. In some variations, the one or more hints are generated by a compiler when compiling the application 120.
In some implementations, the one or more hints include a hint indicative of future processor usage of the application 120. In some implementations, the one or more hints include a hint indicative of a priority of the future processor usage of the application 120. In some implementations, the one or more hints include a hint indicative of future processor usage of a code section of the application 120. In some implementations, the one or more hints include a hint indicative of a priority of the future processor usage of the code section of the application 120. As used herein, a priority of processor usage is a relative priority value with respect to other applications/software components. In some implementations, other software hints (e.g., processor intensity, etc.) are used for managing data in the stacked DRAMs 110.
In some implementations, the one or more hints include a CPU usage hint indicative of future CPU usage of the application 120. In some implementations, the one or more hints include a CPU priority hint indicative of a priority of the future CPU usage of the application 120. In some implementations, the one or more hints include a CPU usage hint indicative of future CPU usage of a code section of the application 120. In some implementations, the one or more hints include a CPU priority hint indicative of a priority of the future CPU usage of the code section of the application 120.
In some implementations, the one or more hints include a GPU usage hint indicative of future GPU usage of the application 120. In some implementations, the one or more hints include a GPU priority hint indicative of a priority of the future GPU usage of the application 120. In some implementations, the one or more hints include a GPU usage hint indicative of future GPU usage of a code section of the application 120. In some implementations, the one or more hints include a GPU priority hint indicative of a priority of the future GPU usage of the code section of the application 120.
In some implementations, the one or more hints include a CPU/GPU usage hint indicative of future CPU and/or GPU usage of the application 120. In some implementations, the one or more hints include a CPU/GPU priority hint indicative of a priority of the future CPU and/or GPU usage of the application 120. In some implementations, the one or more hints include a CPU/GPU usage hint indicative of future CPU and/or GPU usage of a code section of the application 120. In some implementations, the one or more hints include a CPU/GPU priority hint indicative of a priority of the future CPU and/or GPU usage of the code section of the application 120. In some implementations, the application 120 includes executable code that includes the one or more hints and writes the one or more hints to the register(s) 140.
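As one purely illustrative possibility (the disclosure does not prescribe any particular encoding), the CPU/GPU usage and priority hints could be packed into bit fields of a single register word before the application writes it to the register(s) 140. The field layout, widths, and function names below are assumptions made for this sketch:

```python
# Hypothetical bit-field encoding for a hint register word (illustrative only):
# bits 0-7 CPU usage %, bits 8-11 CPU priority,
# bits 16-23 GPU usage %, bits 24-27 GPU priority.

def pack_hint(cpu_usage, cpu_prio, gpu_usage, gpu_prio):
    """Pack usage (0-100) and priority (0-15) hints into one 32-bit word."""
    assert 0 <= cpu_usage <= 100 and 0 <= gpu_usage <= 100
    assert 0 <= cpu_prio <= 15 and 0 <= gpu_prio <= 15
    return cpu_usage | (cpu_prio << 8) | (gpu_usage << 16) | (gpu_prio << 24)

def unpack_hint(word):
    """Recover the four hint fields from a packed register word."""
    return {
        "cpu_usage": word & 0xFF,
        "cpu_prio": (word >> 8) & 0xF,
        "gpu_usage": (word >> 16) & 0xFF,
        "gpu_prio": (word >> 24) & 0xF,
    }
```

A real implementation would write such a word to a model-specific register or similar hardware register; the memory allocation logic would then read and decode it on the other side.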
In some implementations, the one or more hints include a hint related to a first processor of the one or more processors 130 executing the application 120 concurrently with a second processor of the one or more processors 130 executing the application 120. In some variations, the one or more hints include a hint indicative of a future processor usage of both the first processor and the second processor. In some variations, the one or more hints include a hint indicative of a priority of the future processor usage of both the first processor and the second processor.
In some implementations, the monitor program 125, an optional component, runs on the one or more processors 130 to monitor processor usage of the application 120. In some variations, the monitor program 125 predicts future processor usage of the application 120, generates the one or more hints, and writes the one or more hints to the registers 140. In some variations, the one or more hints are generated based upon an analysis of the application 120. In some implementations, the memory allocation logic 115 manages data used by the application 120 (e.g., selects memory location(s) for data, allocates data, migrates data, etc.) and other applications based on the one or more hints. In some variations, data used by an application/software component includes input data, intermediately generated data, and output data of the application/software component. In some examples, the memory allocation logic 115 is implemented by the memory controller 145, firmware of a microcontroller, the one or more processors 130, and/or the like.
In some implementations, the memory allocation logic 115 generates a thermal gradient prediction for the stacked DRAMs 110 based at least in part on the one or more hints. In one example, the memory allocation logic 115 generates the thermal gradient prediction for the stacked DRAMs 110 based on a hint of future processor usage and the floorplan information of the stacked DRAMs and processors. In some examples, the thermal gradient prediction for the one or more memories includes data indicating temperatures, temperature differences, and/or other temperature information at various three-dimensional physical positions of the stacked memories. In some examples, the thermal gradient prediction for the one or more memories includes a spatial map of multiple memory layers indicating temperatures, temperature differences, and/or other temperature information at various three-dimensional physical positions of the multiple memory layers.
In some implementations, floorplan information includes thermal resistance and capacitance of each of the silicon layers, the thickness of each of the layers, the location of the heat sink, and other related information. In some variations, the memory allocation logic 115 manages the memory location for data used by the application 120 based at least in part on the thermal gradient prediction of the stacked DRAMs. In some implementations, the memory allocation logic 115 generates the thermal gradient prediction for the stacked DRAMs 110 based on temperature data associated with the stacked DRAMs 110. In some variations, the temperature data is collected by the temperature sensors 160 and received by the memory allocation logic 115.
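To make the relationship between floorplan information, a usage hint, and the predicted gradient concrete, the following deliberately simplified first-order sketch scales the hinted processor power by a per-layer thermal coupling term (standing in for the floorplan thermal resistances). The names and constants are illustrative assumptions, not part of the disclosure:

```python
# Simplified first-order thermal prediction: each memory layer's predicted
# temperature is its current sensor reading plus the hinted processor power
# scaled by that layer's thermal coupling to the processor die.

def predict_layer_temps(sensor_temps_c, hinted_power_w, coupling_k_per_w):
    """Predicted steady-state temperature per stacked memory layer (deg C)."""
    return [t + hinted_power_w * c
            for t, c in zip(sensor_temps_c, coupling_k_per_w)]
```

Layers closer to the processor would typically have larger coupling values and therefore higher predicted temperatures for the same hint, which is what makes the farther layers attractive allocation targets.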
In some implementations, the computing device 100 includes one or more temperature sensors 160. Each temperature sensor 160 detects and/or provides temperature readings or feedback to various components of the computing device 100. The temperature sensor 160 can be any sensor or transducer, such as an on-die temperature sensor, which detects temperature. In some variations, the one or more temperature sensors 160 are disposed at various location in the stacked memory-processor architecture (e.g., the memory-processor architecture 200 in
In certain implementations, the memory allocation logic 115 receives the one or more software hints. As used herein, “receive” or “receiving” includes obtaining data from a register or other data source, retrieving data from a data repository, receiving data from a communication link, and/or the like. In some implementations, the memory allocation logic 115 allocates one or more memory locations in the stacked DRAMs for the application 120 prior to execution of the code section of the software component, where the allocated one or more memory locations are predicted to be at a lower temperature than other memory locations of the stacked DRAMs.
In some implementations, the memory allocation logic 115 migrates the data used by the software component from a first memory location in the stacked DRAMs to a second memory location in the stacked DRAMs for the software component prior to execution of the code section of the software component, where the second memory location is predicted to be at a lower temperature than other memory locations of the stacked DRAMs. In some implementations, the memory allocation logic 115 receives the one or more hints including one or more executable instructions (e.g., malloc, load, store, read, write, etc.).
In some implementations, the memory allocation logic 115 predicts temperature information of the stacked DRAMs when the application 120 or a code section of the application 120 is being executed by the one or more processors 130 and migrates the data used by the application 120 from a first memory location in the stacked DRAMs 110 to a second memory location in the stacked DRAMs 110 based on the predicted temperature information and/or thermal gradient prediction for the stacked DRAMs 110, where the second memory location is different from the first memory location. In some examples, the memory allocation logic 115 migrates the frequently accessed data to memory locations with longer retention times, such as memory locations having predicted lower temperature than some other memory locations.
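A hedged sketch of such a migration policy (the data structures and threshold are hypothetical, since the disclosure does not specify a particular policy) moves the most frequently accessed pages to the banks predicted to be coolest:

```python
# Illustrative migration policy: frequently accessed pages move to the banks
# predicted coolest, since cooler banks refresh less often and stall accesses
# less. All names, structures, and thresholds here are assumptions.

def pick_migration_target(predicted_bank_temps, exclude):
    """Return the coolest bank index, skipping the page's current bank."""
    candidates = [(t, b) for b, t in enumerate(predicted_bank_temps)
                  if b != exclude]
    return min(candidates)[1]

def plan_migrations(page_access_counts, page_to_bank,
                    predicted_bank_temps, hot_threshold):
    """Map each frequently accessed page to a cooler destination bank."""
    plan = {}
    for page, count in page_access_counts.items():
        if count < hot_threshold:
            continue  # cold data stays where it is
        src = page_to_bank[page]
        dst = pick_migration_target(predicted_bank_temps, exclude=src)
        if predicted_bank_temps[dst] < predicted_bank_temps[src]:
            plan[page] = (src, dst)
    return plan
```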
In some implementations, the memory allocation logic 115 determines current temperature information of the stacked DRAMs when the application 120 is being executed by the one or more processors 130 and migrates the data used by the application 120 from a first memory location in the stacked DRAMs to a second memory location in the stacked DRAMs based on the current temperature information, where the second memory location is different from the first memory location. In some examples, the memory allocation logic 115 migrates the frequently accessed data to memory locations with longer retention times, such as memory locations having lower temperature currently than some other memory locations.
In some instances, the memory allocation logic 115 monitors and predicts localized temperatures, and thereby localized refresh rates, within the stacked DRAM to understand and take advantage of a respective actual or required refresh rate of each location/region of memory. According to certain embodiments, DRAM retention time variations are exposed to a hardware component (e.g., a memory controller 145) or to a system software component (e.g., an operating system (OS) or a hypervisor). The hardware or software component performs a retention-aware data placement thereby improving memory access performance and reducing the chance for memory access collisions. Using this approach, refresh rate changes are detected, and data are moved to a new location based on the detected refresh rate changes.
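The refresh-rate dependence that retention-aware placement exploits can be sketched as follows. The 7.8 µs base interval and 85 C threshold mirror common DDR-class behavior, where the refresh rate doubles in the extended temperature range, but the exact values are device-specific and used here only as assumptions:

```python
# Sketch of the temperature/refresh-rate relationship: above an extended-
# temperature threshold, DDR-class DRAM commonly halves its average refresh
# interval (tREFI), doubling refresh overhead. Values are illustrative.

def refresh_interval_us(bank_temp_c, base_trefi_us=7.8, threshold_c=85.0):
    """Average refresh interval: halved above the temperature threshold."""
    return base_trefi_us / 2 if bank_temp_c > threshold_c else base_trefi_us

def refresh_overhead(bank_temp_c, trfc_us=0.35):
    """Fraction of time a bank is unavailable to accesses (tRFC / tREFI)."""
    return trfc_us / refresh_interval_us(bank_temp_c)
```

Under these assumed numbers, a bank crossing the threshold doubles the fraction of time it is busy refreshing, which is why placing hot data in cooler banks reduces access delays.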
In some implementations, the memory allocation logic 115 coordinates with the memory controller(s) 145 to allocate and/or migrate data used by the application 120 or a code section of the application 120. In some examples, a memory controller 145 controls memory access to (e.g., sending read requests, sending write requests, etc.) the stacked DRAMs 110. In some examples, the computing device 100 includes a plurality of memory controllers 145. In some variations, a memory controller 145 controls a portion of the stacked DRAMs 110. In some other variations, a memory controller 145 controls multiple stacked DRAMs 110.
In some implementations, the memory allocation logic 115 coordinates with the power manager 150 (e.g., dynamic voltage frequency setting (“DVFS”) control, firmware power management, etc.) to manage data for stacked DRAMs. In some instances, the DVFS control modulates the clock frequencies of the one or more processors 130 to manage the power consumed by the one or more processors 130. In some implementations, various memory allocation embodiments are used in stacked DRAMs and processor architectures. In some implementations, various memory allocation embodiments are used in other stacked memory-processor architectures.
In some implementations, the present disclosure provides a solution that uses hints of future processor usage to predict temperature and/or thermal gradient and select a memory location at which to allocate or migrate data for a software application. Such a solution selects memory locations proactively, in contrast to systems that migrate data in memory based upon the current temperature or temperature gradient. In some implementations, this proactive use of future-processor-usage hints manages data in memory more effectively and efficiently than the reactive approach of migrating data based upon the current temperature or temperature gradient.
In some implementations, the software component writes one or more hints to the register(s) (e.g., model specific register (“MSR”)), for example, when the software component is loaded onto one or more processors. The software component includes one or more code sections. In some implementations, the software component writes one or more hints to the register(s) when it is executed by the one or more processors. In one example, the software component writes a hint indicating future processor usage of a code section of the software component when the software component is executed by the one or more processors. In some variations, the software component (e.g., the user-level application code, compiled code) includes the one or more hints. In some variations, the user-level application code includes the one or more hints. In one example, the one or more hints in the user-level application code are written by a software developer. In some variations, the one or more hints are generated by a compiler when compiling the software component. In some other variations, the one or more hints are generated by a monitor program (e.g., the monitor program 125 in
In some implementations, the one or more hints include a hint indicative of future processor usage of the software component. In some implementations, the one or more hints include a hint indicative of a priority of the future processor usage of the software component. In some implementations, the one or more hints include a hint indicative of future processor usage of a code section of the software component. In some implementations, the one or more hints include a hint indicative of a priority of the future processor usage of the code section of the software component. As used herein, a priority of processor usage is a relative priority value with respect to other applications/software components.
In some implementations, the one or more hints include a CPU usage hint indicative of future CPU usage of the software component. In some implementations, the one or more hints include a CPU priority hint indicative of a priority of the future CPU usage of the software component. In some implementations, the one or more hints include a CPU usage hint indicative of future CPU usage of a code section of the software component. In some implementations, the one or more hints include a CPU priority hint indicative of a priority of the future CPU usage of the code section of the software component.
In some implementations, the one or more hints include a GPU usage hint indicative of future GPU usage of the software component. In some implementations, the one or more hints include a GPU priority hint indicative of a priority of the future GPU usage of the software component. In some implementations, the one or more hints include a GPU usage hint indicative of future GPU usage of a code section of the software component. In some implementations, the one or more hints include a GPU priority hint indicative of a priority of the future GPU usage of the code section of the software component.
In some implementations, the one or more hints include a CPU/GPU usage hint indicative of future CPU and/or GPU usage of the software component. In some implementations, the one or more hints include a CPU/GPU priority hint indicative of a priority of the future CPU and/or GPU usage of the software component. In some implementations, the one or more hints include a CPU/GPU usage hint indicative of future CPU and/or GPU usage of a code section of the software component. In some implementations, the one or more hints include a CPU/GPU priority hint indicative of a priority of the future CPU and/or GPU usage of the code section of the software component. In some implementations, the software component includes executable code that includes the one or more hints and writes the one or more hints to a hardware register.
In some implementations, the one or more hints include a hint related to a first processor executing the software component concurrently with a second processor executing the software component. In some variations, the one or more hints include a hint indicative of a future processor usage of both the first processor and the second processor. In some variations, the one or more hints include a hint indicative of a priority of the future processor usage of both the first processor and the second processor.
In some implementations, a monitor program is running on the one or more processors to monitor processor usage and application behavior of the software component. In some variations, the monitor program predicts future processor usage of the software component and generates the one or more hints and writes the one or more hints to the register(s). In some variations, the one or more hints are generated based upon an analysis of the software component. In some implementations, the memory allocation logic manages data used by the software component (e.g., selects memory location(s), allocates data, migrates data, etc.) and other applications based on the one or more hints. In some examples, the memory allocation logic is implemented by memory controller(s) (e.g., memory controller 145 in
In some implementations, the memory allocation logic receives temperature data associated with stacked DRAMs (315). In some variations, the temperature data includes temperature data associated with processors. In some implementations, the temperature data is collected by one or more temperature sensors disposed at various locations of the stacked memory-processor architecture. Each temperature sensor detects and/or provides temperature readings or feedback to the memory allocation logic. The temperature sensor(s) can be any sensor or transducer, such as an on-die temperature sensor, which detects temperature.
In some implementations, the memory allocation logic generates a thermal gradient prediction for the stacked DRAMs (320). In some variations, the thermal gradient prediction is generated based at least in part on the hints. In some variations, the memory allocation logic generates the thermal gradient prediction for the stacked DRAMs based at least in part on the at least one hint and the received temperature data. In one example, the memory allocation logic generates the thermal gradient prediction for the stacked DRAMs based on a hint of future processor usage and the floorplan information of the stacked DRAMs and processors.
In some variations, the memory allocation logic manages the memory location for data used by the software component (325) (e.g., selects memory location(s) for data, allocates data, migrates data, etc.) based at least in part on the one or more hints. In some instances, the memory allocation logic selects the memory location for data used by the software component based at least in part on the thermal gradient prediction of the stacked DRAMs. In one instance, the memory allocation logic allocates the data used by the software component at the selected memory location in the stacked DRAMs when the software component is loaded onto the one or more processors. In one instance, the memory allocation logic migrates the data used by the software component from a current memory location to the selected memory location in the stacked DRAMs based on hints indicative of future processor usage and the thermal gradient prediction while the software component is being executed.
In some implementations, the memory allocation logic selects one or more memory locations in the stacked DRAMs for the software component prior to execution of the code section of the software component, where the selected one or more memory locations are predicted to be at a lower temperature than other memory locations of the stacked DRAMs. In some implementations, the memory allocation logic allocates data used by the software component at the selected one or more memory locations prior to execution of the code section of the software component. In some implementations, the memory allocation logic migrates the data used by the software component from a first memory location in the stacked DRAMs to a second memory location in the stacked DRAMs for the software component prior to execution of the code section of the software component, where the second memory location is predicted to be at a lower temperature than other memory locations of the stacked DRAMs. In some implementations, the memory allocation logic receives the one or more hints including one or more executable instructions (e.g., malloc, load, store, read, write, etc.).
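The selection step above can be sketched as picking the bank with the lowest predicted temperature from the thermal gradient prediction. This is a minimal illustrative sketch, assuming the prediction is available as a mapping from bank locations to predicted temperatures; the function and structure names are hypothetical, not from the source.

```python
# Hypothetical sketch: select the stacked-DRAM bank predicted to be
# coolest, then place the software component's data there.
# `predicted_temps` maps (x, y, z) bank locations to predicted
# temperatures (e.g., degrees C); these names are assumptions.

def select_coolest_bank(predicted_temps):
    """Return the (x, y, z) bank location with the lowest predicted temperature."""
    return min(predicted_temps, key=predicted_temps.get)

def allocate(predicted_temps, data, memory):
    """Place `data` at the coolest predicted bank (illustrative only)."""
    bank = select_coolest_bank(predicted_temps)
    memory.setdefault(bank, []).append(data)
    return bank
```

In a real allocator the same policy could drive both initial placement (before the code section executes) and later migration from a hotter bank to a cooler one.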
In some implementations, the memory allocation logic determines temperature information of the stacked DRAMs while the software component is executed by one or more processors, and migrates the data used by the software component from a first memory location in the stacked DRAMs to a second memory location in the stacked DRAMs based on the determined temperature information, where the second memory location is different from the first memory location.
Referring back to
In some implementations, the computing device predicts a spatial thermal gradient map of the 3D stack (430) based on the expected processor utilization. In some implementations, the 3D stack refers to the stacked memory-processor architecture. In some implementations, the spatial thermal gradient map is predicted based on current temperature data. More details on the spatial thermal gradient map are provided below.
In some implementations, the computing device checks whether the predicted temperature at a relevant memory location is greater than a threshold (435). In some variations, the relevant location is the memory location(s) of data used by the specific code section and/or the software component. In some variations, the relevant location includes any location of the memory. In some implementations, the threshold is a predetermined threshold. In some implementations, the threshold is adjusted by the computing device. If the predicted temperature is greater than the threshold, the computing device migrates or allocates data based on temperature prediction (440), for example, using the predicted spatial thermal gradient map. If the predicted temperature is not greater than the threshold, the computing device goes back to read the next software hint.
In some implementations, the loop in method 400 is executed periodically. In some variations, the loop is executed when a new hint is received or written to the register.
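One iteration of the loop described above (read a hint, predict the temperature at the relevant location, migrate only if the prediction exceeds the threshold) can be sketched as follows. This is a hedged sketch; the callables and names are illustrative assumptions, not the actual method 400 implementation.

```python
# Illustrative sketch of one iteration of the hint-driven decision loop:
# if the predicted temperature at the data's location exceeds the
# threshold, migrate the data; otherwise leave it in place.
# `predicted_temp_at` and `migrate` are assumed, hypothetical callables.

def process_hint(predicted_temp_at, data_location, threshold, migrate):
    """Return the data's location after applying the threshold policy."""
    temp = predicted_temp_at(data_location)
    if temp > threshold:
        return migrate(data_location)  # relocate based on temperature prediction
    return data_location  # below threshold: keep data where it is
```

In practice the loop would run periodically or on each new hint written to the register, as the text notes.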
In the hardware-only based method, the processor is agnostic to the re-mapping: the memory controller receives the physical address requested by the CPU or GPU and translates that physical address further based on the corresponding new address range.
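The transparent translation described above can be sketched as a range-based lookup in the memory controller. This is a simplified illustration under assumed table entries; real controllers typically remap at a fixed granularity in hardware rather than with per-range arithmetic like this.

```python
# Illustrative sketch of memory-controller address re-mapping:
# a processor-issued physical address falling inside a remapped
# range is translated to a new base; other addresses pass through.
# The (base, limit, new_base) table entries are assumptions.

def remap(addr, remap_table):
    """Translate a physical address using (base, limit, new_base) entries."""
    for base, limit, new_base in remap_table:
        if base <= addr < limit:
            return new_base + (addr - base)
    return addr  # not in any remapped range: identity mapping
```

Because the translation happens below the physical address the processor sees, no page tables or software state need to change when data is moved.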
Power(x,y,z)=α(x,y,z)×f(x,y,z)³ (1),
where Power(x, y, z) is the predicted power usage at the processor layer location (x, y, z), α(x, y, z) is the processor utilization information at the processor layer location, and f(x, y, z) is the processor frequency at the processor layer location.
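Equation (1) can be written directly as code: predicted power scales linearly with utilization and with the cube of frequency. The units below (GHz, arbitrary power units) are illustrative assumptions.

```python
# Equation (1) as code: Power(x, y, z) = alpha(x, y, z) * f(x, y, z)**3,
# where alpha is the processor utilization at the layer location and
# f is the processor frequency there. Units are illustrative.

def predicted_power(alpha, freq):
    """Predicted power usage at a processor layer location."""
    return alpha * freq ** 3
```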
In some implementations, the computing device determines future power usage by a memory bank (e.g., a DRAM bank) at a memory location (630) using the floorplan of the 3D stack (625). In some implementations, the floorplan information includes the thermal resistance and capacitance of each of the silicon layers, the thickness of each of the layers, the location of the heat sink, and other related information. In one implementation, the current power usage is used as an estimate of future power consumption. Power(x,y,z), which is the power consumed by a memory bank at a location (x, y, z), is determined by multiplying the supplied voltage and the measured current. In one example, the current is measured via current sense resistors. In some variations, the computing device predicts the temperature at the location (640), associated with the 3D stack, based on the power usage of the processor(s) and/or memories.
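The measurement-based power estimate above can be sketched as a sense-resistor calculation. The resistor values are illustrative assumptions, chosen only to show the arithmetic.

```python
# Sketch of estimating a bank's power from present measurements
# (the text uses current power as the estimate of future power).
# Sense-resistor and supply values below are illustrative.

def measured_current(v_sense, r_sense):
    """Current through a sense resistor, from the voltage drop across it (Ohm's law)."""
    return v_sense / r_sense

def bank_power(supply_v, v_sense, r_sense):
    """Power consumed by a memory bank: supplied voltage times measured current."""
    return supply_v * measured_current(v_sense, r_sense)
```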
In one implementation, the temperature is predicted using equation (2) below:
TempFuture(x,y,z)=M(x,y,z)×Power(x,y,z)+TempCurrent(x,y,z) (2),
where TempFuture(x, y, z) is the predicted temperature at location (x, y, z), M(x, y, z) includes the thermal resistance and capacitance at location (x, y, z), Power(x, y, z) is the predicted power usage at the 3D stack location (x, y, z), and TempCurrent(x, y, z) is the current temperature at location (x, y, z). In some instances, the computing device generates the thermal gradient prediction for a stacked architecture based on the predicted temperature data. In one example, the thermal gradient prediction includes a ratio of temperature difference to physical distance between various physical locations in the stacked architecture.
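Equation (2) and the gradient described above can be sketched in a few lines. The coefficient and location values in the usage below are illustrative assumptions; the gradient is computed here as temperature difference over Euclidean distance between two locations.

```python
# Equation (2) as code, plus a thermal gradient estimate between two
# 3D-stack locations (temperature difference over physical distance).
import math

def predicted_temp(m, power, temp_current):
    """TempFuture(x, y, z) = M(x, y, z) * Power(x, y, z) + TempCurrent(x, y, z)."""
    return m * power + temp_current

def thermal_gradient(temp_a, temp_b, loc_a, loc_b):
    """Ratio of temperature difference to Euclidean distance between two locations."""
    return (temp_b - temp_a) / math.dist(loc_a, loc_b)
```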
In the example illustrated in
In the example illustrated in
In some implementations, the monitor program (720C) uses a machine learning model to predict processor usage based upon historical data on processor usage and the monitored application behavior. In one example, the monitor program uses a linear regression algorithm to predict future processor usage. In one instance, the monitor program uses a non-linear regression algorithm to predict future processor usage. In some implementations, the machine learning model includes any suitable machine learning models, deep learning models, and/or the like. In some instances, the machine learning model includes at least one of a decision tree, random forest, support vector machine, convolutional neural network, recurrent neural network, and/or the like. In some instances, the future processor usage is determined based on measurable parameters such as, for example, processor frequency, power usage, last-level cache misses, and/or the like.
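As a minimal stand-in for the linear-regression example above, future utilization can be extrapolated from recent (time, utilization) samples with an ordinary least-squares fit. This is a sketch under that assumption; a real monitor program would use richer features (frequency, power, cache misses) and likely a library model.

```python
# Minimal least-squares linear predictor for future processor
# utilization from a history of (time, utilization) samples.
# Illustrative only; names and the single-feature model are assumptions.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

def predict_usage(history, next_t):
    """Predict utilization at time `next_t` from (time, utilization) history."""
    a, b = fit_line([t for t, _ in history], [u for _, u in history])
    return a * next_t + b
```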
Although features and elements are described above in particular combinations, each feature or element can be used alone without the other features and elements or in various combinations with or without other features and elements. The apparatus described herein is, in some implementations, manufactured by using a computer program, software, or firmware incorporated in a non-transitory computer-readable storage medium for execution by a general-purpose computer or a processor. Examples of computer-readable storage mediums include a read only memory (ROM), a random-access memory (RAM), a register, cache memory, semiconductor memory devices, magnetic media such as internal hard disks and removable disks, magneto-optical media, and optical media such as CD-ROM disks and digital versatile disks (DVDs).
In the preceding detailed description of the various embodiments, reference has been made to the accompanying drawings which form a part thereof, and in which is shown by way of illustration specific preferred embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other embodiments may be utilized, and that logical, mechanical and electrical changes may be made without departing from the scope of the invention. To avoid detail not necessary to enable those skilled in the art to practice the invention, the description may omit certain information known to those skilled in the art. Furthermore, many other varied embodiments that incorporate the teachings of the disclosure may be easily constructed by those skilled in the art. Accordingly, the present invention is not intended to be limited to the specific form set forth herein, but on the contrary, it is intended to cover such alternatives, modifications, and equivalents, as can be reasonably included within the scope of the invention. The preceding detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined only by the appended claims. The above detailed description of the embodiments and the examples described therein have been presented for the purposes of illustration and description only and not by limitation. For example, the operations described are done in any suitable order or manner. It is therefore contemplated that the present invention covers any and all modifications, variations or equivalents that fall within the scope of the basic underlying principles disclosed above and claimed herein.