As the complexity of applications increases, the computing power required of processing units also increases. With current technology, a multi-core system is supplied with the same power whenever the utilization is the same.
However, as operation scenarios become more and more complex, the power saving achievable with this power supply method is highly limited. A better solution for reducing power consumption is still needed.
An embodiment provides a power consumption reduction method. The method can include defining y operation scenarios according to x types of extracted information, generating z power profiles each used for controlling power provided to a subset of a plurality of processors, assigning the z power profiles to the y operation scenarios in a machine learning model, collecting to-be-evaluated information by the plurality of processors, comparing the to-be-evaluated information with the x types of extracted information to find a most similar type of extracted information, using the machine learning model to select an optimal power profile from the z power profiles according to the most similar type of extracted information, and applying the optimal power profile to control the power provided to the subset of the plurality of processors, where x, y and z are integers larger than zero, and the subset of the plurality of processors are of a same type of processor.
Another embodiment provides a power consumption reduction system including a plurality of processors and a machine learning model. The plurality of processors can be used to run a plurality of applications and collect to-be-evaluated information. The machine learning model can have z power profiles corresponding to y operation scenarios according to x types of extracted information. The machine learning model can be linked to the plurality of processors. The machine learning model can be used to compare the to-be-evaluated information with the x types of extracted information to find a most similar type of extracted information, select an optimal power profile from the z power profiles according to the most similar type of extracted information, and apply the optimal power profile to control power provided to a subset of the plurality of processors. x, y and z are integers larger than zero, and the subset of the plurality of processors are of a same type of processor.
These and other objectives of the present invention will no doubt become obvious to those of ordinary skill in the art after reading the following detailed description of the preferred embodiment that is illustrated in the various figures and drawings.
The power consumption reduction system 100 can be a multi-core system where each of the processors 111 to 11m can correspond to a core. For example, the processors 111 to 11m can include two or more different types of processor cores. For example, the processors 111 to 11m can include a big core for heavier workloads and a small core for lighter workloads. In another embodiment, the processors 111 to 11m can include a big core, a medium core and a small core.
The machine learning model 120 can have z power profiles PF1 to PFz corresponding to y operation scenarios (expressed as S1 to Sy) according to x types of extracted information I1 to Ix, where x, y and z can be integers larger than zero, and y≥z.
Some operation scenarios are exemplified as follows: Operation scenario S1 may be playing a video game. Operation scenario S2 may be playing a film on a hard disk. Operation scenario S3 may be accessing a streaming video. Operation scenario S4 may be performing a live stream.
In another example, Operation scenario S1 may be using a browser, Operation scenario S2 may be opening a document file (e.g. a file of Microsoft Office), Operation scenario S3 may be opening a PDF (portable document format) file, and Operation scenario S4 may be opening a three-dimensional image.
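The relationship between the y operation scenarios and the z power profiles (with y≥z, so several scenarios may share one profile) can be illustrated with a minimal Python sketch. The field names `max_freq_mhz` and `voltage_mv`, the numeric values, and the particular scenario-to-profile assignment are all hypothetical and are not taken from the embodiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerProfile:
    name: str
    max_freq_mhz: int  # hypothetical setting, for illustration only
    voltage_mv: int    # hypothetical setting, for illustration only


# z = 3 power profiles (values are made up).
PROFILES = {
    "PF1": PowerProfile("PF1", 2800, 950),
    "PF2": PowerProfile("PF2", 1800, 800),
    "PF3": PowerProfile("PF3", 1200, 700),
}

# y = 4 operation scenarios; the assignment below is an assumption.
SCENARIO_TO_PROFILE = {
    "S1": "PF1",  # playing a video game
    "S2": "PF3",  # playing a film on a hard disk
    "S3": "PF2",  # accessing a streaming video
    "S4": "PF2",  # performing a live stream
}


def profile_for(scenario: str) -> PowerProfile:
    """Look up the power profile assigned to an operation scenario."""
    return PROFILES[SCENARIO_TO_PROFILE[scenario]]
```

Because the mapping is many-to-one, scenarios S3 and S4 resolve to the same profile object in this sketch.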
The x types of extracted information I1 to Ix can be extracted by detectors, counters and event interfaces in the processors 111 to 11m, and more related details will be described in
The machine learning model 120 can include a neural network, a supervised machine learning model, a non-supervised machine learning model, and/or a tree-based machine learning model. For example, the machine learning model 120 can include a neural network(s) such as a fully-connected neural network and/or a convolutional neural network. In another example, the machine learning model 120 can include a tree-based model such as a random forest model. According to embodiments, in the machine learning model 120, any appropriate neural network model(s) can be applied for machine learning.
The machine learning model 120 can be linked to the processors 111 to 11m through signal lines and/or wireless paths. According to embodiments, the machine learning model 120 can be linked to the processors 111 to 11m through hardware paths and/or software paths.
The machine learning model 120 can be used to compare the to-be-evaluated information IE with the x types of extracted information I1 to Ix to find a most similar type of extracted information Ii from the extracted information I1 to Ix, where i can be an integer and 0<i≤x. The machine learning model 120 can select an optimal power profile PFk from the z power profiles PF1 to PFz according to the most similar type of extracted information Ii. The machine learning model 120 can apply the optimal power profile PFk to control power provided to a subset of the plurality of processors 111 to 11m for reducing and optimizing the power consumption. The subset of the plurality of processors 111 to 11m are of the same type of processor. For example, the subset of the plurality of processors 111 to 11m can be graphics processing units (GPUs). In another example, the subset of the plurality of processors 111 to 11m can be central processing units (CPUs).
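One simple way to picture the comparison and selection is a nearest-neighbor lookup, sketched below in Python. This is only an illustrative stand-in for the machine learning model 120: the Euclidean distance metric, the three-element feature vectors, the numeric values, and the names `EXTRACTED`, `INFO_TO_PROFILE`, `most_similar` and `select_profile` are assumptions, not details given in the embodiments.

```python
import math

# Hypothetical reference vectors for I1..I3:
# (temperature in deg C, current in A, bandwidth in GB/s).
EXTRACTED = {
    "I1": (70.0, 3.0, 12.0),
    "I2": (45.0, 1.0, 2.0),
    "I3": (55.0, 1.8, 6.0),
}

# Hypothetical assignment of a power profile to each type of extracted information.
INFO_TO_PROFILE = {"I1": "PF1", "I2": "PF3", "I3": "PF2"}


def most_similar(ie):
    """Return the name of the reference vector closest to IE (Euclidean distance)."""
    return min(EXTRACTED, key=lambda name: math.dist(ie, EXTRACTED[name]))


def select_profile(ie):
    """Return the name of the profile PFk assigned to the most similar Ii."""
    return INFO_TO_PROFILE[most_similar(ie)]
```

In this sketch the features are compared on their raw scales; a practical implementation would likely normalize them first so that no single measurement dominates the distance.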
As described in Table 1, the relationships among the power profiles PF1 to PFz, the x types of extracted information I1 to Ix, and the scenarios S1 to Sy can be shown as below.
After the optimal power profile PFk is applied, the power consumed by the subset of the plurality of processors 111 to 11m can be measured. Ideally, the power consumed by the plurality of processors 111 to 11m should decrease after the optimal power profile PFk is applied to optimize the power consumption. However, if the power consumed by the subset of the plurality of processors 111 to 11m exceeds a power level while the optimal power profile PFk is applied, this implies that none of the power profiles PF1 to PFz is able to reduce the power consumption. In that case, the machine learning model 120 can generate a new power profile (expressed as PFnew) and a corresponding new scenario (expressed as Snew) with corresponding new extracted information (expressed as Inew).
The new power profile PFnew can be used to control the power provided to the subset of the processors 111 to 11m. If the power consumption reduces with the new power profile PFnew, the machine learning model 120 can learn the new power profile PFnew and the corresponding new scenario Snew with the new extracted information Inew. If required, the new power profile PFnew can be adjusted to further improve the efficiency of power consumption.
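The fallback described above can be sketched in Python as follows. The embodiments do not state how PFnew is derived, so lowering a frequency cap by a fixed percentage is purely an assumed heuristic; the dictionary layout of `model` and the names `propose_new_profile` and `maybe_learn` are likewise hypothetical.

```python
def propose_new_profile(profile, scale_percent=90):
    """Derive a candidate PFnew by lowering the frequency cap (assumed heuristic)."""
    return {
        "name": "PFnew",
        "max_freq_mhz": profile["max_freq_mhz"] * scale_percent // 100,
    }


def maybe_learn(model, ie, old_power_w, new_power_w, new_profile):
    """Keep PFnew, Snew and Inew only if the measured power consumption dropped."""
    if new_power_w < old_power_w:
        model["profiles"][new_profile["name"]] = new_profile
        model["extracted"]["Inew"] = tuple(ie)
        model["scenarios"]["Snew"] = new_profile["name"]
        return True
    return False
```

Separating the proposal from the learning step mirrors the text: the new profile is applied first, and it is retained by the model only when the measurement confirms a reduction.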
In the power consumption reduction system 100, the processors 111 to 11m can include a central processing unit (CPU), a graphics processing unit (GPU), a tensor processing unit (TPU), and/or a neural network processing unit (NPU). According to requirements, the processors 111 to 11m can include other types of processing units.
The machine learning model 120 can be a machine learning engine implemented with a microcontroller. The machine learning engine(s) of the machine learning model 120 can be implemented in a hardware device (e.g. a specific circuit) and/or with software programs run on a microcontroller.
The x types of extracted information I1 to Ix collected from the processors 111 to 11m can be generated using a thermal detector, a current detector, a voltage detector, a bandwidth detector, a performance counter and/or an event interface.
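As a rough illustration, one sample of such information could be held in a small record and flattened into a numeric vector for the machine learning model 120. The field names, units and the `as_vector` helper in this Python sketch are assumptions chosen to match the listed sources; the embodiments do not define a data layout.

```python
from dataclasses import dataclass, astuple


@dataclass
class ExtractedInfo:
    temperature_c: float   # from a thermal detector
    current_a: float       # from a current detector
    voltage_v: float       # from a voltage detector
    bandwidth_gbps: float  # from a bandwidth detector
    perf_counter: int      # from a performance counter
    event_count: int       # from an event interface

    def as_vector(self):
        """Flatten the sample into a list of floats for a model to consume."""
        return [float(value) for value in astuple(self)]
```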
Step 310: define the y operation scenarios S1 to Sy according to the x types of extracted information I1 to Ix;
Step 320: generate the z power profiles PF1 to PFz each used for controlling power provided to a subset of the plurality of processors 111 to 11m, where the subset of the plurality of processors are of the same type of processor;
Step 330: assign the z power profiles PF1 to PFz to the y operation scenarios S1 to Sy in the machine learning model 120;
Step 340: collect the to-be-evaluated information IE by the plurality of processors 111 to 11m;
Step 350: use the machine learning model 120 to compare the to-be-evaluated information IE with the x types of extracted information I1 to Ix to find the most similar type of extracted information Ii;
Step 360: use the machine learning model 120 to select the optimal power profile PFk from the z power profiles PF1 to PFz according to the most similar type of extracted information Ii;
Step 370: apply the optimal power profile PFk to control the power provided to the subset of the plurality of processors 111 to 11m;
Step 380: determine if power consumed by the plurality of processors 111 to 11m exceeds a power level with the optimal power profile PFk applied; if so, enter Step 385; else, enter Step 340;
Step 385: generate the new power profile PFnew and the corresponding new scenario Snew with the corresponding new extracted information Inew; and
Step 390: update the machine learning model 120 with the new power profile PFnew and the new scenario Snew; enter Step 340.
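The run-time portion of the flow (Steps 340 to 390) can be sketched as a single Python function. This is a simplified illustration under several assumptions: the model is reduced to two dictionaries, similarity is measured with Euclidean distance, the new profile is represented only by a generated name, and the callables `collect`, `apply_profile` and `measure_power` are hypothetical hooks standing in for the detectors and power control hardware.

```python
import math


def run_once(model, collect, apply_profile, measure_power, power_level_w):
    """One pass through Steps 340 to 390; returns the profile name left in effect."""
    ie = collect()                                    # Step 340: collect IE
    ii = min(model["info"],                           # Step 350: most similar Ii
             key=lambda name: math.dist(ie, model["info"][name]))
    pfk = model["profile_of"][ii]                     # Step 360: select PFk
    apply_profile(pfk)                                # Step 370: apply PFk
    if measure_power() <= power_level_w:              # Step 380: within the level
        return pfk
    n = len(model["info"]) + 1                        # Step 385: name new entries
    inew, pfnew = f"Inew{n}", f"PFnew{n}"
    model["info"][inew] = tuple(ie)                   # Step 390: update the model
    model["profile_of"][inew] = pfnew
    apply_profile(pfnew)
    return pfnew
```

Calling `run_once` repeatedly corresponds to returning to Step 340 after either branch of Step 380.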
The machine learning model 120 can be generated in an off-line state and/or an on-line state, where the machine learning model 120 is trained using pre-collected extracted information. The power consumption reduction method 300 can be performed in the off-line state, where the data used can be pre-collected before performing the steps. The power consumption reduction method 300 can also be performed in a run-time state where the processors 111 to 11m are in operation, and the data used for training the machine learning model 120 can be collected in real time while performing the power consumption reduction method 300.
According to embodiments, some steps of the power consumption reduction method 300 can be performed in an off-line state, and other steps can be performed in an on-line state (a.k.a. run-time state). For example, Steps 310 to 330 of
According to the currently running application, the to-be-evaluated information IE can be dynamically updated in Step 340 and be evaluated in Step 350.
In Step 360, the machine learning model 120 can perform classification to select the optimal power profile PFk from the power profiles PF1 to PFz. Since there may be millions of applications, it is not practical to assign a specific power profile for each application. Hence, the machine learning model 120 can perform classification to select an appropriate power profile (e.g. PFk) from a limited number of power profiles (e.g. PF1 to PFz). The classification capability of the machine learning model 120 can be trained.
For example, the machine learning model 120 can be generated to have the z power profiles PF1 to PFz corresponding to the y operation scenarios S1 to Sy according to the x types of extracted information I1 to Ix. Then, in the run-time state where the processors 111 to 11m are in operation, the detectors, performance counters and event interfaces can be used to collect the to-be-evaluated information IE for the machine learning model 120 to select the optimal power profile PFk accordingly. Then, Step 370 and Step 380 can be performed to dynamically control the power provided to the subset of the processors 111 to 11m. As mentioned in Step 380 and Step 385, if the optimal power profile PFk is unable to reduce the power consumption, the new power profile PFnew can be generated and applied in real time in the run-time state. In Step 385 and Step 390, the machine learning process is performed since the machine learning model 120 can learn the new power profile PFnew and the corresponding new scenario Snew in addition to the z power profiles PF1 to PFz and the y operation scenarios S1 to Sy. After the new power profile PFnew is generated, the new power profile PFnew can be further adjusted and fine-tuned.
Each of the power profiles PF1 to PFz can correspond to performances of the subset of the processors 111 to 11m, and the power provided to the subset of the processors 111 to 11m.
In summary, through the power consumption reduction system 100 and the power consumption reduction method 300, the machine learning model 120 is trained to select an optimal power profile (e.g. PFk) and to generate a new power profile (e.g. PFnew) when necessary for different applications. Even if the overall workload is the same, different power profiles may be selected depending on the applications. For example, the operation frequency of the system can be reduced while the frame rate (in frames per second) is approximately maintained. Hence, the power consumption is reduced while the performance is maintained. As a result, the power consumption is optimized and effectively reduced.
Those skilled in the art will readily observe that numerous modifications and alterations of the device and method may be made while retaining the teachings of the invention. Accordingly, the above disclosure should be construed as limited only by the metes and bounds of the appended claims.
This application claims the benefit of U.S. Provisional Application No. 63/423,063, filed on Nov. 7, 2022. The content of the application is incorporated herein by reference.
Number | Date | Country
---|---|---
63423063 | Nov 2022 | US