The invention relates to programmable data processing devices and in particular to the control of program execution by such devices.
Computer hardware and programs can be optimized by the use of profiling. The term “profiling”, as used in the art, refers to the gathering of statistical data concerning hardware use or program execution, such as counts of the number of times that an instruction or block of instructions from a program is executed. The statistical data is called a “profile”. EP 1331565 describes a method of profiling execution of JAVA programs on a target machine and the use of the resulting profile to optimize the programs for subsequent use.
U.S. patent application No. 2006/75286 describes a method of “harvesting” profiles after computers have been provided to end users. This method involves locally generating profiles of hardware use on the computers, such as counts of the number of times the computers are switched on and off etc, and uploading the locally generated profiles to a central database. As described in this document, the harvested profiles can be used to realize improved designs of future computers, or to adapt the warranty of components like batteries etc. Unfortunately, the known profiling methods do not provide for improvement of programs after the programs have been supplied to end users other than by providing new releases of the programs. Known profiling is limited to pre-distribution improvement.
Among others, it is an object to provide for improved performance of computer programs after distribution to programmable devices of end users.
A method according to claim 1 is provided. Herein statistical data is gathered from program execution by a plurality of devices. The statistical data is uploaded from the plurality of devices to a common profiling apparatus where it is collected. The collected statistical data from a plurality of programmable devices is used to assign operating points to different execution states of the program. In an embodiment the operating points may define power supply voltages and/or clock frequencies of the programmable device that will be used in different states. The assignment of operating points may for example be performed centrally, in the common profiling apparatus and downloaded to the programmable devices, or in the programmable devices after downloading the collected data.
In an embodiment the data is collected after the programmable devices have been provided to different users, during executions of the program that are started and/or controlled by the user. In this way a broad range of characteristic operation conditions can be used to collect profile data. A plurality of programs that may be executed by the programmable devices may be handled in this way, collecting statistical data for specific different programs when they are executed at individual processing devices at different times and gathering the statistical data associated with the different programs.
These and other objects and advantageous aspects will become apparent from a description of exemplary embodiments.
Each programmable device 10 comprises a power supply circuit 100, a clock circuit 102 and a processing circuit 104 with a power supply input coupled to power supply circuit 100 and a clock input coupled to clock circuit 102, as well as control outputs coupled to control inputs of power supply circuit 100 and clock circuit 102. Optionally, each programmable device 10 comprises a user interface 106, such as a group of buttons, or a touch screen interface etc, coupled to processing circuit 104. It should be appreciated that this arrangement is shown merely by way of example. In practice, each programmable device 10 may use a plurality of power supply voltages simultaneously, as well as a plurality of clock signals. Furthermore programmable device 10 may comprise a plurality of processing circuits that are coupled to receive mutually different supply voltages and clock signals, or mutually different combinations of a plurality of supply voltages and clock signals.
The combination (V, f) of power supply voltages V and frequencies f supplied to the one or more components of programmable device 10 is referred to as an operating point of the programmable device 10. Both processing capacity and power consumption depend on the operating point. Lowering clock frequencies and/or power supply voltages reduces power consumption. Lowering power supply voltage reduces the maximum usable clock frequency. Lowering a clock frequency reduces the amount of computation that can be performed in a time interval of a given duration. Setting of the operating point involves a compromise between achieving sufficient processing speed to perform a required task in a specified time and minimizing power consumption.
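The trade-off described above can be illustrated with a minimal sketch. It assumes the commonly used approximation that switching power scales as C·V²·f; this formula, the capacitance value and the names `OperatingPoint`, `dynamic_power`, `execution_time` and `energy` are illustrative assumptions and are not part of the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """A (V, f) pair: supply voltage in volts, clock frequency in hertz."""
    voltage: float
    frequency: float


def dynamic_power(op: OperatingPoint, capacitance: float = 1e-9) -> float:
    """Approximate switching power in watts, assuming P = C * V^2 * f.

    Both the formula and the default capacitance are illustrative assumptions.
    """
    return capacitance * op.voltage ** 2 * op.frequency


def execution_time(op: OperatingPoint, cycles: int) -> float:
    """Seconds needed to run `cycles` clock cycles at this operating point."""
    return cycles / op.frequency


def energy(op: OperatingPoint, cycles: int, capacitance: float = 1e-9) -> float:
    """Energy in joules for `cycles` cycles: power multiplied by time."""
    return dynamic_power(op, capacitance) * execution_time(op, cycles)


# Two hypothetical operating points for the same one-million-cycle task.
fast = OperatingPoint(voltage=1.2, frequency=200e6)
slow = OperatingPoint(voltage=0.9, frequency=100e6)
cycles = 1_000_000

for name, op in (("fast", fast), ("slow", slow)):
    print(name, execution_time(op, cycles), energy(op, cycles))
```

Under these assumptions the slower point takes twice as long but uses less energy for the same task, which is the compromise that the choice of operating point has to resolve.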
In operation new programs are distributed from program supply system 16 to programmable devices 10. This may be done via network 12 or via other routes. During execution of the programs programmable devices 10 vary the operating point in order to reduce power consumption. Execution involves successive transitions to a plurality of execution states. As used herein, an execution state may be characterized by a block of instructions in the program that is executed, parameter values of data parameters supplied for use in that block and optionally any other settings of the programmable device 10 that affect operation. Also a history of previously executed blocks of instructions may be part of the state.
In order to minimize power consumption, the operating point of the programmable device 10 is set according to the execution state of the programmable device. To select the operating point as a function of the execution state, a programmable device 10 needs to have information that relates states to operating points. The operating points that follow from the information should minimize power consumption, while ensuring that the program performs tasks within time intervals of predetermined duration. For refined power consumption control such information is needed for many states, or even for a quasi continuum of states when the definition of the state involves one or more quasi-continuous parameters. A considerable amount of information is needed to realize optimal control. This information is gathered using a plurality of programmable devices 10.
In this embodiment, the processing circuit 104 controls its own operation. Alternatively, a separate control circuit may be provided, with a memory containing information that links operating points to states, with an input coupled to the processing circuit 104 for monitoring the state of the processing circuit 104, and with outputs coupled to the clock circuit 102 and the power supply circuit 100 for controlling the operating point dependent on the detected state of the processing circuit 104 and the link defined in the memory.
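The information that links states to operating points can be pictured as a lookup table keyed by execution state. The following is a minimal sketch only: the field names of `ExecutionState`, the example states, the voltage and frequency values, and the choice to fall back to the fastest point for unknown states are all assumptions made for illustration.

```python
from typing import Dict, NamedTuple, Tuple


class ExecutionState(NamedTuple):
    """Hypothetical key describing an execution state."""
    block_id: str                   # block of instructions being executed
    params: Tuple[int, ...] = ()    # data parameter values supplied to the block
    history: Tuple[str, ...] = ()   # previously executed blocks, if tracked


OperatingPoint = Tuple[float, float]  # (voltage in volts, frequency in hertz)

# Assumed fallback: the fastest point, so that timing holds for unknown states.
DEFAULT_POINT: OperatingPoint = (1.2, 200e6)


def select_operating_point(table: Dict[ExecutionState, OperatingPoint],
                           state: ExecutionState) -> OperatingPoint:
    """Return the operating point linked to `state`, or the fallback."""
    return table.get(state, DEFAULT_POINT)


# Example table with made-up states and values.
table = {
    ExecutionState("decode_frame", params=(480,)): (0.9, 100e6),
    ExecutionState("idle_wait"): (0.8, 50e6),
}

print(select_operating_point(table, ExecutionState("idle_wait")))
print(select_operating_point(table, ExecutionState("unknown_block")))
```

A table of this kind only covers discrete states; for quasi-continuous parameters a fitted relation would be needed instead, which this sketch does not attempt.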
At any time when a program is executed a programmable device 10 may gather profile data of the execution of that program. This may be done for all executions, or merely on a sample basis, for part of the executions. Gathering of data may be performed under software control, for example by means of commands in the executed program itself. Alternatively, programmable device 10 may contain an application program interface that collects data when it is called by the program, or by an operating system that transfers control to parts of the program from time to time. The gathered profile data is statistical data, in the sense that it comprises counts of events that have occurred during program execution and/or statistics (averages, variances, and/or histograms etc.) of values detected during execution.
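By way of illustration, gathering under software control might look like the sketch below. The class `StateProfiler`, its methods and the state names are hypothetical; the sketch only shows counts and simple statistics being accumulated per state, with an optional sampling rate so that only part of the executions is profiled.

```python
import random
from collections import defaultdict
from statistics import mean, pvariance


class StateProfiler:
    """Hypothetical on-device collector of per-state counts and statistics."""

    def __init__(self, sample_rate: float = 1.0, seed: int = 0):
        self.sample_rate = sample_rate      # fraction of executions to profile
        self._rng = random.Random(seed)
        self.visits = defaultdict(int)      # state -> number of visits
        self.cycles = defaultdict(list)     # state -> observed cycle counts

    def start_execution(self) -> bool:
        """Decide whether this execution of the program is profiled."""
        return self._rng.random() < self.sample_rate

    def record(self, state: str, cycles: int) -> None:
        """Note one visit to `state` that took `cycles` instruction cycles."""
        self.visits[state] += 1
        self.cycles[state].append(cycles)

    def summary(self) -> dict:
        """Return count, mean, variance and maximum of cycles per state."""
        return {
            state: {
                "count": self.visits[state],
                "mean_cycles": mean(values),
                "var_cycles": pvariance(values),
                "max_cycles": max(values),
            }
            for state, values in self.cycles.items()
        }


profiler = StateProfiler(sample_rate=1.0)
if profiler.start_execution():
    profiler.record("decode_frame", 100)
    profiler.record("decode_frame", 140)
    profiler.record("idle_wait", 10)
print(profiler.summary())
```

The calls to `record` stand in for commands placed in the executed program or for an application program interface invoked by it.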
In a second step 22 programmable devices 10 send records containing identifications of the program and the relevant states and profile data for the program in that state to profiling apparatus 14. Different ones of programmable devices 10 need not execute this step simultaneously. The information about execution of a program may be sent immediately upon execution of the program, or data of a plurality of instances of execution may be gathered before transmission. The profile data may include an indication that the state has been reached, or a count of how many times the state was reached in a specified time interval, or the number of instruction cycles needed to complete execution of a block of instructions in the state, or a successor state selected after leaving the state, or a history of prior states etc.
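One possible shape for such records is sketched below. The description does not prescribe a record layout or an encoding; the `ProfileRecord` fields and the use of JSON are assumptions chosen only to make the example concrete, and the actual transmission to the profiling apparatus is left out.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class ProfileRecord:
    """Hypothetical record sent from a device to the profiling apparatus."""
    program_id: str                                  # identifies the program
    state: str                                       # identifies the state
    visit_count: int                                 # times the state was reached
    cycles: List[int] = field(default_factory=list)  # cycles per visit
    successors: Dict[str, int] = field(default_factory=dict)  # next state -> count


def encode_batch(records: List[ProfileRecord]) -> str:
    """Serialize several records gathered before transmission."""
    return json.dumps([asdict(r) for r in records], sort_keys=True)


def decode_batch(payload: str) -> List[ProfileRecord]:
    """Rebuild the records on the receiving side."""
    return [ProfileRecord(**item) for item in json.loads(payload)]


records = [
    ProfileRecord("player-1.0", "decode_frame", 2, [100, 140], {"render": 2}),
    ProfileRecord("player-1.0", "idle_wait", 1, [10], {"decode_frame": 1}),
]
payload = encode_batch(records)
print(payload)
```

Batching several records into one payload corresponds to gathering data of a plurality of executions before transmission; sending a single-record batch corresponds to immediate transmission.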
In a third step 23 profiling apparatus 14 collects the profile data from a plurality of programmable devices 10 for a program and its states. Collection may comprise collecting statistics for different states, such as the frequency with which the state is visited, the average or maximum number of instruction cycles before leaving the state, the probabilities of subsequent transitions to different states etc. Collection may also involve estimations of relations (e.g. coefficients of linear relations) between such quantities and quasi-continuous parameters whose values distinguish different states.
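The collection step can be sketched as a merge of records from several devices into per-state statistics. The record keys used here (`state`, `visit_count`, `cycles`, `successors`) and the function name `collect` are hypothetical, and the sample data is made up for the example.

```python
from collections import Counter, defaultdict


def collect(records):
    """Merge records from many devices into per-state statistics.

    Each record is a dict with the assumed keys 'state', 'visit_count',
    'cycles' (list of cycle counts) and 'successors' (next state -> count).
    """
    visits = Counter()
    cycles = defaultdict(list)
    transitions = defaultdict(Counter)
    for r in records:
        visits[r["state"]] += r["visit_count"]
        cycles[r["state"]].extend(r["cycles"])
        transitions[r["state"]].update(r["successors"])

    total = sum(visits.values())
    stats = {}
    for state in visits:
        observed = cycles[state]
        leaving = sum(transitions[state].values())
        stats[state] = {
            "visit_frequency": visits[state] / total if total else 0.0,
            "mean_cycles": sum(observed) / len(observed) if observed else None,
            "max_cycles": max(observed, default=None),
            "transition_prob": {nxt: n / leaving
                                for nxt, n in transitions[state].items()},
        }
    return stats


# Made-up records from two devices running the same program.
device_a = [{"state": "decode", "visit_count": 3,
             "cycles": [100, 120, 110], "successors": {"render": 3}}]
device_b = [{"state": "decode", "visit_count": 1,
             "cycles": [150], "successors": {"idle": 1}},
            {"state": "idle", "visit_count": 4,
             "cycles": [10, 10, 10, 10], "successors": {"decode": 4}}]

stats = collect(device_a + device_b)
print(stats["decode"])
```

The sketch covers visit frequencies, average and maximum cycle counts and transition probabilities; estimating relations with quasi-continuous parameters is not shown.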
In a fourth step 24 profiling apparatus 14 computes a relation between states and operating points for the program from the collected information. Methods of selecting operating points from profile data are known per se and will therefore not be described in detail. In an exemplary embodiment, operating point selection involves an optimization criterion (expected power consumption) and constraints such as a maximum time duration needed to reach a first state from a second state. A set of operating points for a set of states is selected that optimizes the value of the optimization criterion (minimizes expected power consumption) while satisfying the constraints. Herein the value of the optimization criterion and/or constrained values depend on the set of operating points in a way determined by the profile data, e.g. through the frequency at which a state is visited or the average durations of the time intervals during which the programmable device 10 remains in respective states, or the frequency with which a sequence of states occurs etc.
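Purely as an illustration of such a constrained selection, and not as the method used by the profiling apparatus, the sketch below searches exhaustively over a small set of candidate operating points. It assumes that the states lie on one path that must complete within a deadline, that energy per visit scales as C·V²·cycles, and that the candidate list, the capacitance and the function name `assign_operating_points` are hypothetical.

```python
from itertools import product

# Hypothetical candidate (voltage in volts, frequency in hertz) pairs.
CANDIDATES = [(0.8, 50e6), (0.9, 100e6), (1.2, 200e6)]


def assign_operating_points(states, deadline, candidates=CANDIDATES,
                            capacitance=1e-9):
    """Pick one operating point per state by exhaustive search.

    `states` maps a state name to (visit_frequency, max_cycles), both taken
    from collected profile data. The constraint is that the worst-case time
    to pass through all states stays within `deadline` seconds. The criterion
    is expected energy, with energy per visit assumed to be C * V^2 * cycles.
    Returns a dict state -> (V, f), or None if no assignment is feasible.
    """
    names = list(states)
    best, best_energy = None, float("inf")
    for choice in product(candidates, repeat=len(names)):
        total_time = sum(states[n][1] / f for n, (_, f) in zip(names, choice))
        if total_time > deadline:
            continue  # violates the timing constraint
        expected = sum(states[n][0] * capacitance * v * v * states[n][1]
                       for n, (v, _) in zip(names, choice))
        if expected < best_energy:
            best, best_energy = dict(zip(names, choice)), expected
    return best


profile = {"decode": (0.5, 1_000_000), "idle": (0.5, 100_000)}
print(assign_operating_points(profile, deadline=0.0125))
print(assign_operating_points(profile, deadline=0.001))
```

Exhaustive search grows exponentially with the number of states, so it is only practical for small examples; it is used here because it makes the criterion and the constraint easy to see.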
In a fifth step 25 profiling apparatus 14 transmits the computed relation between states and operating points for the program back to programmable devices 10. Subsequently, in a sixth step 26 programmable devices 10 set their operating points dependent on their state according to the transmitted relation. Different ones of programmable devices 10 need not execute this step simultaneously. Programmable devices 10 may set the operating points for example by executing corresponding instructions to output control signals to control power supply circuit 100 and clock circuit 102. After this the process may repeat from first step 21 to realize further improvements.
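On the device side, applying the transmitted relation might be sketched as follows. The class `OperatingPointController` and its callbacks are hypothetical stand-ins for the instructions that output control signals. The ordering of the two control actions is an assumption of this sketch, motivated by the fact that a lower supply voltage reduces the maximum usable clock frequency: the voltage is raised before the frequency, and the frequency is lowered before the voltage.

```python
class OperatingPointController:
    """Hypothetical controller that applies a state-to-operating-point relation."""

    def __init__(self, set_voltage, set_frequency, initial):
        self._set_voltage = set_voltage      # callback driving the supply circuit
        self._set_frequency = set_frequency  # callback driving the clock circuit
        self.current = initial               # (voltage, frequency) now in effect
        self.relation = {}

    def load_relation(self, relation):
        """Store the relation received from the profiling apparatus."""
        self.relation = dict(relation)

    def enter_state(self, state):
        """Switch to the operating point linked to `state`, if there is one."""
        target = self.relation.get(state, self.current)
        if target == self.current:
            return
        voltage, frequency = target
        if frequency > self.current[1]:
            # Speeding up: make sure the voltage supports the new frequency.
            self._set_voltage(voltage)
            self._set_frequency(frequency)
        else:
            # Slowing down: reduce the frequency before dropping the voltage.
            self._set_frequency(frequency)
            self._set_voltage(voltage)
        self.current = target


log = []
controller = OperatingPointController(
    set_voltage=lambda v: log.append(("V", v)),
    set_frequency=lambda f: log.append(("f", f)),
    initial=(1.2, 200e6),
)
controller.load_relation({"idle": (0.8, 50e6), "decode": (0.9, 100e6)})
for state in ("idle", "decode", "unlisted"):
    controller.enter_state(state)
print(log)
```

In this sketch a state that is absent from the relation leaves the operating point unchanged; a real device might instead fall back to a safe default.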
As will be appreciated, the effect of using a plurality of programmable devices 10 to gather profile data associated with programs is that data about all states, or nearly all states, becomes available much sooner than when profile data is collected from only one programmable device 10. Furthermore it is made possible to optimize power consumption for programs that become available after programmable devices 10 have been manufactured. It should be appreciated that the flow chart of
Other variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed invention, from a study of the drawings, the disclosure, and the appended claims. In the claims, the word “comprising” does not exclude other elements or steps, and the indefinite article “a” or “an” does not exclude a plurality. A single processor or other unit may fulfill the functions of several items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage. A computer program may be stored/distributed on a suitable medium, such as an optical storage medium or a solid-state medium supplied together with or as part of other hardware, but may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems. Any reference signs in the claims should not be construed as limiting the scope.
Number | Date | Country | Kind |
---|---|---|---|
07114385 | Aug 2007 | EP | regional |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IB2008/053247 | 8/13/2008 | WO | 00 | 2/12/2010 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2009/022302 | 2/19/2009 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
6519766 | Barritz et al. | Feb 2003 | B1 |
6735758 | Berry et al. | May 2004 | B1 |
6862696 | Voas et al. | Mar 2005 | B1 |
7735073 | Kosche et al. | Jun 2010 | B1 |
7788644 | Koduru et al. | Aug 2010 | B2 |
7827543 | Kosche et al. | Nov 2010 | B1 |
8032875 | Kosche et al. | Oct 2011 | B2 |
8065665 | Kosche et al. | Nov 2011 | B1 |
8166462 | Kosche et al. | Apr 2012 | B2 |
8176475 | Kosche et al. | May 2012 | B2 |
8230059 | Santos et al. | Jul 2012 | B1 |
8640114 | Kosche et al. | Jan 2014 | B2 |
20020100025 | Buechner et al. | Jul 2002 | A1 |
20040210877 | Sluiman et al. | Oct 2004 | A1 |
20050125784 | Yang et al. | Jun 2005 | A1 |
20060075286 | Hodge et al. | Apr 2006 | A1 |
20070032992 | Trowbridge et al. | Feb 2007 | A1 |
20070079353 | Boortz | Apr 2007 | A1 |
20080109796 | Kosche et al. | May 2008 | A1 |
20080127149 | Kosche et al. | May 2008 | A1 |
20110138366 | Wintergerst et al. | Jun 2011 | A1 |
Number | Date | Country |
---|---|---|
1331565 | Jul 2003 | EP |
2404264 | Jan 2005 | GB |
2 448 952 | Nov 2008 | GB |
9415858 | Jul 1994 | WO |
Entry |
---|
C.B. Lirakis and K.P. Bongiovanni; Automated Multibeam Data Cleaning and Target Detection; 2000; IEEE; retrieved online on Apr. 4, 2014; pp. 719-723; Retrieved from the Internet: <URL:http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=881336>. |
Priya Nagpurkar et al.; Efficient Remote Profiling for Resource-Constrained Devices; Mar. 2006; ACM; retrieved online on Apr. 4, 2014; pp. 35-66; Retrieved from the Internet: <URL:http://delivery.acm.org/10.1145/1140000/1132465/p35-nagpurkar.pdf?>. |
Leonard Hart et al.; The Challenge in Balancing Data Collection Innovations, Remaining Practical, and Being Cost-Effective; Apr. 2012; Mathematica Policy Research Corp.; retrieved online on Apr. 4, 2014; pp. 3-13; Retrieved from the Internet: <URL: http://www.blaiseusers.org/2012/papers/01a.pdf>. |
European Search Report for EP Patent Appln. No. EP 10250113.7 (May 3, 2010). |
Kotla, Ramakrishna, et al; Scheduling Processor Voltage and Frequency in Server and Cluster Systems; Parallel and Distributed Processing Symposium, 2005, Denver, CO, US; IEEE; Piscataway, NJ, US; Apr. 4, 2005; p. 1-8. |
Number | Date | Country | |
---|---|---|---|
20110214022 A1 | Sep 2011 | US |