This application claims benefit under 35 U.S.C. §119(a) and 37 CFR 1.55 to UK patent application no. 1020662.1, filed on Dec. 7, 2010, the entire content of which is hereby incorporated by reference.
The present invention relates to monitoring processes in a computer. An embodiment of the invention relates to controlling the power consumption of a computer.
A computer, for example a server, may have a specific role allocated to it for providing a specific service to users. Processes running on the computer which provide that service are regarded as productive. Other processes such as defragmentation or virus checking, whilst important, do not directly provide a service to a user and are regarded as non-productive. It is desirable to distinguish productive processes from non-productive processes. Whilst processes have names and one might think that productive and non-productive processes may be distinguished from the names, in fact names are arbitrary. For example “Minesweeper” is the name of a well known computer game but any other process could have the name “Minesweeper”. Furthermore, a computer only carries out the instructions in a process and has no “knowledge” of the purpose of the process. In a large computer system there may be hundreds or more computers and records of what is running on them may be inaccurate. Manually reviewing all processes on all the computers is impractical in large systems.
It is desirable to automatically distinguish between productive and non-productive processes running on a computer to enable better administration and/or control of the computer and/or better administration of a computer system.
In accordance with one aspect of the present invention, there is provided a method comprising running on the computer a monitoring program which identifies a process running on the computer, and, for the identified process, determines whether or not one or more predetermined characteristics of the process comply with respective reference characteristics thereby to automatically distinguish whether the process is likely to be a productive process or a non-productive process.
In accordance with another aspect of the present invention, there is provided a method comprising running on the computer a monitoring program which identifies a process running on the computer, and, for the identified process, detecting the time pattern of running of the identified process and the resources it uses when running thereby to automatically determine whether the process is likely to be a productive process or a non-productive process.
Further features and advantages of the invention will become apparent from the following illustrative description of embodiments of the invention, given by way of example only, which is made with reference to the accompanying drawings.
An illustrative example of the present invention comprises a network of computers: see
The data sets may be used for power control of the computers based on “net useful work”, which is a value of activity of the computer excluding the contributions to that activity of the excluded processes and network connections: see FIGS. 7 to 10. The method of power control of
Overview of an example of a system in accordance with the invention:
Referring to
Referring to
It should be noted that the present invention may be implemented in software and/or in a combination of software and hardware, e.g., using application specific integrated circuits (ASIC), a general purpose computer or any other hardware equivalents. In one embodiment, the various processes can be loaded into memory 240 and executed by processor 222 to implement the functions herein. As such the processes (including associated data structures) of the present invention can be stored on a computer readable medium or carrier, e.g., RAM memory, magnetic or optical drive or diskette, and the like.
The administrator's workstation 6 interacts with the database 8. The webservice 62 interacts with the database and the computers 2n. The database may itself comprise a server 81 having a data storage device 82. The database 8 and the workstation 6 together form a monitoring system 68.
In this example of the invention, as illustrated in
As illustrated by
The database processes the data to store it in an organised way.
In one embodiment, raw data produced by the monitoring program A of a computer 2n is analysed in computer 2n as discussed below with reference to
Consider one of the computers 2n. The illustrative program of
In the illustrative procedure described now with reference to
The certainty value Vt is the aggregate of values V0 to Vp associated with P+1 tests. Each test is indicative of whether a process running on the computer has a characteristic indicative of a productive process or of an unproductive process.
In the illustrative example of
Referring by way of example to
In step S4, the monitoring program finds from the operating system the name of a process which is running on the computer 2n, and compares the name with the downloaded list of processes not to be used in the program of
Step S10 determines if the time and resource data of the process has been recorded on a previous occasion by comparing the name of the process with name data previously stored in step S8. If the process has not been recorded previously, at step S20 a certainty value is set to a predetermined initial amount V0, which may for example be zero but could be another number. The certainty value is stored. Also an iteration number In is set to I1 for the first iteration of the procedure of
If the process has been recorded before there will be a certainty value Vt associated with the process from the one or more previous occurrences of the procedure of
Steps S10 to S20 determine the time pattern of running of each process and the time pattern of resource usage of each process. In this example, if a process is found to run at the same time (within a predetermined tolerance) as on a previous occurrence and uses the same resources as on the previous occurrence (ignoring the issue of whether the same result was found on a previous iteration), the certainty value is incremented, indicating an increased certainty that the process is a non-productive process, for example virus checking or defragmentation, which as a matter of policy runs at regular times and tends always to use the same resources. Conversely, if the certainty value decreases, that indicates a greater certainty that the process is a productive process serving a user, because experience shows that user-related services occur at times which are more random and use resources more randomly than non-productive processes.
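By way of illustration only, the comparison of a current occurrence of a process with a previous occurrence might be sketched as follows. The record layout (`Observation`), the use of the minute of the day as the time measure, and the default tolerance of five minutes are assumptions made for this sketch and are not taken from the described embodiment.

```python
from dataclasses import dataclass

MINUTES_PER_DAY = 24 * 60


@dataclass(frozen=True)
class Observation:
    """One recorded occurrence of a process (hypothetical record layout)."""
    start_minute: int      # minute of the day at which the process was seen running
    resources: frozenset   # names of the resources the process used


def same_pattern(previous, current, tolerance_minutes=5):
    """Return True if the process ran at the same time of day, within the
    tolerance, and used the same resources as on the previous occurrence."""
    gap = abs(previous.start_minute - current.start_minute) % MINUTES_PER_DAY
    gap = min(gap, MINUTES_PER_DAY - gap)  # 23:58 and 00:02 count as 4 minutes apart
    return gap <= tolerance_minutes and previous.resources == current.resources
```

A True result corresponds to the regular behaviour that the description associates with a non-productive process; a False result corresponds to the more random behaviour associated with a productive process.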
It will be appreciated that the time data may be processed separately from the resource data and each used to provide separate increments (or decrements) to the certainty value. Resource usage is recorded together with the time of usage of the resources. The method then proceeds to steps S22 to S46 where it is determined if one or more other characteristics of the monitored process comply with one or more predetermined reference characteristics. For the, or each, characteristic (again ignoring the issue of whether the same result was found on a previous iteration), the certainty value is incremented or decremented. Steps S22 to S46 may be carried out as an alternative to steps S8 to S20 or as shown in
Referring to step S14, if the process has been seen before on an iteration In−1, step S14 tests whether the timing and resource data of iteration In match those of iteration In−1. If the answer is No (indicating a productive process) and if the result of the test of step S14 at iteration In differs from the result at iteration In−1, then the certainty value Vt is decremented by 1 at step S18 and the result of the test is stored with the iteration number. If the answer is Yes (indicating a non-productive process), and if the result of the test of step S14 at iteration In differs from the result at iteration In−1, then the certainty value Vt is incremented by 1 at step S16 and the result of the test is stored with the iteration number. If the result of the test of step S14 at iteration In is the same as the result at iteration In−1, then the certainty value Vt is unchanged, but the result of the test is stored with the iteration number.
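The rule that the certainty value changes only when the result of a test differs from the result stored for the preceding iteration might be sketched as follows. The names `ProcessRecord` and `apply_test` are hypothetical. The sketch also assumes that a test with no stored previous result is treated as a changed result; the description does not state how that case is handled.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessRecord:
    """Stored state for one monitored process (hypothetical layout)."""
    certainty: int = 0                                  # Vt, starting from V0 = 0
    last_results: dict = field(default_factory=dict)    # test name -> previous result


def apply_test(record, test_name, suggests_non_productive, step=1):
    """Update Vt only if this result differs from the stored previous result.

    Assumption: a test with no stored result counts as a changed result.
    """
    previous = record.last_results.get(test_name)
    if previous is None or previous != suggests_non_productive:
        # Increment towards "non-productive", decrement towards "productive".
        record.certainty += step if suggests_non_productive else -step
    record.last_results[test_name] = suggests_non_productive  # store for next iteration
    return record.certainty
```

For example, two successive results of a test that both suggest a non-productive process raise Vt once only, and a later opposite result lowers it again.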
Step S22 determines if the monitored process receives input from for example the keyboard 10 or pointing device 12 or from another human interface device or another source of input. A PC or laptop typically receives direct input via a keyboard or pointing device or other human interface. A server may receive input indirectly from a client. If the answer is No (indicating an unproductive process) and if the result of the test of step S22 at iteration In differs from the result at iteration In−1, then the certainty value Vt is incremented by 1 at step S26. If the answer is Yes (indicating a productive process), and if the result of the test of step S22 at iteration In differs from the result at iteration In−1, then the certainty value Vt is decremented by 1 at step S24. If the result of the test of step S22 at iteration In is the same as the result at iteration In−1, then the certainty value Vt is unchanged. The result of the test S22 is stored with the iteration number.
Step S28 determines if the monitored process connects to a known IP address, i.e. one of the IP addresses of the list downloaded in step S2. If the process connects only to the known IP address, then that is indicative of a non-productive process. If the answer is No (indicating a productive process) and if the result of the test of step S28 at iteration In differs from the result at iteration In−1, then the certainty value Vt is incremented by 1 at step S32. If the answer is Yes (indicating a non-productive process), and if the result of the test of step S28 at iteration In differs from the result at iteration In−1, then the certainty value Vt is decremented by 1 at step S30. If the result of the test of step S28 at iteration In is the same as the result at iteration In−1, then the certainty value Vt is unchanged. The result of the test S28 is stored with the iteration number.
Step S34 determines if the process is a “service”, a Windows term describing a process that generally runs without the user being aware and which, for present purposes, is regarded as unproductive. An example would be the process which keeps the clock in sync by periodically making a network connection to a trusted clock. The operating system indicates if the process is classified as a service. In Unix operating systems these types of process are called daemons.
If the answer is No (indicating a productive process) and if the result of the test of step S34 at iteration In differs from the result at iteration In−1, then the certainty value Vt is decremented by 1 at step S38. If the answer is Yes (indicating a non-productive process), and if the result of the test of step S34 at iteration In differs from the result at iteration In−1, then the certainty value Vt is incremented by 1 at step S36. If the result of the test of step S34 at iteration In is the same as the result at iteration In−1, then the certainty value Vt is unchanged. The result of the test S34 is stored with the iteration number.
Step S40 determines if the monitored process is running in user context.
If the answer is No (indicating an unproductive process) and if the result of the test of step S40 at iteration In differs from the result at iteration In−1, then the certainty value Vt is incremented by 1 at step S44. If the answer is Yes (indicating a productive process), and if the result of the test of step S40 at iteration In differs from the result at iteration In−1, then the certainty value Vt is decremented by 1 at step S42. If the result of the test of step S40 at iteration In is the same as the result at iteration In−1, then the certainty value Vt is unchanged. The result of the test S40 is stored with the iteration number.
In step S46 the aggregate certainty value Vt is stored in the data store with the name of the monitored process and other data gathered in step S8.
The monitoring process then proceeds to monitor another process at step S6.
In the description above, Vt is incremented or decremented by 1 on all tests. However, the tests may be associated with different increments and decrements as follows.
The values V1 to V10 may be the same. Alternatively they may be different allowing different ones of the tests S14, S22, S28, S34 and S40 to have different weightings and to allow different weightings for different outcomes of the criteria; e.g. an increment may be different from a decrement on a particular test.
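One possible way of holding different weightings for the five tests and for the two outcomes of each test is sketched below. The numerical weights are invented for illustration and the names `WEIGHTS`, `weighted_change` and `aggregate` are hypothetical. As in the simplified description above, the sketch ignores the issue of whether the same result was found on a previous iteration.

```python
# Hypothetical weights: test step -> (increment applied when the outcome suggests
# a non-productive process, decrement applied when it suggests a productive one).
# Five tests with two outcomes each give ten values, corresponding to V1 to V10.
WEIGHTS = {
    "S14": (3, 2),
    "S22": (2, 2),
    "S28": (1, 1),
    "S34": (2, 1),
    "S40": (1, 2),
}


def weighted_change(test, suggests_non_productive, weights=WEIGHTS):
    """Return the signed change to Vt for one test outcome."""
    increment, decrement = weights[test]
    return increment if suggests_non_productive else -decrement


def aggregate(outcomes, initial=0, weights=WEIGHTS):
    """Sum the signed changes for a mapping of test step -> outcome."""
    return initial + sum(weighted_change(t, o, weights) for t, o in outcomes.items())
```

Setting every pair to (1, 1) reproduces the unit increments and decrements of the preceding description.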
The aggregate value Vt is compared in step S47 (
The two threshold values may be made equal or replaced by a single threshold.
Network Connections
The program of
The monitoring procedure described above produces a data set, or list, of non-productive processes. The data set may be used for the control of power consumption of the monitored computer. The data set may be used for the control of power consumption of other computers. As described in the following example of power control, a value of net useful work is calculated for the purpose of power control. The net value is the value of activity excluding contributions to that value of predetermined processes referred to as “excluded processes”. The excluded processes are the non-productive processes identified in a data set. The monitoring procedure described above with reference to
The illustrative methods use net values of activity. In the method of
Referring to
The method of
In step S75, in each time slot t the method checks the net values of: CPU activity; I/O data amount; and number of TCP/IP connections. Each net value has an associated threshold value. The net values are compared with the thresholds in step S75.
If any one or more of the net values exceeds its associated threshold, indicating net useful work, the first timer is reset in step S77 and step S78 determines if the computer is in the full power state. If it is in the full power state no further action is required and the first timer starts to time its period P at step S72 and the method continues to measure the net values and compare them with the thresholds in the next time slot t. If the computer is in the low power state it is forced into the full power state and the first timer starts to time its period P at step S72 and the method continues to measure the net values and compare them with the thresholds in the next time slot t.
If over the whole period P step S75 does not detect any net useful work then the first timer is not reset and at the end of the period P, in step S74, the computer adopts the low power state and the first timer is stopped. The second timer continues at step S73 and step S75 continues until net useful activity is detected and then the first timer is reset to time period P.
In this example, useful activity is in two categories: the activities sampled in step S75 which are sampled once per minute and thus do not have immediate effect on the power state of the computer; and other activities which immediately cause the computer to adopt the full power state if it was in the low power state as indicated by step S80. In this example there is only one such other activity which is a logon S80 by a user. In alternative embodiments, other events may cause the computer to adopt the high power state S79 or reset S77 the first timer. Such events may be a network event, a process appearing, an activity metric associated with a particular process crossing a threshold value, a service starting or an operating system event occurring. In an embodiment, the user or designer may specify one or more events which cause the computer to adopt the high power state or reset the first timer.
Logon may be included in the activities sampled in step S75. Any of the activities of step S75 may be subject to step S73.
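The interaction of the first timer, the time slots and the thresholds described above might be sketched as the following small state machine. The class name `PowerController`, the metric names and the expression of the period P as a whole number of time slots are assumptions made for the sketch; the step numbers in the comments refer to the steps of the description.

```python
from dataclasses import dataclass


@dataclass
class PowerController:
    period_slots: int      # period P expressed as a number of time slots t (assumption)
    thresholds: dict       # metric name -> threshold for its net value
    state: str = "full"    # "full" or "low"
    idle_slots: int = 0    # slots elapsed since net useful work was last seen

    def on_slot(self, net_values):
        """Compare the net values sampled in one time slot with their thresholds (S75)."""
        useful = any(net_values.get(name, 0) > limit
                     for name, limit in self.thresholds.items())
        if useful:
            self.idle_slots = 0        # reset the first timer (S77)
            self.state = "full"        # force full power if not already in it
        elif self.state == "full":
            self.idle_slots += 1
            if self.idle_slots >= self.period_slots:
                self.state = "low"     # period P expired with no useful work (S74)
        return self.state

    def on_logon(self):
        """A logon takes effect immediately rather than at the next slot (S80)."""
        self.idle_slots = 0
        self.state = "full"
        return self.state
```

With a period of three slots, three consecutive slots without net useful work put the controller into the low power state, and a single slot with a net value above its threshold, or a logon, returns it to the full power state.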
Determine net useful CPU activity:
Net useful CPU activity is measured as shown in
In step S60, the total value of CPU activity is determined at the time of a time slot t and the total value is stored. The total value includes for example contributions from all processes running on the computer at the time of measurement plus activity attributable to the kernel of the operating system.
In steps S62 to S68, the contributions to the total value from all the excluded processes running at the time of measurement of the total are determined and subtracted from the total value to produce a net value. In this example that is done by selecting a process in step S62 from the data set of excluded processes, determining the activity value attributable to that excluded process in step S64, storing the activity value in an accumulator in step S66 and then at steps S68 and S62 selecting the next process and adding its activity value to the value stored in the accumulator in step S66. Once all the processes have been selected the value accumulated in step S66 is subtracted in step S69 from the total stored in step S60 to give the net value.
It will be appreciated that there are other methods of determining net useful CPU activity. For example the activity values of the excluded processes may be subtracted one at a time from the total value of CPU activity instead of accumulating all the activity levels and then subtracting the accumulated values from the total CPU activity value.
The total activity of the CPU as measured in the time slot t and the activity values of the excluded processes are derived from the operating system in known manner using performance counters.
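The accumulate-then-subtract calculation of steps S60 to S69 might be sketched as follows. The function name `net_cpu_activity` and the process names are hypothetical, and the activity values are supplied as plain numbers rather than read from operating system performance counters.

```python
def net_cpu_activity(total, per_process, excluded):
    """Return total CPU activity less the contributions of excluded processes.

    total       -- total CPU activity measured in the time slot (S60)
    per_process -- mapping of process name -> activity value for running processes
    excluded    -- names in the data set of excluded (non-productive) processes
    """
    accumulated = 0.0
    for name in excluded:                            # select each excluded process (S62, S68)
        accumulated += per_process.get(name, 0.0)    # add its activity, 0 if not running (S64, S66)
    return total - accumulated                       # subtract the accumulated value (S69)
```

For instance, with a total of 60.0, a hypothetical excluded process `defrag` contributing 25.0 and an excluded process that is not currently running, the net value is 35.0.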
Determine net useful TCP/IP connections:
Net useful connections are determined as shown in
The identification of an incoming TCP/IP connection is achieved using port numbers and processes which are provided by instrumentation data provided by the operating system. Information on how to do this is available from Microsoft Corporation for operating systems supplied by them but the invention is not limited to Microsoft's operating systems. The list of excluded incoming TCP/IP connections is a list of port numbers and processes associated with those port numbers. The following may also be identified and used in the list: source IP addresses of incoming network connections, and other data for example data identifying any connection to a process X, any connection to a port Y or any connection from a source address Z.
In an alternative implementation, in a time slot t, the total number of all incoming TCP/IP connections is determined, the number of those connections on the excluded list is determined and the number of excluded connections is subtracted from the total number of all incoming TCP/IP connections.
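The alternative implementation, in which the number of excluded connections is subtracted from the total, might be sketched as follows. The classes `Connection` and `Exclusion` are hypothetical; an exclusion entry is assumed to match on whichever of port, process and source address it specifies, so that an entry can express "any connection to a process X", "any connection to a port Y" or "any connection from a source address Z".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Connection:
    port: int          # local port of the incoming connection
    process: str       # process associated with that port
    source_ip: str     # source address of the connection


@dataclass(frozen=True)
class Exclusion:
    """One entry of the excluded list; None means the field is not checked."""
    port: Optional[int] = None
    process: Optional[str] = None
    source_ip: Optional[str] = None

    def matches(self, conn):
        return (
            (self.port is None or self.port == conn.port)
            and (self.process is None or self.process == conn.process)
            and (self.source_ip is None or self.source_ip == conn.source_ip)
        )


def net_useful_connections(connections, exclusions):
    """Total incoming connections minus those matching any exclusion entry."""
    excluded = sum(1 for c in connections if any(e.matches(c) for e in exclusions))
    return len(connections) - excluded
```

Note that in this sketch an entry with no fields set would match every connection, so a practical excluded list would require at least one field per entry.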
Determine net useful I/O activity:
The measure of I/O activity is the average number of bytes being read and written over the measurement period.
In this example, I/O activity is a single value which is the sum of network I/O, disc I/O and device I/O.
Net useful I/O activity is determined as shown in
Steps S98 and S99 may be implemented as shown in
I/O activity associated with the storage of the computer may be monitored separately from network I/O. Also device I/O may be monitored separately. If so, net useful values are determined separately for each type of I/O activity.
Lower Power state
As described above, the computer is in a full power state until no useful activity occurs for a period of time P, when it adopts a lower power state. In an example of a lower power state the computer is set to operate in its lowest power state whilst still fully operational. For example the CPU is controlled to operate in its minimum power state with the clock at its lowest frequency setting, and network cards and other cards of the computer are set to their lowest power state and lowest frequency of operation.
Variants
The above embodiments are to be understood as illustrative examples of the invention. Further embodiments of the invention are envisaged.
The example of power control described above monitors incoming TCP/IP connections. That aspect of the invention is not limited to TCP/IP but may be applied to other connection oriented communications protocols. The invention is not limited to monitoring incoming connections: it may monitor outgoing connections in addition to or instead of monitoring incoming connections.
The example of power control described above deems any single log-on to be useful activity. That aspect of the invention is not limited to a single logon: it may require a minimum number of logons greater than one to signify useful work. An embodiment of the invention may use a data set of one or more excluded logons. For example a logon which is not associated with an external service may be deemed to be a non-productive activity. For example, a logon to an account that is used only for maintenance tasks may be considered to be a non-productive activity.
The computers 2n of the network of
Examples of power control have been described which involve monitoring a plurality of activities, for example CPU activity, I/O activity, network connections and logons. However, power control may be implemented by monitoring two activities for example CPU activity and I/O activity; or three activities. More than four activities may be monitored. For example a single measure of I/O activity may be replaced by separate measures of network I/O, disc I/O and device I/O.
Whilst the invention has been described by way of example as using programs running on each of the computers 2n to monitor and control the computers, the computers may be monitored and controlled remotely by for example the monitoring system 68 of
The embodiments of power control described above sample the total values of one or more activity metrics in each of a succession of time slots. However, an alternative embodiment uses an event monitor instead of time slots and senses the occurrence of an event to initiate sampling of total values and determine the net values.
Whilst the invention has been described by way of example as changing the aggregate certainty value if a test result on an iteration differs from the preceding result, other ways of calculating certainty values may be used.
Computer Programs and program carriers.
The invention may be implemented by a program or a set of programs, which when run on a computer or set of computers causes the computer(s) to implement the methods described herein above. In one implementation of the invention:-
The programs may be carried by one or more non-transitory computer-readable storage media having computer-readable instructions stored thereon, or by other carriers. A carrier may be a signal, a communications channel, or a computer readable medium. A computer readable medium may be an article, for example: a tape; a disc, for example a CD or DVD; a hard disc; an electronic memory; or any other suitable non-transitory carrier or data storage medium. The electronic memory may be a ROM, a RAM, Flash memory or any other suitable electronic memory device whether volatile or non-volatile.
It is to be understood that any feature described in relation to any one embodiment may be used alone, or in combination with other features described, and may also be used in combination with one or more features of any other of the embodiments, or any combination of any other of the embodiments. Furthermore, equivalents and modifications not described above may also be employed without departing from the scope of the invention, which is defined in the accompanying claims. It will be appreciated from the foregoing description that the claims may be combined in combinations other than those specifically recited in the claims.
It is contemplated that some of the steps discussed herein as software methods may be implemented within hardware, for example, as circuitry that cooperates with the processor to perform various method steps. Portions of the functions/elements described herein may be implemented as a computer program product wherein computer instructions, when processed by a computer, adapt the operation of the computer such that the methods and/or techniques described herein are invoked or otherwise provided. Instructions for invoking the inventive methods may be stored in fixed or removable media, transmitted via a data stream in a broadcast or other signal bearing medium, and/or stored within a memory or other non-transitory means within a computing device operating according to the instructions.
Although various embodiments which incorporate the teachings of the present invention have been shown and described in detail herein, those skilled in the art can readily devise many other varied embodiments that still incorporate these teachings.
Number | Date | Country | Kind |
---|---|---|---|
1020662.1 | Dec 2010 | GB | national |