Claims
- 1. A method for reducing the amount of data of system metrics collected or reported from agent nodes to a system performance monitor for system performance monitoring and analysis, the method comprising the steps of:
sampling a first system metric and obtaining a sampled value of the first metric; reporting the sampled value of the first metric if the sampled value is not between a first parameter and a second parameter; not reporting the sampled value if the sampled value is between the first and second parameters; and wherein the first parameter and the second parameter are any real numbers.
- 2. The method in claim 1,
wherein the first parameter and the second parameter are derived from sampled values of the first system metric.
- 3. The method in claim 1,
wherein the first parameter and the second parameter are derived from at least one statistical parameter of the sampled values of the first system metric.
- 4. The method in claim 3,
wherein the at least one statistical parameter of the sampled values of the first system metric includes the first moment of the sampled values.
- 5. The method in claim 4,
wherein the at least one statistical parameter of the sampled values of the first system metric further includes the second moment of the sampled values.
- 6. The method in claim 1, further comprising,
assuming the sampled value of the first metric that is not reported with an average, wherein the average is an average of previously sampled data of the first system metric.
- 7. The method in claim 1,
wherein the average is a running average.
- 8. The method in claim 1, further comprising,
assuming the sampled value of the first metric that is not reported with an average, wherein the first parameter is zero and the second parameter is a positive number.
- 9. The method in claim 1, further comprising,
calculating a weighted running average, wherein $\bar{d}_n(w) = d_n w + \bar{d}_{n-1}(1-w)$, $\bar{d}_n$ and $\bar{d}_{n-1}$ are the weighted running averages after the n'th and (n−1)'th sampling, respectively, w is the weighting factor for the sampling, $S_n = S_{n-1} + (n-1)(d_n - \bar{d}_{n-1})^2/n$, $\sigma_n^2 = S_n/n$, wherein $S_n$ and $S_{n-1}$ are the sums of the differences squared and $\sigma_n$ is the standard deviation; calculating the first parameter to be $(\bar{d}_n - a\sigma_n)$; and calculating the second parameter to be $(\bar{d}_n + b\sigma_n)$, wherein a and b are two constant real numbers.
- 10. The method in claim 9,
wherein a and b are any real numbers between 0.5 and 3.1.
- 11. The method in claim 10,
wherein a and b are 1.
- 12. The method in claim 9, wherein continuing sampling is repeated for N times, wherein N is an integer.
- 13. The method in claim 12, wherein the w is between 1/N and 2/N.
- 14. The method in claim 12,
wherein N is determined by the confidence interval cl and the tolerable variance error ev, wherein $e_v = 100\, f(cl)\sqrt{2/N}$, wherein f(cl) is the (1+cl/100)/2-quantile of the unit normal distribution.
- 15. The method in claim 9, further comprising:
reporting the weighted running average $\bar{d}_{iN}$, where iN is a multiple of N and i is an integer; reporting $\bar{d}_n$ and replacing $\bar{d}_{iN}$ with $\bar{d}_n$ when $|\bar{d}_n - \bar{d}_{iN}|$ is greater than dd, wherein dd is a real number.
- 16. The method in claim 15, wherein dd is $\sigma_n$.
- 17. The method in claim 9,
wherein the w=c/n, wherein c is a real number, n is the n'th sampling.
- 18. The method in claim 17,
wherein c is between 0.5 and 2.
- 19. The method in claim 1, further comprising:
sampling a second system metric and obtaining a sampled value of the second system metric; calculating the correlation coefficient cc between the sampled value of the first system metric and the second system metric after M sampling; stopping sampling and stopping reporting the sampled value of the second system metric if |cc| is not less than a threshold; and continuing sampling and reporting the sampled value of the second system metric if |cc| is less than a threshold, wherein |cc| is the absolute value of correlation coefficient cc.
- 20. The method in claim 1, further comprising:
at the system performance monitor, receiving the reported sampled value of the first metric; at the system performance monitor, assuming the sampled value of the first metric as an average for the sampled value not reported.
- 21. The method in claim 20, further comprising:
displaying the received and assumed values of the first metric.
- 22. A method for reducing the amount of data of system metrics collected or reported from agent nodes to a system performance monitor for system performance monitoring and analysis, the method comprising the steps of:
sampling a first and a second system metrics and obtaining sampled values of the first and second system metrics; calculating the correlation coefficient cc between the sampled value of the first system metric and the second system metric after M sampling, wherein M is an integer; stopping sampling and stopping reporting the sampled value of the second system metric if |cc| is not less than a threshold; and continuing sampling and reporting the sampled value of the second system metric if |cc| is less than a threshold, wherein |cc| is the absolute value of correlation coefficient cc.
- 23. The method in claim 22, wherein the threshold is 0.7.
- 24. The method in claim 22, wherein the threshold is 0.9.
- 25. The method in claim 22, further comprising:
after stopping sampling and stopping reporting the sampled value of the second system metric if |cc| is not less than a threshold, estimating the value of the second system metric using the reported value of the first system metric when the first system metric is reported.
- 26. A computer system module for system performance monitoring, reporting and analysis, the module comprising:
a controller module operative to control the system performance monitoring; a sampling module coupled to the controller module, operative to sample at least a first system metric and obtaining a sampled value of the first metric; a reporting module coupled to the sampling module, operative to report each sampled value of the first metric if the sampled value is not between a first parameter and a second parameter, and
not to report the sampled value, if the sampled value is between the first and second parameters; wherein the first parameter and the second parameter are any real numbers.
- 27. The computer system module as in claim 26,
wherein the first parameter and the second parameter are derived from sampled values of the first system metric.
- 28. The computer system module as in claim 26,
wherein the first parameter and the second parameter are derived from at least one statistical parameter of the sampled values of the first system metric.
- 29. The computer system module as in claim 28,
wherein the at least one statistical parameter of the sampled values of the first system metric includes the first moment of the sampled values.
- 30. The computer system module as in claim 29,
wherein the at least one statistical parameter of the sampled values of the first system metric further includes the second moment of the sampled values.
- 31. The computer system module as in claim 26,
wherein the controller module is operative to calculate an average, wherein the average is an average of previously sampled data of the first system metric.
- 32. The computer system module as in claim 26,
wherein the controller module is operative to calculate an average, wherein the average is a running average.
- 33. The computer system module as in claim 26,
wherein the first parameter is zero and the second parameter is a positive number.
- 34. The computer system module as in claim 26,
wherein the controller module is operative to calculate a weighted running average,
wherein $\bar{d}_n(w) = d_n w + \bar{d}_{n-1}(1-w)$, $\bar{d}_n$ and $\bar{d}_{n-1}$ are the weighted running averages after the n'th and (n−1)'th sampling, respectively, w is the weighting factor for the sampling, $S_n = S_{n-1} + (n-1)(d_n - \bar{d}_{n-1})^2/n$, $\sigma_n^2 = S_n/n$, wherein $S_n$ and $S_{n-1}$ are the sums of the differences squared and $\sigma_n$ is the standard deviation, and to calculate the first parameter to be $(\bar{d}_n - a\sigma_n)$ and the second parameter to be $(\bar{d}_n + b\sigma_n)$, wherein a and b are two constant real numbers.
- 35. The computer system module in claim 26, wherein the controller module is operative to stop sampling after N times, wherein N is an integer.
- 36. The computer system module in claim 35,
wherein N is determined by a confidence interval cl and a tolerable variance error ev, wherein $e_v = 100\, f(cl)\sqrt{2/N}$, wherein f(cl) is the (1+cl/100)/2-quantile of the unit normal distribution.
- 37. The computer system module in claim 35,
wherein the controller module is operative to report the weighted running average $\bar{d}_{iN}$, where iN is a multiple of N and i is an integer; and
to report $\bar{d}_n$ when $|\bar{d}_n - \bar{d}_{iN}|$ is greater than dd, wherein dd is a real number.
- 38. The computer system module in claim 37, wherein dd is $\sigma_n$.
- 39. The computer system module in claim 34,
wherein the w=c/n, wherein c is a real number, n is the n'th sampling.
- 40. The computer system module in claim 26,
wherein the controller module is operative to sample a second system metric and to obtain a sampled value of the second system metric; to calculate the correlation coefficient cc between the sampled value of the first system metric and the second system metric after M sampling; to stop sampling and stop reporting the sampled value of the second system metric if |cc| is not less than a threshold; and to continue sampling and reporting the sampled value of the second system metric if |cc| is less than a threshold, wherein |cc| is the absolute value of correlation coefficient cc.
- 41. The computer system module in claim 40,
wherein the threshold is 0.7.
- 42. The computer system module in claim 26, further comprising,
a monitoring module operative to receive the reported sampled value of the first metric and to assume the sampled value of the first metric as an average for the sampled value not reported.
- 43. The computer system module in claim 41, further comprising:
a display module operative to display the received and assumed values of the first metric.
- 44. A computer system module for system performance monitoring, reporting and analysis, comprising:
a controller module operative to control the system performance monitoring; a sampling module coupled to the controller module, operative to sample at least a first and a second system metrics and obtaining sampled values of the first and second metrics; a reporting module coupled to the sampling module; wherein the controller module is operative to calculate the correlation coefficient cc between the sampled value of the first system metric and the second system metric after M sampling, wherein M is an integer;
to stop sampling and to stop reporting the sampled value of the second system metric if |cc| is not less than a threshold; and to continue sampling and reporting the sampled value of the second system metric if |cc| is less than a threshold, wherein |cc| is the absolute value of correlation coefficient cc.
- 45. The computer system module in claim 44, wherein the threshold is 0.7.
- 46. A computer network system comprising:
a plurality of network nodes having
a CPU; a memory module coupled to the CPU, operative to contain computer executable programs; a network interface operative to interconnect different nodes of the network, wherein one computer executable program is loaded in the memory module in one node, wherein the computer executable program is operative to perform the method in any one of claims 1-25.
- 47. A machine readable medium comprising a machine executable program, wherein the machine executable program is operative to perform the method in any one of claims 1-25.
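
By way of illustration only, and without limiting the claims, the band-based reporting of claims 1 and 9 (with w = c/n as in claim 17) and the correlation test of claim 22 might be sketched as follows. The names `BandReporter`, `pearson`, and `keep_second_metric` are hypothetical and do not appear in the application. The sketch also makes three assumptions that the claims do not state: the first sample is always reported, "between" is treated as inclusive, and w is clamped to at most 1.

```python
import math


class BandReporter:
    """Hypothetical agent-side filter: a sample is reported only when it
    falls outside the band [mean - a*sigma, mean + b*sigma]."""

    def __init__(self, a=1.0, b=1.0, c=1.0):
        self.a, self.b, self.c = a, b, c
        self.n = 0        # number of samples seen so far
        self.mean = 0.0   # weighted running average (d-bar_n)
        self.s = 0.0      # running sum of squared differences (S_n)

    def sigma(self):
        """Standard deviation from sigma_n^2 = S_n / n."""
        return math.sqrt(self.s / self.n) if self.n else 0.0

    def update(self, d):
        """Fold sample d into the statistics; return True if d is to be reported."""
        self.n += 1
        n = self.n
        if n == 1:
            self.mean = d
            return True   # assumption: the first sample is always reported
        w = min(1.0, self.c / n)   # w = c/n, clamped (assumption)
        prev = self.mean
        # S_n = S_{n-1} + (n-1) * (d_n - d-bar_{n-1})^2 / n
        self.s += (n - 1) * (d - prev) ** 2 / n
        # Same value as d*w + prev*(1-w), written in incremental form
        self.mean = prev + w * (d - prev)
        sig = self.sigma()
        low, high = self.mean - self.a * sig, self.mean + self.b * sig
        return not (low <= d <= high)


def pearson(xs, ys):
    """Sample correlation coefficient cc of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0   # assumption: a constant series is treated as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def keep_second_metric(first, second, threshold=0.7):
    """True if the second metric should still be sampled and reported,
    i.e. if |cc| is less than the threshold after M samples."""
    return abs(pearson(first, second)) < threshold
```

In this sketch an agent would call `update` once per sampling interval and transmit the sample only when it returns `True`; after M samples of two metrics it would call `keep_second_metric` once to decide whether the second metric remains worth collecting.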
RELATED APPLICATIONS
[0001] This application claims priority to a provisional patent application by the same inventors, entitled: “Statistical Performance Monitoring,” Serial No. 60/419,175, filed on Oct. 17, 2002.
[0002] This application is related to an application by the same inventors, entitled: “Enterprise Management System and Method which Includes Statistical Recreation of System Resource Usage for More Accurate Monitoring, Prediction and Performance Workload Characterization,” Ser. No. 09/287,601, filed on Apr. 7, 1999, Attorney docket No. 149-0025US.
[0003] Both of the above applications are incorporated herein by reference.
Provisional Applications (1)
| Number | Date | Country |
| --- | --- | --- |
| 60419175 | Oct 2002 | US |